11g新特性:A useful View V$DIAG_INFO

在11g中引入了自动诊断资料档案库(ADR)特性,默认情况下各种trace,dump存放的目录位置区别于9i/10g显得更加难以查找了。

ADR 基目录中可以包含多个 ADR 主目录,其中每个 ADR 主目录都是一个根目录,用于存放特定 Oracle 产品或组件的特定实例的全部诊断数据。前一张幻灯片的图形中显示了数据库的 ADR 主目录位置。

另外,还生成了两个预警文件。一个是文本形式的预警文件(与早期版本 Oracle DB 使用的预警文件非常相似),位于各个 ADR 主目录的 TRACE 目录下。还有一个符合 XML 标准的预警消息文件,存储在 ADR 主目录内的 ALERT 子目录下。可使用 Enterprise Manager 和 ADRCI 实用程序查看文本格式的预警日志(已删除了 XML 标记)。

此幻灯片中的图形显示了 ADR 主目录的目录结构。INCIDENT 目录包含多个子目录, 每个子目录均以特定意外事件命名,并且仅包含与该意外事件相关的转储。

HM 目录包含由健康状况监视器生成的检查器运行报告。

还有一个 METADATA 目录,其中包含资料档案库自身的重要文件。可以将此目录比作数据库字典。可使用 ADRCI 查询此字典。

ADR 命令解释器 (ADRCI) 是一个实用程序,可用于执行支持工作台允许的所有任务(但是仅限于在命令行环境中)。使用 ADRCI 实用程序,您还可以查看 ADR 中跟踪文件的名称以及删除了 XML 标记、具有和不具有内容筛选功能的预警日志。

此外,还可以使用 V$DIAG_INFO 列出一些重要的 ADR 位置。

自动诊断资料档案库 (ADR)的主要目录结构:

1.Server Directory Structure

adr1

2.Client Directory Structure

adr2

一个需要注意的细节是启用了ADR自动诊断资料档案库后,LISTENER监听器日志的默认位置也被移动到diagnostic_dest下了,而不在如9i/10g那样存放在$ORACLE_HOME/network/log目录下,有不少人因为忘记了这个细节而花费了大量时间去寻找listener.log,这个人也包括我。实际上我们还是可以将listener.log日志的位置还原回10g的形式,这一点可以通过修改$ORACLE_HOME/network/admin/listener.ora来达成:

[oracle@rh2 ~]$ cd $ORACLE_HOME/network/admin

[oracle@rh2 admin]$ cat listener.ora
# listener.ora Network Configuration File: /s01/oracle/product/11.2.0/dbhome_1/network/admin/listener.ora
# Generated by Oracle configuration tools.

LISTENER =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rh2.oracle.com)(PORT = 1521))
  )

ADR_BASE_LISTENER = /s01/orabase

将以上listener.ora文件的ADR_BASE_LISTENER条目删除,并加上

DIAG_ADR_ENABLED_LISTENER = OFF

[oracle@rh2 log]$ lsnrctl reload

[oracle@rh2 admin]$ tnsping PROD
TNS Ping Utility for Linux: Version 11.2.0.2.0 - Production on 30-MAY-2011 21:39:21
Copyright (c) 1997, 2010, Oracle.  All rights reserved.
Used parameter files:
/s01/oracle/product/11.2.0/dbhome_1/network/admin/sqlnet.ora
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = rh-cluster-scan)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = PROD)))
OK (0 msec)

[oracle@rh2 log]$ cat $ORACLE_HOME/network/log/listener.log

TNSLSNR for Linux: Version 11.2.0.2.0 - Production on 30-MAY-2011 21:42:52

Copyright (c) 1991, 2010, Oracle.  All rights reserved.

System parameter file is /s01/oracle/product/11.2.0/dbhome_1/network/admin/listener.ora
Log messages written to /s01/oracle/product/11.2.0/dbhome_1/network/log/listener.log
Trace information written to /s01/oracle/product/11.2.0/dbhome_1/network/trace/listener.trc
Trace level is currently 6
Started with pid=24331
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=rh2)(PORT=1521)))
Listener completed notification to CRS on start

TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
30-MAY-2011 21:42:53 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=rh2.oracle.com)(USER=oracle))(COMMAND=status)(ARGUMENTS=64)(SERVICE=LISTENER)(VERSION=186647040)) * status * 0
Error listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.122)(PORT=1521)))
30-MAY-2011 21:42:59 * service_register * PROD1 * 12542
TNS-12542: TNS:address already in use
 TNS-12560: TNS:protocol adapter error
  TNS-00512: Address already in use
   Linux Error: 98: Address already in use
Error listening on: (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.133)(PORT=1521)))
30-MAY-2011 21:42:59 * service_register * PROD1 * 12542
TNS-12542: TNS:address already in use
 TNS-12560: TNS:protocol adapter error
  TNS-00512: Address already in use
   Linux Error: 98: Address already in use
30-MAY-2011 21:43:02 * service_update * PROD1 * 0
30-MAY-2011 21:43:02 * service_update * PROD1 * 0
30-MAY-2011 21:43:37 * service_register * G10R2 * 0
Error listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.122)(PORT=1521)))
30-MAY-2011 21:43:39 * service_register * +ASM1 * 12542
TNS-12542: TNS:address already in use
 TNS-12560: TNS:protocol adapter error
  TNS-00512: Address already in use
   Linux Error: 98: Address already in use

从以上ADR目录结构图中可以看到整个架构变得更复杂了,实际使用中的ADR档案资料库的复杂度要更高:

[grid@rh2 raw]$ tree -d  /s01/orabase/diag
/s01/orabase/diag
|-- asm
|   `-- +asm
|       `-- +ASM1
|           |-- alert
|           |-- cdump
|           |-- hm
|           |-- incident
|           |   |-- incdir_4897
|           |   |-- incdir_4898
|           |   |-- incdir_4899
|           |   |-- incdir_4900
|           |   |-- incdir_4901
|           |   |-- incdir_4902
|           |   `-- incdir_4903
|           |-- incpkg
|           |-- ir
|           |-- lck
|           |-- metadata
|           |-- metadata_dgif
|           |-- metadata_pv
|           |-- stage
|           |-- sweep
|           `-- trace
|               |-- cdmp_20110424212230
|               |-- cdmp_20110424214914
|               |-- cdmp_20110424220022
|               |-- cdmp_20110424221131
|               |-- cdmp_20110424222254
|               |-- cdmp_20110424223546
|               `-- cdmp_20110427154653
|-- clients
|   `-- user_oracle
|       `-- host_3070836769_80
|           |-- alert
|           |-- cdump
|           |-- incident
|           |-- incpkg
|           |-- lck
|           |-- metadata
|           |-- metadata_dgif
|           |-- metadata_pv
|           |-- stage
|           |-- sweep
|           `-- trace
|-- crs
|-- diagtool
|-- lsnrctl
|-- netcman
|-- ofm
|-- rdbms
|   |-- maclean
|   |   `-- MACLEAN
|   |       |-- alert
|   |       |-- cdump
|   |       |-- hm
|   |       |-- incident
|   |       |-- incpkg
|   |       |-- ir
|   |       |-- lck
|   |       |-- metadata
|   |       |-- metadata_dgif
|   |       |-- metadata_pv
|   |       |-- stage
|   |       |-- sweep
|   |       `-- trace
|   `-- prod
|       `-- PROD1
|           |-- alert
|           |-- cdump
|           |   |-- core_27975
|           |   |-- core_28006
|           |   |-- core_28013
|           |   `-- core_28034
|           |-- hm
|           |-- incident
|           |   |-- incdir_100831
|           |   |-- incdir_104831
|           |   |-- incdir_108831
|           |   |-- incdir_18201
|           |   |-- incdir_18202
|           |   |-- incdir_18266
|           |   |-- incdir_18361
|           |   |-- incdir_18362
|           |   |-- incdir_18369
|           |   |-- incdir_18385
|           |   |-- incdir_18387
|           |   |-- incdir_18393
|           |   |-- incdir_18402
|           |   |-- incdir_18410
|           |   |-- incdir_18411
|           |   |-- incdir_18441
|           |   |-- incdir_18457
|           |   |-- incdir_18458
|           |   |-- incdir_18465
|           |   |-- incdir_19737
|           |   |-- incdir_19738
|           |   |-- incdir_19739
|           |   |-- incdir_19745
|           |   |-- incdir_19746
|           |   |-- incdir_20589
|           |   |-- incdir_20701
|           |   |-- incdir_20736
|           |   |-- incdir_20737
|           |   |-- incdir_20738
|           |   |-- incdir_20739
|           |   |-- incdir_20740
|           |   |-- incdir_20957
|           |   |-- incdir_20973
|           |   |-- incdir_20989
|           |   |-- incdir_21005
|           |   |-- incdir_21778
|           |   |-- incdir_21936
|           |   |-- incdir_21937
|           |   |-- incdir_21938
|           |   |-- incdir_22200
|           |   |-- incdir_22201
|           |   |-- incdir_22216
|           |   |-- incdir_22232
|           |   |-- incdir_22233
|           |   |-- incdir_23306
|           |   |-- incdir_23506
|           |   |-- incdir_28123
|           |   |-- incdir_28147
|           |   |-- incdir_28148
|           |   |-- incdir_28227
|           |   |-- incdir_28228
|           |   |-- incdir_28229
|           |   |-- incdir_28230
|           |   |-- incdir_28235
|           |   |-- incdir_29323
|           |   |-- incdir_29324
|           |   |-- incdir_29325
|           |   |-- incdir_29326
|           |   |-- incdir_29327
|           |   |-- incdir_30681
|           |   |-- incdir_30682
|           |   |-- incdir_30683
|           |   |-- incdir_31724
|           |   |-- incdir_35253
|           |   |-- incdir_36453
|           |   |-- incdir_37653
|           |   |-- incdir_38853
|           |   |-- incdir_40053
|           |   |-- incdir_41253
|           |   |-- incdir_42453
|           |   |-- incdir_43653
|           |   |-- incdir_44853
|           |   |-- incdir_46053
|           |   |-- incdir_47253
|           |   |-- incdir_48453
|           |   |-- incdir_49653
|           |   |-- incdir_54525
|           |   |-- incdir_56918
|           |   |-- incdir_89183
|           |   |-- incdir_89184
|           |   |-- incdir_89185
|           |   |-- incdir_89186
|           |   |-- incdir_89187
|           |   |-- incdir_89327
|           |   |-- incdir_89343
|           |   |-- incdir_89351
|           |   |-- incdir_89359
|           |   |-- incdir_90270
|           |   |-- incdir_90271
|           |   |-- incdir_90272
|           |   |-- incdir_90273
|           |   |-- incdir_90274
|           |   |-- incdir_90275
|           |   `-- incdir_96831
|           |-- incpkg
|           |   `-- pkg_1
|           |       `-- seq_1
|           |           |-- crs
|           |           `-- export
|           |-- ir
|           |-- lck
|           |-- metadata
|           |-- metadata_dgif
|           |-- metadata_pv
|           |-- stage
|           |-- sweep
|           `-- trace
|               |-- cdmp_20110502214850
|               |-- cdmp_20110502214947
|               |-- cdmp_20110502221010
|               |-- cdmp_20110502221029
|               |-- cdmp_20110502221204
|               |-- cdmp_20110502221221
|               |-- cdmp_20110502221257
|               |-- cdmp_20110502221318
|               |-- cdmp_20110502221450
|               |-- cdmp_20110502221505
|               |-- cdmp_20110502222225
|               |-- cdmp_20110502222315
|               |-- cdmp_20110502222402
|               |-- cdmp_20110502224708
|               |-- cdmp_20110502230815
|               |-- cdmp_20110503202436
|               |-- cdmp_20110503202521
|               |-- cdmp_20110509231250
|               |-- cdmp_20110512171047
|               |-- cdmp_20110512171231
|               |-- cdmp_20110512171437
|               |-- cdmp_20110512171635
|               |-- cdmp_20110520165537
|               |-- cdmp_20110520232601
|               |-- cdmp_20110520234343
|               |-- cdmp_20110521000119
|               |-- cdmp_20110521000218
|               |-- cdmp_20110521065437
|               |-- cdmp_20110521065455
|               |-- cdmp_20110521065512
|               |-- cdmp_20110521074224
|               |-- cdmp_20110521074737
|               |-- cdmp_20110521075206
|               |-- cdmp_20110521080019
|               |-- cdmp_20110521080642
|               |-- cdmp_20110521140052
|               |-- cdmp_20110530202441
|               |-- cdmp_20110530202442
|               |-- cdmp_20110530202846
|               |-- cdmp_20110530202848
|               |-- cdmp_20110530202916
|               |-- cdmp_20110530202918
|               |-- cdmp_20110530203156
|               `-- cdmp_20110530203158
`-- tnslsnr
    `-- rh2
        `-- listener
            |-- alert
            |-- cdump
            |-- incident
            |-- incpkg
            |-- lck
            |-- metadata
            |-- metadata_dgif
            |-- metadata_pv
            |-- stage
            |-- sweep
            `-- trace

240 directories

所幸的是在11g中提供了比传统的gettrcname.sql脚本更为给力的诊断文件位置信息汇总的视图V$DIAG_INFO:

V$DIAG_INFO 视图列出了所有重要的 ADR 位置:
ADR Base:ADR 基目录的路径
ADR Home:当前数据库实例的 ADR 主目录的路径
Diag Trace:文本预警日志和后台/前台进程跟踪文件的位置
Diag Alert:XML 版本的预警日志的位置
Default Trace File:会话的跟踪文件的路径。SQL 跟踪文件将写入到这里。
Health Monitor: 健康检查报告所在目录
Active Problem Count:当前激活的问题总数
Active Incident Count  当前激活的事故总数

SQL>  select name,value from v$diag_info;

NAME                                    VALUE
--------------------------------------- ------------------------------------------------------------
Diag Enabled                            TRUE
ADR Base                                /s01/orabase
ADR Home                                /s01/orabase/diag/rdbms/prod/PROD1
Diag Trace                              /s01/orabase/diag/rdbms/prod/PROD1/trace
Diag Alert                              /s01/orabase/diag/rdbms/prod/PROD1/alert
Diag Incident                           /s01/orabase/diag/rdbms/prod/PROD1/incident
Diag Cdump                              /s01/orabase/diag/rdbms/prod/PROD1/cdump
Health Monitor                          /s01/orabase/diag/rdbms/prod/PROD1/hm
Default Trace File                      /s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_ora_22893.trc
Active Problem Count                    7
Active Incident Count                   373

11 rows selected.

adrci> show hm_run
**********************************************************
HM RUN RECORD 119
**********************************************************
   RUN_ID                        2481
   RUN_NAME                      HM_RUN_2481
   CHECK_NAME                    DB Structure Integrity Check
   NAME_ID                       2
   MODE                          2
   START_TIME                    2011-05-21 20:11:38.612669 +08:00
   RESUME_TIME
   END_TIME                      2011-05-21 20:11:38.619530 +08:00
   MODIFIED_TIME                 2011-05-21 20:11:38.619530 +08:00
   TIMEOUT                       0
   FLAGS                         0
   STATUS                        5
   SRC_INCIDENT_ID               0
   NUM_INCIDENTS                 0
   ERR_NUMBER                    0
   REPORT_FILE                   

adrci> create report hm_run HM_RUN_2481

adrci> show hm_run  -p "RUN_ID=2481"

ADR Home = /s01/orabase/diag/rdbms/prod/PROD1:
*************************************************************************

**********************************************************
HM RUN RECORD 1
**********************************************************
   RUN_ID                        2481
   RUN_NAME                      HM_RUN_2481
   CHECK_NAME                    DB Structure Integrity Check
   NAME_ID                       2
   MODE                          2
   START_TIME                    2011-05-21 20:11:38.612669 +08:00
   RESUME_TIME
   END_TIME                      2011-05-21 20:11:38.619530 +08:00
   MODIFIED_TIME                 2011-05-30 21:09:43.071150 +08:00
   TIMEOUT                       0
   FLAGS                         0
   STATUS                        5
   SRC_INCIDENT_ID               0
   NUM_INCIDENTS                 0
   ERR_NUMBER                    0
   REPORT_FILE                   /s01/orabase/diag/rdbms/prod/PROD1/hm/HMREPORT_HM_RUN_2481.hm

[oracle@rh2 ~]$ cd /s01/orabase/diag/rdbms/prod/PROD1/hm

[oracle@rh2 hm]$ cat HMREPORT_HM_RUN_2481.hm

<?xml version="1.0" encoding="US-ASCII"?>
<HM-REPORT REPORT_ID="HM_RUN_2481">
 <TITLE>HM Report: HM_RUN_2481</TITLE>
 <RUN_INFO>
 <CHECK_NAME>DB Structure Integrity Check</CHECK_NAME>
 <RUN_ID>2481</RUN_ID>
 <RUN_NAME>HM_RUN_2481</RUN_NAME>
 <RUN_MODE>REACTIVE</RUN_MODE>
 <RUN_STATUS>COMPLETED</RUN_STATUS>
 <RUN_ERROR_NUM>0</RUN_ERROR_NUM>
 <SOURCE_INCIDENT_ID>0</SOURCE_INCIDENT_ID>
 <NUM_INCIDENTS_CREATED>0</NUM_INCIDENTS_CREATED>
 <RUN_START_TIME>2011-05-21 20:11:38.612669 +08:00</RUN_START_TIME>
 <RUN_END_TIME>2011-05-21 20:11:38.619530 +08:00</RUN_END_TIME>
 </RUN_INFO>
 <RUN_PARAMETERS/>
 <RUN-FINDINGS/>
</HM-REPORT>

通过查询V$diag_info可以很容易找到自身服务进程的trace文件位置,对于其他进程的trace文件则可以查询v$process新加入的tracefile列:

SQL> select spid,tracefile from v$process;

SPID                     TRACEFILE
------------------------ --------------------------------------------------------------------------------
                         /s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_ora_0.trc
22789                    /s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_pmon_22789.trc
22791                    /s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_psp0_22791.trc

How does SGA/PGA allocate on AMM?

Oracle 11g中引入了革命性的Automatic Memory Management(AMM)特性,通过该特性DBA只需要为Instance指定一个参数(memory_target),数据库软件就会根据SGA/PGA内存的实际使用统计信息来调优SGA/PGA内存区域的大小。从技术上说这是一个很cool的特性,可以说是Oracle所提倡的self_tuned即自身调优数据库软件大成的一个标志。但从另一方面来看,AMM也会给我们带来不少问题和困惑,DBA需要面对更多黑盒内隐藏的秘密了。

虽然我们无法彻底了解AMM的所有细节,但有一些关键性问题肯定会在我们使用AMM的过程造成许多的不确定性,在此我列出部分问题及其解答。

Question1:在使用AMM特性时,即设置了memory_target的情况下sga/pga内存区域默认各占百分之多少?

Answer:
这个问题存在多种不同的情景:

1)设置memory_target的同时,设置了sga_target及pga_aggregate_target参数:这种情况下设置的sga_target与pga_aggregate_target之和不能大于memory_target;默认sga大小等同于sga_target设定的值,可以从V$memory_Dynamic_Components视图中查询到pga target的current size可能远大于pga_aggregate_target(这里查询出来的current size仅代表pga的一个目标值,不是pga当前实际占用的内存),一般来说这里的pga current size等于(memory_target-sga_target),显然pga target的最小值会是pga_aggregate_target,而sga的最小值为sga_target.在此前提下sga/pga内存区域有较小的灵活性,实际上仅当memory_target>sga_target+pga_aggregate_target的情况下,sga/pga才可能发生扩展(grow)和收缩(shrink)

2)设置memory_target的同时,设置了sga_target而未设置pga_aggregate_target参数:这种情况下Oracle会自动调优以上2个参数,pga_aggregate_target的初始值为(memory_target-sga_target),注意这里说的是初始值;这种情况下因为没有硬性地设置pga_aggregate_target,故而sga/pga内存区域有较大的灵活性去扩展(grow)和收缩(shrink)。更倾向于sga的扩展,当然这并不绝对。

3)设置memory_target的同时,设置了pga_aggregate_target而未设置sga_target:这种情况下Oracle会自动调优以上2个参数,sga_target的初始值为min(memory_target-pga_aggregate_target,sga_max_size),注意这里说的是初始值;类似于以上的情况因为没有硬性地设置sga_target,故而sga/pga内存区域有较大的灵活性去扩展(grow)和收缩(shrink)。更倾向于pga的扩展,当然这并不绝对。

4)仅仅设置了memory_target参数,Oracle会自动调优sga/pga内存区域,且没有任何最小值或默认值的约束。当在初始化分配操作系统内存的时候存在一个分配pga/sga的策略;该策略会在实例启动阶段(instance startup)为SGA分配memory_target 60%的内存(该比例受到隐藏参数_memory_initial_sga_split_perc Initial default sga target percentage with memory target的影响,该参数默认为60),而为PGA分配40%的内存。这种情况下sga/pga内存区域的游标最为灵活,可以不受限制地上下浮动,是真正意义上的AMM,但也最为危险!

5)没有设置memory_target参数,该情况下Oracle 11g的表现与10g中”一致”(并非完全一致,黑盒中藏了很多秘密,只是看上去这样,后面会提到)

Question2:”我在10g就感到ASMM很不好用,产生了很多问题;我是个保守派,对11g中的AMM一定也不感兴趣。我要彻底禁用AMM和ASMM,在11g中是否只要设置sga_target和memory_target参数为零就ok了?”

Answer:
在11.2.0.1之前的版本(包括11.1.0.7,11.1.0.6等)我们仅需要设置sga_target和memory_target参数为零就可以避免sga/pga内存区域的resize,但在Oracle 11g Release 2中引入了immediate memory allocation requests立即内存分配的特性,该特性会在automatic memory management被禁用的情况下始终生效,引入这一特性的直接目的是尽可能的避免ORA-04031错误的发生,一般来说该特性更多地表现出了其积极的一面,但如果你实际需要的是一块安宁的、如死水寂静的sga/pga内存区域的话,我们还是有办法禁用该immediate memory allocation requests特性的,这样可以彻底杜绝sga中内存组建resize的发生:

SQL> col DESCRIB for a60
SQL> set linesize 200 pagesize 2000
SQL> col name for a32
SQL> col value for a10

SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ
  2   FROM SYS.x$ksppi x, SYS.x$ksppcv y
  3   WHERE x.inst_id = USERENV ('Instance')
  4   AND y.inst_id = USERENV ('Instance')
  5   AND x.indx = y.indx
  6  AND x.ksppinm LIKE '%_memory_imm_mode_without_autosga%';

NAME                             VALUE      DESCRIB
-------------------------------- ---------- ------------------------------------------------------------
_memory_imm_mode_without_autosga TRUE       Allow immediate mode without sga/memory target

SQL> alter system set "_memory_imm_mode_without_autosga"=FALSE scope=both;
System altered.

/* 修改该参数需要重启实例 !*/

Question3:”我是一个激进派,我喜欢Oracle self-tuned自动调优的种种特性,为了充分利用AMM的优势,我决定仅设置memory_target参数,而不设置sga_target和pga_aggregate_target参数,让Oracle自行去调优这2个参数。但是我有担心如果pga扩展过头,是不是可能造成sga频繁shrink,造成系统颠簸甚至实例hang住?”

Answer:你的担心是完全有理由的,已知的文档604080.1指出了在11.1.0.6到11.1.0.7版本中存在AMM下未设置PGA_AGGREGATE_TARGET导致Oracle所使用的总内存超过MEMORY_MAX_TARGET的bug 6346293(When not setting PGA_AGGREGATE_TARGET to an explicit value, MEMORY_MAX_TARGET can be exceeded)。该Bug在11.2以上版本中得到了修复。

显然通过R1多个版本的历练后,Oracle充分认识到了AMM”自由调度”所可能带来的问题,我相信Oracle已经得到了某些内存阀值的黄金比例(Another magic number),当然这些幻数隐藏得很深(可能由某个_memory开头的隐藏参数控制着,但我没有兴致去研究,这需要花费太多的时间)。通过这些经典阀值Oracle可以有效控制pga/sga在不过度扩展或收缩的情况下仍保留其灵活性。

针对那些和我一样热爱新特性的朋友,最好的建议可能是在设置memory_target的同时为sga_target、pga_aggregate_target设置最小的期望值:

Memory_Max_Target>=Memory_Target>>(Sga_Target+Pga_Aggregate_Target)

这样可以避免由AMM过于活跃所造成的系统颠簸,而又不失去其灵活性。

to be continued………….

解决Linux上11g的ORA-00845错误

下午想把一个11.2.0.2实例的memory_target AMM内存初始化参数修改到2000M,改好后重启发现实例起不来了,出现了ORA-00845错误:

[oracle@rh2 ~]$ 
[oracle@rh2 ~]$ sqlplus  / as sysdba

SQL*Plus: Release 11.2.0.2.0 Production on Tue May 3 19:43:50 2011

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount;
ORA-00845: MEMORY_TARGET not supported on this system

======================= ALERT.LOG======================

Starting ORACLE instance (normal)
WARNING: You are trying to use the MEMORY_TARGET feature. 
This feature requires the /dev/shm file system to be mounted for at least 2097152000 bytes. 
/dev/shm is either not mounted or is mounted with available space less than this size. 
Please fix this so that MEMORY_TARGET can work as expected. 
Current available is 1871466496 and used is 200790016 bytes. 
Ensure that the mount point is /dev/shm for this directory.
memory_target needs larger /dev/shm

看起来是因为采用了Automatic Memory Management的特性,所以Oracle采用了一种基于POSIX的共享内存使用风格,在AMM中Oracle不像之前版本那样利用”一整块”内存作为SGA,而是使用一块块小的”chunk”,以满足SGA与PGA之前交换内存的目的,具体可以参考下图:
AMM1

针对上述ORA-00845错误,我们需要修改映射内存文件夹/dev/shm的装载参数,针对正在运行的Linux操作系统,可以remount这个tmpfs装载点,如:

[root@rh2 ~]# umount /dev/shm
umount: /dev/shm: device is busy
umount: /dev/shm: device is busy

[root@rh2 ~]# ls /dev/shm
JOXSHM_EXT_0_PROD1_8323079   JOXSHM_EXT_25_PROD1_8323079  ora_+ASM1_7569414_11  
JOXSHM_EXT_10_PROD1_8323079  JOXSHM_EXT_26_PROD1_8323079  ora_+ASM1_7569414_12  
............

/* 可以看到如果ASM实例也使用AMM的话同样会在/dev/shm目录下产生内存映像 
    为了umount该装载点,我们不得不首先shutdown ASM instance ! */

[root@rh2 ~]# su - grid
[grid@rh2 ~]$ sqlplus  / as sysdba

SQL*Plus: Release 11.2.0.2.0 Production on Tue May 3 20:05:24 2011

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> shutdown immediate;
ORA-01031: insufficient privileges
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
[grid@rh2 ~]$ sqlplus  / as sysasm

SQL*Plus: Release 11.2.0.2.0 Production on Tue May 3 20:05:32 2011

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> shutdown immediate;
ORA-15097: cannot SHUTDOWN ASM instance with connected client (process 1576)
SQL> shutdown abort;
ASM instance shutdown
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

[root@rh2 ~]# umount /dev/shm

[root@rh2 ~]# mount -t tmpfs shmfs -o size=4000m /dev/shm

/* 这里的size指定了shm目录的装载大小,4000m可能对你的系统仍然不够,
    那么你有理由设一个更大的值  */


[root@rh2 ~]# mount|grep shmfs
shmfs on /dev/shm type tmpfs (rw,size=4000m)

Startup ASM...........

[root@rh2 ~]# su - oracle
[oracle@rh2 ~]$ sqlplus  / as sysdba

SQL*Plus: Release 11.2.0.2.0 Production on Tue May 3 20:09:40 2011

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount;
ORACLE instance started.

Total System Global Area 2087780352 bytes
Fixed Size                  2228072 bytes
Variable Size            1476395160 bytes
Database Buffers          603979776 bytes
Redo Buffers                5177344 bytes

/* 实例启动成功! */

上述umount/mount的方式只在操作系统的此次生命周期中生效,如果重启的话shmfs目录仍会以默认方式装载;如果需要永久生效的话,我们需要修改系统参数文件/etc/fstab中的shm条目:

/* 修改前的shm装载参数 */

[root@rh2 ~]# grep shm /etc/fstab 
tmpfs                   /dev/shm                tmpfs   defaults        0 0

/* 修改后的shm装载参数 */

[root@rh2 ~]# grep shm /etc/fstab 
tmpfs                   /dev/shm                tmpfs   rw,size=4000m   0 0

ORA-20001错误一例

一套11.1.0.7上的Oracle Application Object Library应用程序,在收集schema统计信息时出现了ORA-20001错误,具体错误日志如下:

SQL> exec fnd_stats.gather_schema_statistics('AP');

PL/SQL procedure successfully completed.

SQL> show error
No errors.
============================================
Concurrent request error Log
------------------------------------
**Starts**14-APR-2011 02:20:53
**Ends**14-APR-2011 04:40:43
ORA-0000: normal, successful completion
+---------------------------------------------------------------------------+
Start of log messages from FND_FILE
+---------------------------------------------------------------------------+
In GATHER_SCHEMA_STATS , schema_name= ALL percent= 10 degree = 8 internal_flag= NOBACKUP
stats on table AQ$_WF_CONTROL_P is locked 
stats on table FND_CP_GSM_IPC_AQTBL is locked 
stats on table WF_NOTIFICATION_OUT is locked 
Error #1: ERROR: While GATHER_TABLE_STATS:
object_name=AP.JE_FR_DAS_010***ORA-20001: invalid column name or 
duplicate columns/column groups/expressions in method_opt***
Error #2: ERROR: While GATHER_TABLE_STATS:
object_name=AP.JE_FR_DAS_010_NEW***ORA-20001: invalid column name or 
duplicate columns/column groups/expressions in method_opt***
Error #3: ERROR: While GATHER_TABLE_STATS:
object_name=AP.JG_ZZ_SYS_FORMATS_ALL_B***ORA-20001: invalid column name or 
duplicate columns/column groups/expressions in method_opt***
+---------------------------------------------------------------------------+
End of log messages from FND_FILE
+---------------------------------------------------------------------------+

经确认该问题由bug 7601966: GATHER SCHEMA STATS ON AP SCHEMA FAILS WITH ORA-20001: INVALID COLUMN NAME 引起,可以通过follow文档<Gather Schema Statistics fails with Ora-20001 errors after 11G database upgrade [ID 781813.1]>解决该问题。

Examine 11g automatic block Corruption recovery

11g的高可用框架中DataGuard为我们带来了大量有用的特性,最引入注目的显然是Active Data Guard,这一特性几乎彻底改观了Oracle HA的原有格局。除了Active Data Guard外Automatic Block Media Repair即自动的块介质恢复也是11g中数据卫士一个不容忽视的特色。该特性通过后台进程ABMR自动将物理备库(physical standby)上的健康数据块传输到主库(primary database)上以替换在主库发现的已损坏的数据块。同样的若物理备库上发现数据块损坏那么也可以利用到以上特性来修复。注意使用该特性无需额外设置db_lost_write_protect参数为非默认的NONE值,ABMR的自动修复不依赖于该参数。

深入研究的话可以发现实际控制Automatic Block Media Repair特性的是一系列隐藏参数,它们包括:

_auto_bmr enabled enable/disable Auto BMR
_auto_bmr_req_timeout 60 Auto BMR Requester Timeout
_auto_bmr_sess_threshold 30 Auto BMR Request Session Threshold
_auto_bmr_pub_timeout 10 Auto BMR Publish Timeout
_auto_bmr_fc_time 60 Auto BMR Flood Control Time
_auto_bmr_bg_time 3600 Auto BMR Process Run Time
_auto_bmr_sys_threshold 100 Auto BMR Request System Threshold

显然这里面_auto_bmr隐藏参数是ABMR特性的开关,其默认值为enabled,而其他参数则定义了abmr的超时和限定阀值,这里不做展开。

这里我们来实地体验一下这种高可用的block repair特性:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE	11.2.0.2.0	Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

/* 演示所用数据库版本为较新的11.2.0.2 */

SQL> conn maclean/maclean
Connected.

SQL> create tablespace abmr datafile size 10M;
Tablespace created.

SQL> select file_name from dba_data_files where tablespace_name='ABMR';

FILE_NAME
--------------------------------------------------------------------------------
/standby/oradata/PROD/datafile/o1_mf_abmr_6rvqp087_.dbf

SQL>  create table need_repair tablespace abmr  as select 1 t1 from dual;
Table created.

SQL> exec dbms_stats.gather_table_stats('MACLEAN','NEED_REPAIR');
PL/SQL procedure successfully completed.

SQL> select * from need_repair;
	T1
----------
	 1

SQL>  select dbms_rowid.rowid_block_number(rowid) from need_repair;

DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
				 131

SQL> alter system flush buffer_cache;
System altered.

SQL> alter system flush buffer_cache;
System altered.

[maclean@rh6 ~]$ bbed FILENAME=/standby/oradata/PROD/datafile/o1_mf_abmr_6rvqp087_.dbf mode=edit
Password: 

BBED: Release 2.0.0.0.0 - Limited Production on Sat Mar 26 20:42:25 2011

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

************* !!! For Oracle Internal Use only !!! ***************

BBED> set block 131
	BLOCK#         	131

BBED> corrupt block 131
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
Block marked media corrupt.

/* 以上我们使用bbed工具将need_repair表上唯一一行数据所在的数据块损坏 
   这样如果没有11g中automatic block Corruption recovery特性的话,
   应当报ORA-01578:ORACLE data block corrupted错误
*/

SQL> conn maclean/maclean
Connected.
SQL> select * from need_repair;

	T1
----------
	 1

/* 以上查询并未出错,显然已经通过后台调用ABMR进程修复了该数据块 */

/* 告警日志中记录了ABMR的修复过程 */

Hex dump of (file 5, block 131) in trace file /s01/diag/rdbms/prod/PROD/trace/PROD_ora_24289.trc
Corrupt block relative dba: 0x01400083 (file 5, block 131)
Bad header found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x00000083
 last change scn: 0x0000.00000000 seq: 0xff flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x000006ff
 check value in block header: 0x154d
 computed block checksum: 0x0
Reading datafile '/standby/oradata/PROD/datafile/o1_mf_abmr_6rvqp087_.dbf' for corruption at rdba: 0x01400083 (file 5, block 131)
Reread (file 5, block 131) found same corrupt data (no logical check)

/* 以上dedicated server process 2次从disk读取该块都发现损坏后,启动了后台进程ABMR,
   在实例启动时abmr进程并不随instance启动,仅当需要时被服务进程启动 
 */

Starting background process ABMR
ABMR started with pid=33, OS id=24293 
Automatic block media recovery service is active.
Automatic block media recovery requested for (file# 5, block# 131)
Automatic block media recovery successful for (file# 5, block# 131)
Automatic block media recovery successful for (file# 5, block# 131)
WARNING: AutoBMR fixed mismatched on-disk single block 83 with in-mem rdba 1400083.

/* 同时abmr不是fatal的后台进程,杀死该进程不会导致不良影响 */

[maclean@rh6 ~]$ ps -ef|grep abmr
maclean  24293     1  0 20:43 ?        00:00:00 ora_abmr_PROD
maclean  24390 22254  0 20:49 pts/0    00:00:00 grep abmr
[maclean@rh6 ~]$ kill -9 24293

/* 如果不想使用Automatic Block Media Repair特性,抑或者因为该特性出现了一些问题的话,
   设置_auto_bmr为diabled即禁用该特性一般可以帮助我们绕过问题 */


SQL> alter system set "_auto_bmr"=disabled;
System altered.

SQL> select * from need_repair;
select * from need_repair
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 5, block # 131)
ORA-01110: data file 5:
'/standby/oradata/PROD/datafile/o1_mf_abmr_6rvqp087_.dbf'

/* 如预期的出现了ORA-01578错误 */

相关的告警日志内容 :
Hex dump of (file 5, block 131) in trace file /s01/diag/rdbms/prod/PROD/trace/PROD_ora_24742.trc
Corrupt block relative dba: 0x01400083 (file 5, block 131)
Bad header found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x00000083
 last change scn: 0x0000.00000000 seq: 0xff flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x000006ff
 check value in block header: 0x154d
 computed block checksum: 0x0
Reading datafile '/standby/oradata/PROD/datafile/o1_mf_abmr_6rvqp087_.dbf' for corruption at rdba: 0x01400083 (file 5, block 131)
Reread (file 5, block 131) found same corrupt data (no logical check)
Corrupt Block Found
         TSN = 5, TSNAME = ABMR
         RFN = 5, BLK = 131, RDBA = 20971651
         OBJN = 13773, OBJD = 13773, OBJECT = NEED_REPAIR, SUBOBJECT = 
         SEGMENT OWNER = MACLEAN, SEGMENT TYPE = Table Segment
Errors in file /s01/diag/rdbms/prod/PROD/trace/PROD_ora_24742.trc  (incident=5081):
ORA-01578: ORACLE data block corrupted (file # 5, block # 131)
ORA-01110: data file 5: '/standby/oradata/PROD/datafile/o1_mf_abmr_6rvqp087_.dbf'
Incident details in: /s01/diag/rdbms/prod/PROD/incident/incdir_5081/PROD_ora_24742_i5081.trc
2011-03-26 21:08:18.718000 +08:00
Sweep [inc][5081]: completed
Hex dump of (file 5, block 131) in trace file /s01/diag/rdbms/prod/PROD/incident/incdir_5081/PROD_m000_24753_i5081_a.trc
Corrupt block relative dba: 0x01400083 (file 5, block 131)
Bad header found during validation
Data in bad block:
 type: 6 format: 2 rdba: 0x00000083
 last change scn: 0x0000.00000000 seq: 0xff flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x000006ff
 check value in block header: 0x154d
 computed block checksum: 0x0
Reread of blocknum=131, file=/standby/oradata/PROD/datafile/o1_mf_abmr_6rvqp087_.dbf. found same corrupt data
Reread of blocknum=131, file=/standby/oradata/PROD/datafile/o1_mf_abmr_6rvqp087_.dbf. found same corrupt data
Reread of blocknum=131, file=/standby/oradata/PROD/datafile/o1_mf_abmr_6rvqp087_.dbf. found same corrupt data
Reread of blocknum=131, file=/standby/oradata/PROD/datafile/o1_mf_abmr_6rvqp087_.dbf. found same corrupt data
Reread of blocknum=131, file=/standby/oradata/PROD/datafile/o1_mf_abmr_6rvqp087_.dbf. found same corrupt data
Checker run found 1 new persistent data failures
Dumping diagnostic data in directory=[cdmp_20110326210819], requested by (instance=1, osid=24742), summary=[incident=5081].
2011-03-26 21:08:21.458000 +08:00
Sweep [inc2][5081]: completed

to be continued …………

Duplicate standby database from active database

11g Release1 中引入了新的RMAN duplicate命令,即duplicate from active database命令。利用该命令可以更加便捷地创建Data Guard环境,你甚至不需要将Primary Database shutdown(整个过程中主库都可以处于打开状态下),也不需要在配置前做一些额外的备份操作,仅需要配置起auxiliary辅助实例,同时创建密码文件,并在监听(LISTENER)中加入静态注册信息后就可以开始工作了!

以下示例中我们会在Primary Database所在的同一台服务器上部署一套物理备库(Physical Standby),实际上这样比在不同的2台服务器上配置Data Guard要复杂些,原因是那样的话我们无需做File Name的转换,具体的环境:

Database Role DB_UNIQUE_NAME
Primary Database PROD
Standby Database SBDB

 

/* 需要注意的是DG环境中force logging,不要用了新特性就将这个基本的要求忘记了 ! */

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE	11.2.0.2.0	Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

/* 显然PROD主库应当处于归档模式并且是FORCE LOGGING状态下 */

SQL> alter database force logging;
Database altered.

同时在Primary Database上创建必要的standby logfile:

SQL> alter database add standby logfile group 7 size 50M;
Database altered.

SQL> alter database add standby logfile group 8 size 50M;
Database altered.
 ..............

/* standby实例只需要db_name参数就可以启动到nomount模式了,
   并为standby数据库创建密码文件 */

[maclean@rh6 ~]$ echo "db_name=PROD" >$ORACLE_HOME/dbs/initSBDB.ora

[maclean@rh6 ~]$ orapwd file=$ORACLE_HOME/dbs/orapwSBDB password=XXXX entries=10

[maclean@rh6 ~]$ cd $ORACLE_HOME/network/admin

我们需要保证PROD和SBDB实例在监听器Listener中被静态注册,同时也包括DGMGRL需要用到的*_DGMGRL服务名

[maclean@rh6 admin]$ cat listener.ora
# listener23920.ora Network Configuration File: /tmp/listener23920.ora
# Generated by Oracle configuration tools.

DGLSN =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rh6.oracle.com)(PORT = 1588))
  )

SID_LIST_DGLSN =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = PROD)
      (ORACLE_HOME = /s01/product/11.2.0/dbhome_2)
      (SID_NAME = PROD)
    )
    (SID_DESC =
      (GLOBAL_DBNAME = SBDB)
      (ORACLE_HOME = /s01/product/11.2.0/dbhome_2)
      (SID_NAME = SBDB)
    )
    (SID_DESC =
      (GLOBAL_DBNAME = PROD_DGMGRL)
      (ORACLE_HOME = /s01/product/11.2.0/dbhome_2)
      (SID_NAME = PROD)
    )
    (SID_DESC =
      (GLOBAL_DBNAME = SBDB_DGMGRL)
      (ORACLE_HOME = /s01/product/11.2.0/dbhome_2)
      (SID_NAME = SBDB)
    )
  )

/* 并在tnsnames.ora文件中加入必要的服务别名 */

[maclean@rh6 admin]$ cat tnsnames.ora
# tnsnames.ora Network Configuration File: /s01/product/11.2.0/dbhome_2/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.

PROD =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rh6.oracle.com)(PORT = 1588))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = PROD)
    )
  )

SBDB =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rh6.oracle.com)(PORT = 1588))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = SBDB)
    )
  )

DGLSN =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rh6.oracle.com)(PORT = 1588))
  )

接着我们来启动辅助实例

[maclean@rh6 ~]$ export ORACLE_SID=SBDB
[maclean@rh6 ~]$ sqlplus  / as sysdba
SQL> startup  nomount;
ORACLE instance started.

/* 使用远程登录测试是否可以以SYSDBA登录SBDB实例 */

[maclean@rh6 ~]$ sqlplus  sys/XXXXXX@SBDB as sysdba

下面我们需要准备一份duplicate standby database的脚本,
因为我们是在同一台主机上配置Data Guard所以这里要用到db_file_name_convert和log_file_name_convert,
如果你的环境中不需要这么做那么你可以指定NOFILENAMECHECK;

以下为示例的脚本,可以看到它并不复杂只是指定了必要的初始化参数,十分易于记忆。

[maclean@rh6 ~]$ cat duplicate_act_standby.rcv 

duplicate target database
for standby
from active database
DORECOVER
spfile
set db_unique_name='SBDB'
set log_archive_dest_1='location=/standby/arch02'
set log_archive_dest_2='service=PROD async lgwr register valid_for=(online_logfile,primary_role) db_unique_name=PROD'
set standby_file_management='AUTO'
set fal_server='PROD'
set fal_client='SBDB'
set control_files='/standby/oradata/SBDB/controlfile/control1.ctl','/standby/oradata/SBDB/controlfile/control2.ctl'
set db_file_name_convert='PROD','SBDB'
set log_file_name_convert='PROD','SBDB'
set memory_target='0'
set sga_target='400M';

具体执行以上脚本,我们需要同时登录target database PROD和auxiliary instance辅助实例SBDB:

[maclean@rh6 ~]$ echo $ORACLE_SID
PROD

[maclean@rh6 ~]$ rman target / auxiliary sys/oracle@SBDB cmdfile=duplicate_act_standby.rcv log=das.log

connected to target database: PROD (DBID=158660885)
connected to auxiliary database: PROD (not mounted)

RMAN> duplicate target database
2> for standby
3> from active database
4> DORECOVER
5> spfile
6> set db_unique_name='SBDB'
7> set log_archive_dest_1='location=/standby/arch02'
8> set log_archive_dest_2='service=PROD async lgwr register valid_for=(online_logfile,primary_role) db_unique_name=PROD'
9> set standby_file_management='AUTO'
10> set fal_server='PROD'
11> set fal_client='SBDB'
12> set control_files='/standby/oradata/SBDB/controlfile/control1.ctl','/standby/oradata/SBDB/controlfile/control2.ctl'
13> set db_file_name_convert='PROD','SBDB'
14> set log_file_name_convert='PROD','SBDB'
15> set memory_target='0'
16> set sga_target='400M';
17>
Starting Duplicate Db at 26-MAR-11
using target database control file instead of recovery catalog
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=96 device type=DISK

contents of Memory Script:
{
   backup as copy reuse
   targetfile  '/s01/product/11.2.0/dbhome_2/dbs/orapwPROD' auxiliary format
 '/s01/product/11.2.0/dbhome_2/dbs/orapwSBDB'   targetfile
 '/s01/product/11.2.0/dbhome_2/dbs/spfilePROD.ora' auxiliary format
 '/s01/product/11.2.0/dbhome_2/dbs/spfileSBDB.ora'   ;
   sql clone "alter system set spfile= ''/s01/product/11.2.0/dbhome_2/dbs/spfileSBDB.ora''";
}
executing Memory Script

Starting backup at 26-MAR-11
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=17 device type=DISK
Finished backup at 26-MAR-11

sql statement: alter system set spfile= ''/s01/product/11.2.0/dbhome_2/dbs/spfileSBDB.ora''

contents of Memory Script:
{
   sql clone "alter system set  db_unique_name =
 ''SBDB'' comment=
 '''' scope=spfile";
   sql clone "alter system set  log_archive_dest_1 =
 ''location=/standby/arch02'' comment=
 '''' scope=spfile";
   sql clone "alter system set  log_archive_dest_2 =
 ''service=PROD async lgwr register valid_for=(online_logfile,primary_role) db_unique_name=PROD'' comment=
 '''' scope=spfile";
   sql clone "alter system set  standby_file_management =
 ''AUTO'' comment=
 '''' scope=spfile";
   sql clone "alter system set  fal_server =
 ''PROD'' comment=
 '''' scope=spfile";
   sql clone "alter system set  fal_client =
 ''SBDB'' comment=
 '''' scope=spfile";
   sql clone "alter system set  control_files =
 ''/standby/oradata/SBDB/controlfile/control1.ctl'', ''/standby/oradata/SBDB/controlfile/control2.ctl'' comment=
 '''' scope=spfile";
   sql clone "alter system set  db_file_name_convert =
 ''PROD'', ''SBDB'' comment=
 '''' scope=spfile";
   sql clone "alter system set  log_file_name_convert =
 ''PROD'', ''SBDB'' comment=
 '''' scope=spfile";
   sql clone "alter system set  memory_target =
 0 comment=
 '''' scope=spfile";
   sql clone "alter system set  sga_target =
 400M comment=
 '''' scope=spfile";
   shutdown clone immediate;
   startup clone nomount;
}
executing Memory Script

sql statement: alter system set  db_unique_name =  ''SBDB'' comment= '''' scope=spfile

sql statement: alter system set  log_archive_dest_1 =  ''location=/standby/arch02'' comment= '''' scope=spfile

sql statement: alter system set  log_archive_dest_2 =  
''service=PROD async lgwr register valid_for=(online_logfile,primary_role) db_unique_name=PROD'' 
comment= '''' scope=spfile

sql statement: alter system set  standby_file_management =  ''AUTO'' comment= '''' scope=spfile

sql statement: alter system set  fal_server =  ''PROD'' comment= '''' scope=spfile

sql statement: alter system set  fal_client =  ''SBDB'' comment= '''' scope=spfile

sql statement: alter system set  control_files =  ''/standby/oradata/SBDB/controlfile/control1.ctl'',
''/standby/oradata/SBDB/controlfile/control2.ctl''
comment= '''' scope=spfile

sql statement: alter system set  db_file_name_convert =  ''PROD'', ''SBDB'' comment= '''' scope=spfile

sql statement: alter system set  log_file_name_convert =  ''PROD'', ''SBDB'' comment= '''' scope=spfile

sql statement: alter system set  memory_target =  0 comment= '''' scope=spfile

sql statement: alter system set  sga_target =  400M comment= '''' scope=spfile

Oracle instance shut down

connected to auxiliary database (not started)
Oracle instance started

Total System Global Area     417546240 bytes

Fixed Size                     2227072 bytes
Variable Size                134218880 bytes
Database Buffers             272629760 bytes
Redo Buffers                   8470528 bytes

contents of Memory Script:
{
   backup as copy current controlfile for standby auxiliary format  '/standby/oradata/SBDB/controlfile/control1.ctl';
   restore clone controlfile to  '/standby/oradata/SBDB/controlfile/control2.ctl' from
 '/standby/oradata/SBDB/controlfile/control1.ctl';
}
executing Memory Script

Starting backup at 26-MAR-11
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
copying standby control file
output file name=/s01/product/11.2.0/dbhome_2/dbs/snapcf_PROD.f tag=TAG20110326T195144 RECID=2 STAMP=746826704
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
Finished backup at 26-MAR-11

Starting restore at 26-MAR-11
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=134 device type=DISK

channel ORA_AUX_DISK_1: copied control file copy
Finished restore at 26-MAR-11

contents of Memory Script:
{
   sql clone 'alter database mount standby database';
}
executing Memory Script

sql statement: alter database mount standby database

contents of Memory Script:
{
   set newname for tempfile  1 to
 "/standby/oradata/SBDB/datafile/o1_mf_temp_6rvjsmr4_.tmp";
   switch clone tempfile all;
   set newname for datafile  1 to
 "/standby/oradata/SBDB/datafile/o1_mf_system_6rvjrtxh_.dbf";
   set newname for datafile  2 to
 "/standby/oradata/SBDB/datafile/o1_mf_sysaux_6rvjs6vh_.dbf";
   set newname for datafile  3 to
 "/standby/oradata/SBDB/datafile/o1_mf_undotbs1_6rvjsjjg_.dbf";
   set newname for datafile  4 to
 "/standby/oradata/SBDB/datafile/o1_mf_users_6rvjsy5q_.dbf";
   backup as copy reuse
   datafile  1 auxiliary format
 "/standby/oradata/SBDB/datafile/o1_mf_system_6rvjrtxh_.dbf"   datafile
 2 auxiliary format
 "/standby/oradata/SBDB/datafile/o1_mf_sysaux_6rvjs6vh_.dbf"   datafile
 3 auxiliary format
 "/standby/oradata/SBDB/datafile/o1_mf_undotbs1_6rvjsjjg_.dbf"   datafile
 4 auxiliary format
 "/standby/oradata/SBDB/datafile/o1_mf_users_6rvjsy5q_.dbf"   ;
   sql 'alter system archive log current';
}
executing Memory Script

executing command: SET NEWNAME

renamed tempfile 1 to /standby/oradata/SBDB/datafile/o1_mf_temp_6rvjsmr4_.tmp in control file

executing command: SET NEWNAME

executing command: SET NEWNAME

executing command: SET NEWNAME

executing command: SET NEWNAME

Starting backup at 26-MAR-11
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile file number=00001 name=/standby/oradata/PROD/datafile/o1_mf_system_6rvjrtxh_.dbf
output file name=/standby/oradata/SBDB/datafile/o1_mf_system_6rvjrtxh_.dbf tag=TAG20110326T195152
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile file number=00002 name=/standby/oradata/PROD/datafile/o1_mf_sysaux_6rvjs6vh_.dbf
output file name=/standby/oradata/SBDB/datafile/o1_mf_sysaux_6rvjs6vh_.dbf tag=TAG20110326T195152
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile file number=00003 name=/standby/oradata/PROD/datafile/o1_mf_undotbs1_6rvjsjjg_.dbf
output file name=/standby/oradata/SBDB/datafile/o1_mf_undotbs1_6rvjsjjg_.dbf tag=TAG20110326T195152
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=/standby/oradata/PROD/datafile/o1_mf_users_6rvjsy5q_.dbf
output file name=/standby/oradata/SBDB/datafile/o1_mf_users_6rvjsy5q_.dbf tag=TAG20110326T195152
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
Finished backup at 26-MAR-11

sql statement: alter system archive log current

contents of Memory Script:
{
   backup as copy reuse
   archivelog like  "/standby/arch01/1_17_746822549.dbf" auxiliary format
 "/standby/arch02/1_17_746822549.dbf"   ;
   catalog clone archivelog  "/standby/arch02/1_17_746822549.dbf";
   switch clone datafile all;
}
executing Memory Script

Starting backup at 26-MAR-11
using channel ORA_DISK_1
channel ORA_DISK_1: starting archived log copy
input archived log thread=1 sequence=17 RECID=3 STAMP=746826751
output file name=/standby/arch02/1_17_746822549.dbf RECID=0 STAMP=0
channel ORA_DISK_1: archived log copy complete, elapsed time: 00:00:01
Finished backup at 26-MAR-11

cataloged archived log
archived log file name=/standby/arch02/1_17_746822549.dbf RECID=1 STAMP=746826752

datafile 1 switched to datafile copy
input datafile copy RECID=2 STAMP=746826752 file name=/standby/oradata/SBDB/datafile/o1_mf_system_6rvjrtxh_.dbf
datafile 2 switched to datafile copy
input datafile copy RECID=3 STAMP=746826752 file name=/standby/oradata/SBDB/datafile/o1_mf_sysaux_6rvjs6vh_.dbf
datafile 3 switched to datafile copy
input datafile copy RECID=4 STAMP=746826752 file name=/standby/oradata/SBDB/datafile/o1_mf_undotbs1_6rvjsjjg_.dbf
datafile 4 switched to datafile copy
input datafile copy RECID=5 STAMP=746826752 file name=/standby/oradata/SBDB/datafile/o1_mf_users_6rvjsy5q_.dbf

contents of Memory Script:
{
   set until scn  242517;
   recover
   standby
   clone database
    delete archivelog
   ;
}
executing Memory Script

executing command: SET until clause

Starting recover at 26-MAR-11
using channel ORA_AUX_DISK_1

starting media recovery

archived log for thread 1 with sequence 17 is already on disk as file /standby/arch02/1_17_746822549.dbf
archived log file name=/standby/arch02/1_17_746822549.dbf thread=1 sequence=17
media recovery complete, elapsed time: 00:00:00
Finished recover at 26-MAR-11
Finished Duplicate Db at 26-MAR-11

Recovery Manager complete.

[maclean@rh6 ~]$ export ORACLE_SID=PROD
[maclean@rh6 ~]$ sqlplus  / as sysdba

/* 在主库PROD上设置到物理备库SBDB的归档目的地 */

SQL> alter system set log_archive_dest_2='service=sbdb lgwr async  
valid_for=(online_logfiles,primary_role) db_unique_name=SBDB';
System altered.

以上完成了对物理备库Physical Standby的配置,紧接着我们来配合Data Broker:

SQL> alter system set dg_broker_start=true;
System altered.

[maclean@rh6 ~]$ export ORACLE_SID=SBDB
[maclean@rh6 ~]$ sqlplus  / as sysdba

SQL>  alter system set dg_broker_start=true;
System altered.

[maclean@rh6 ~]$ dgmgrl sys/oracle@PROD
DGMGRL for Linux: Version 11.2.0.2.0 - 64bit Production

Copyright (c) 2000, 2009, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected.

DGMGRL>  create CONFIGURATION PROD as PRIMARY DATABASE IS PROD  CONNECT IDENTIFIER IS PROD;
Configuration "prod" created with primary database "prod"

DGMGRL> add database sbdb  AS CONNECT IDENTIFIER IS sbdb MAINTAINED AS PHYSICAL;
Database "sbdb" added

DGMGRL> enable configuration;
Enabled.

DGMGRL> edit database prod set property LogXptMode='sync';
Property "logxptmode" updated
DGMGRL> edit database sbdb set property LogXptMode='sync';
Property "logxptmode" updated

/* 修改当前DG的保护模式为最大可用模式MaxAvailability */

DGMGRL> edit CONFIGURATION SET PROTECTION MODE  as MaxAvailability;
Succeeded.

DGMGRL> show configuration;

Configuration - prod

  Protection Mode: MaxAvailability
  Databases:
    prod - Primary database
    sbdb - Physical standby database
      Error: ORA-16766: Redo Apply is stopped

Fast-Start Failover: DISABLED

Configuration Status:
ERROR

/* 以上可以看到物理备库SBDB上的REDO APPLY被停止了,我们可以在DGMGRL中启动其REDO APPLY */

DGMGRL> edit database sbdb set state='APPLY-ON';
Succeeded.

DGMGRL> show configuration;

Configuration - prod

  Protection Mode: MaxAvailability
  Databases:
    prod - Primary database
    sbdb - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

/* 以下使用DGMGRL来回切换主备库角色,十分方便 */

DGMGRL> switchover to sbdb;
Performing switchover NOW, please wait...
New primary database "sbdb" is opening...
Operation requires shutdown of instance "PROD" on database "prod"
Shutting down instance "PROD"...
ORACLE instance shut down.
Operation requires startup of instance "PROD" on database "prod"
Starting instance "PROD"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "sbdb"

DGMGRL> switchover to prod;
Performing switchover NOW, please wait...
New primary database "prod" is opening...
Operation requires shutdown of instance "SBDB" on database "sbdb"
Shutting down instance "SBDB"...
ORACLE instance shut down.
Operation requires startup of instance "SBDB" on database "sbdb"
Starting instance "SBDB"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "prod"

fast incremental backup failed on standby database

一套Linux上的11.1.0.7的physical standby物理备库在使用fast incremental backup进行高于0级的增量备份时会出现ORA-19648错误,其出错记录如下:

RMAN>  backup incremental level 1 database;

Starting backup at 22-MAR-11
using channel ORA_DISK_1
using channel ORA_DISK_2
channel ORA_DISK_1: starting incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/standby/oradata/SBDB2/datafile/o1_mf_sysaux_22m6ov92_.dbf
input datafile file number=00003 name=/standby/oradata/SBDB2/datafile/o1_mf_undotbs1_23m6ovap_.dbf
input datafile file number=00006 name=/standby/oradata/SBDB2/datafile/o1_mf_enc_25m6ovba_.dbf
channel ORA_DISK_1: starting piece 1 at 22-MAR-11
channel ORA_DISK_2: starting incremental level 1 datafile backup set
channel ORA_DISK_2: specifying datafile(s) in backup set
input datafile file number=00001 name=/standby/oradata/SBDB2/datafile/o1_mf_system_21m6ov92_.dbf
input datafile file number=00005 name=/standby/oradata/SBDB2/datafile/o1_mf_example_24m6ovar_.dbf
input datafile file number=00004 name=/standby/oradata/SBDB2/datafile/o1_mf_users_26m6ovba_.dbf
channel ORA_DISK_2: starting piece 1 at 22-MAR-11
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 03/22/2011 20:20:28
ORA-19648: datafile 2: incremental-start SCN equals checkpoint SCN
ORA-19640: datafile checkpoint is SCN 1249353 time 03/09/2011 05:50:27
continuing other job steps, job failed will not be re-run
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_2 channel at 03/22/2011 20:22:33
ORA-19648: datafile 1: incremental-start SCN equals checkpoint SCN
ORA-19640: datafile checkpoint is SCN 1249352 time 03/09/2011 05:50:26

经过分析该ORA-19640->ORA-19648是由11.1.0.7上的Bug引起的,MOS已经确认其为Bug 9288598:

Affects:
Product (Component)	 Oracle Server (Rdbms)
Range of versions believed to be affected	 Versions BELOW 12.1
Versions confirmed as being affected	
11.1.0.7
Platforms affected	 Generic (all / most platforms affected)
Fixed:

This issue is fixed in	
12.1 (Future Release)
11.2.0.2 (Server Patch Set)

An RMAN Backup incremental level 1 can fail with ORA-19648 / ORA-19640 if the standby database is open read only rather than just skipping the datafile.
PROBLEM:
--------
Following error occurring during incremental backup:

Starting backup at 02-DEC-2009 20:37:44
allocated channel: ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: SID=154 device type=SBT_TAPE
channel ORA_SBT_TAPE_1: Veritas NetBackup for Oracle - Release 6.5 
(2008052301)
channel ORA_SBT_TAPE_1: starting incremental level 1 datafile backup set
channel ORA_SBT_TAPE_1: specifying datafile(s) in backup set
input datafile file number=00002 
name=/data/ora_data01/KAHCB4P/KAHCB4P_sysaux_01.dbf
input datafile file number=00001 
name=/data/ora_data01/KAHCB4P/KAHCB4P_system_01.dbf
input datafile file number=00003 
name=/data/ora_data01/KAHCB4P/KAHCB4P_undo_01.dbf
input datafile file number=00004 
name=/data/ora_data01/KAHCB4P/KAHCB4P_users_01.dbf
input datafile file number=00005 
name=/data/ora_data01/KAHCB4P/KAHCB4P_tools_01.dbf
input datafile file number=00006 
name=/data/ora_data01/KAHCB4P/KAHCB4P_backup_test_01.dbf
channel ORA_SBT_TAPE_1: starting piece 1 at 02-DEC-2009 20:38:01
RMAN-3009: failure of backup command on ORA_SBT_TAPE_1 channel at 12/02/2009 
20:38:02
ORA-19648: datafile 2: incremental-start SCN equals checkpoint SCN
ORA-19640: datafile checkpoint is SCN 3911695 time 11/27/2009 12:36:48
continuing other job steps, job failed will not be re-run
channel ORA_SBT_TAPE_1: starting incremental level 1 datafile backup set
channel ORA_SBT_TAPE_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set

DIAGNOSTIC ANALYSIS:
--------------------
Same problem as Bug 6903819
Requested backport, but this hasn't fixed the problem.
Confirmed that the patch was installed correctly and relinked successfully.

RELEASE NOTES:
]]Backup incremental leve 1 fails with ORA-19648 and ORA-19640 when standby
]]database opened for read only.
*** 02/14/10 02:43 pm *** (ADD: Impact/Symptom->FEATURE UNUSABLE )

REDISCOVERY INFORMATION:
Backup incremental level 1 fails with ORA-19648 and the standby database is
open read-only.
WORKAROUND:
None

可以从上述Note中看到该Bug需要到11.2.0.2中才得到fix,那么在11.1.0.7上我们有什么办法能解决或者绕过该问题吗?
经过测试,我得到2种workaround的方法:
该ORA-19648错误首先可以通过在物理备库(physical standby)上停止介质恢复(MRP)进程的方式来workaround:

SQL>  alter database recover managed standby database cancel;
Database altered.

RMAN> backup incremental level 1 database;

Starting backup at 22-MAR-11
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=141 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=140 device type=DISK
channel ORA_DISK_1: starting incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/standby/oradata/SBDB2/datafile/o1_mf_sysaux_22m6ov92_.dbf
input datafile file number=00004 name=/standby/oradata/SBDB2/datafile/o1_mf_users_26m6ovba_.dbf
input datafile file number=00006 name=/standby/oradata/SBDB2/datafile/o1_mf_enc_25m6ovba_.dbf
channel ORA_DISK_1: starting piece 1 at 22-MAR-11
channel ORA_DISK_2: starting incremental level 1 datafile backup set
channel ORA_DISK_2: specifying datafile(s) in backup set
input datafile file number=00001 name=/standby/oradata/SBDB2/datafile/o1_mf_system_21m6ov92_.dbf
input datafile file number=00003 name=/standby/oradata/SBDB2/datafile/o1_mf_undotbs1_23m6ovap_.dbf
input datafile file number=00005 name=/standby/oradata/SBDB2/datafile/o1_mf_example_24m6ovar_.dbf
channel ORA_DISK_2: starting piece 1 at 22-MAR-11
channel ORA_DISK_1: finished piece 1 at 22-MAR-11
piece handle=/standby/backup/2km7suui_1_1.bak tag=TAG20110322T212538 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:33
channel ORA_DISK_2: finished piece 1 at 22-MAR-11
piece handle=/standby/backup/2lm7suv1_1_1.bak tag=TAG20110322T212538 comment=NONE
channel ORA_DISK_2: backup set complete, elapsed time: 00:00:27
Finished backup at 22-MAR-11

Starting Control File and SPFILE Autobackup at 22-MAR-11
piece handle=/standby/backup/c-157018592-20110322-01.bak comment=NONE
Finished Control File and SPFILE Autobackup at 22-MAR-11

该ORA-19648错误也可以通过禁用fast incremental backup的块跟踪(block change tracking)特性来绕过问题,当然在可能的情况下我们更推荐使用上面那种方法,因为毕竟还可以利用到fast incremental backup特性:

SQL> alter database recover managed standby database using current logfile disconnect;
Database altered.

SQL> select process,status from v$managed_standby;

PROCESS   STATUS
--------- ------------
ARCH	  CONNECTED
ARCH	  CONNECTED
ARCH	  CONNECTED
ARCH	  CONNECTED
MRP0	  APPLYING_LOG

SQL> select status from v$block_change_tracking;

STATUS
----------
ENABLED


SQL> alter database disable block change tracking;
Database altered.


RMAN>  backup incremental level 2 database;

Starting backup at 22-MAR-11
using channel ORA_DISK_1
using channel ORA_DISK_2
channel ORA_DISK_1: starting incremental level 2 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/standby/oradata/SBDB2/datafile/o1_mf_sysaux_22m6ov92_.dbf
input datafile file number=00004 name=/standby/oradata/SBDB2/datafile/o1_mf_users_26m6ovba_.dbf
input datafile file number=00006 name=/standby/oradata/SBDB2/datafile/o1_mf_enc_25m6ovba_.dbf
channel ORA_DISK_1: starting piece 1 at 22-MAR-11
channel ORA_DISK_2: starting incremental level 2 datafile backup set
channel ORA_DISK_2: specifying datafile(s) in backup set
input datafile file number=00001 name=/standby/oradata/SBDB2/datafile/o1_mf_system_21m6ov92_.dbf
input datafile file number=00003 name=/standby/oradata/SBDB2/datafile/o1_mf_undotbs1_23m6ovap_.dbf
input datafile file number=00005 name=/standby/oradata/SBDB2/datafile/o1_mf_example_24m6ovar_.dbf
channel ORA_DISK_2: starting piece 1 at 22-MAR-11
channel ORA_DISK_1: finished piece 1 at 22-MAR-11
piece handle=/standby/backup/2nm7sv8d_1_1.bak tag=TAG20110322T213052 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:15
channel ORA_DISK_2: finished piece 1 at 22-MAR-11
piece handle=/standby/backup/2om7sv8s_1_1.bak tag=TAG20110322T213052 comment=NONE
channel ORA_DISK_2: backup set complete, elapsed time: 00:00:15
Finished backup at 22-MAR-11

Starting Control File and SPFILE Autobackup at 22-MAR-11
piece handle=/standby/backup/c-157018592-20110322-02.bak comment=NONE
Finished Control File and SPFILE Autobackup at 22-MAR-11

11g Database Installation flow

Oracle Database 11g在OUI下的安装流程图:
11g_OUI_installation

Large Memory Footprints on AIX

Connor Mcdonald-一位Oracle极客为我们分享了一个AIX平台上11g独享服务进程内存占用过量的问题,该问题最后被确认为Bug”11G SERVER PROCESSES CONSUMING MUCH MORE MEMORY THAT 10G OR 9I”,相关文档如下:

 

Memory Footprint For Dedicated Server Processes More Than Doubled After 11g Upgrade On AIX Platform [ID 1246995.1]

Bug 9796810: 11G SERVER PROCESSES CONSUMING MUCH MORE MEMORY THAT 10G OR 9I

Bug 10190759: PROCESSES CONSUMING ADDITIONAL MEMORY DUE TO ‘USLA HEAP’

可以看到上述问题仅发生在从9i/10g升级到11g后,作为一个已确认的升级Bug值得我们大家去关注;最近几年这样的升级会越来越多,同时希望该Bug能在11.2.0.3中修复。

实际上我在10.2.0.3上就遇到过类似的Process Large Footprints问题:用户在打上一个one-off patch[6110331]后单个server process的rss量明显上升,主机的内存使用量大幅提高,虽然这个问题同样提交了SR,但最后没有确认为Bug;用户试图询问Oracle GCS关于rss上升的原因,但语焉而不详。

Search Criteria:AIX 11.2

Memory Footprint For Dedicated Server Processes More Than Doubled After 11g Upgrade On AIX Platform (Doc ID 1246995.1)

1. Have you installed the patch 10190759 ?

Review the note:
Memory Footprint For Dedicated Server Processes More Than Doubled After 11g Upgrade On AIX Platform (Doc ID 1246995.1)

If you have not installed the patch ?
–>>there is one available for 11.2.0.2.0, 11.2.0.2.2, 11.2.0.2.3

If you need me to review the patches you have installed you can upload the opatch listing?

opatch lsinventory -patch -detail

2. If you have already installed the patch 10190759 then

The additional memory seen allocated to oracle processes in the 11.2 release is a consequence of the additional link options added to the oracle link
line, -bexpfull and -brtllib. The two link options were specifically added in 11.2.0.1 to support the online patching feature.
Patch Name or Number: 10190759

 

Changes in the make file have been implemented such that you can relink without these options (-bexpfull and -brtllib) to avoid
additional memory overhead incurred by adding these options.These changes are available via a one-off patch.

This is a known bug: BUG:10190759 – PROCESSES CONSUMING ADDITIONAL MEMORY DUE TO ‘USLA HEAP’

Install  Patch: 10190759

Setup Oracle Direct NFS Client

在Oracle 11g中引入了Direct Network File System(Oracle Direct NFS)的新特性,通过一个打包在Oracle内核中的NFS客户机以改善实例使用NFS时的性能,同时进一步完善了通过NFS实现RAC的解决方案。常规的NFS客户端软件一般由操作系统供应商提供,这类NFS客户端不会专门为Oracle数据文件的IO做优化。而通过内建的Oracle Direct NFS,数据库将可以直接访问NFS服务器上的文件,避免由OS内核NFS造成的额外开销。Oracle宣称由以上优化所带来的性能提升,在DSS环境中超过40%,而在OLTP环境中超过10%(详见<Oracle Database 11g  Direct NFS Client a white paper>)。

接下来我们将通过实例来演示如何构建Oracle Direct NFS客户机,实际上这并不困难;Direct NFS客户机会以如下顺序查找装载点的设置信息:

  • $ORACLE_HOME/dbs/oranfstab 作用域为$ORACLE_HOME相关的数据库
  • /etc/oranfstab 作用域为主机上所有可用数据库
  • 最后为/etc/mtab配置文件,以确定可用的NFS装载点

我们一般推荐使用$ORACLE_HOME/dbs/oranfstab来配置Direct NFS客户机;该oranfstab配置文件可以包括Server,path,export以及mount参数,各参数代表的属性如下:

  • Server:NFS服务器名
  • Path:到达NFS服务器的最多4个网络路径,可以是IP或者主机名
  • Export:从NFS服务器导出的路径
  • Mount:NFS的本地装载点

需要注意的是如果是在使用NFS实现RAC的情景中,那么必须使用/etc/oranfstab配置文件。并且该配置文件应当在所有节点上时同步的。
在正式启用Direct NFS客户机前,NFS文件系统应当已由常规NFS方式mount并且可用。为了启用Direct NFS client,我们还需要将标准的Oracle磁盘管理库(Oracle Disk Manager (ODM) library)替换为支持Direct NFS client的NFS ODM。可以通过建立从标准ODM库指向NFS ODM库的符号链接来完成以上工作,但是需要注意的是以上操作仅能在实例关闭的情况下才能实施并且有效。

SQL> shutdown immediate;

[maclean@rh2 ~]$ cd $ORACLE_HOME/lib
[maclean@rh2 lib]$ mv libodm11.so libodm11.so.old
[maclean@rh2 lib]$ ln -s libnfsodm11.so libodm11.so
[maclean@rh2 lib]$ ls -l libodm11.so
lrwxrwxrwx 1 maclean oinstall 14 Feb 18 19:27 libodm11.so -> libnfsodm11.so

接下来我们将正式启用Direct NFS Client,并会简单测试其性能:

[root@rh2 ~]# showmount -e  nas
Export list for nas:
/d01 rh2

/* 以常规方式装载NFS文件系统 */
[root@rh2 ~]# mount -t nfs nas:/d01 /d01

[maclean@rh2 ~]$ cat $ORACLE_HOME/dbs/oranfstab
server: nas
path: 192.168.1.188
export: /d01  mount: /d01

SQL> startup;

启动阶段告警日志会出现以下信息:
Oracle instance running with ODM: Oracle Direct NFS ODM Library Version 3.0
.................
Direct NFS: attempting to mount /d01 on filer nas defined in oranfstab
Direct NFS: channel config is:
     channel id [0] local [] path [192.168.1.188]
Direct NFS: mount complete dir /d01 on nas mntport 998 nfsport 2049

SQL> desc v$dnfs_servers;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                                 NUMBER
 SVRNAME                                            VARCHAR2(255)
 DIRNAME                                            VARCHAR2(1024)
 MNTPORT                                            NUMBER
 NFSPORT                                            NUMBER
 WTMAX                                              NUMBER
 RTMAX                                              NUMBER

/*可以通过查询V$DNFS_SERVERS动态视图来了解NFS客户机的装载情况 */

SQL> col svrname for a10
SQL> col dirname for a10
SQL> select svrname,dirname,mntport,nfsport from v$dnfs_servers;

SVRNAME    DIRNAME       MNTPORT    NFSPORT
---------- ---------- ---------- ----------
nas        /d01              998       2049

/* 此外V$DNFS_FILES视图提供了当前Direct NFS打开的文件
         V$DNFS_CHANNELS视图提供了DNFS为服务器打开的网络路径
         V$DNFS_STATS视图提供了DNFS的性能统计信息 */

SQL> set timing on;
SQL> create tablespace nfs_perf datafile '/d01/perf01.dbf' size 2000M;
Tablespace created.

Elapsed: 00:00:30.91

SQL> shutdown immediate;

/* 通过还原libodm11.so库文件和删除oranfstab配置文件可以禁用Direct NFS */

[maclean@rh2 lib]$ cp libodm11.so.old libodm11.so
[maclean@rh2 lib]$ rm $ORACLE_HOME/dbs/oranfstab

SQL> select svrname,dirname,mntport,nfsport from v$dnfs_servers;
no rows selected

SQL> set timing on;
SQL> create tablespace nfs_kernel datafile '/d01/kernel01.dbf' size 2000M;
Tablespace created.

Elapsed: 00:00:29.45

/* 似乎从简单的数据文件创建时间上体现不出Oracle Direct Nfs的优势...*/

That's ok 

沪ICP备14014813号-2

沪公网安备 31010802001379号