Goldengate各build与Oracle数据库版本间的兼容性

因为在Edelivery.oracle.com上找不到Goldengate for Oracle 8i的build,所以一度猜想是不是能用GG for Oracle 9i对8i做extract。之后在metalink上发现了这个文档《Compatibility between OGG builds and Oracle database Versions》,明确介绍了ogg的build需要与数据库版本一致,即:

针对extract来说,Oracle数据库版本应与Oracle Golden Gate的build信息完全一致

举例而言

OGG 10g build(如Oracle GoldenGate V11.1.1.0.0 for Oracle 10g):该build可以在Oracle 10.1或10.2版本的数据库上做extract
OGG oracle 10.1 build:该build仅可以为Oracle 10.1版本的数据库做extract,其他版本均不可以
OGG oracle 10.2 build:该build仅可以为Oracle 10.2版本的数据库做extract,其他版本均不可以

针对replicat来说,数据库主版本号(major database version)应与OGG的build一致

举例而言

OGG 10g build:可以针对Oracle 10.1和10.2版本的数据库做replicat操作,其他版本均不可以
OGG oracle 10.1 build:可以针对Oracle 10.1和10.2版本的数据库做replicat操作,其他版本均不可以
OGG oracle 10.2 build:可以针对Oracle 10.1和10.2版本的数据库做replicat操作,其他版本均不可以

其他9i或者11g的OGG build与Oracle database版本间的兼容性也如上述关系一般。
该文档[1086154.1]还指出以上版本要求是因为各数据库版本间可能存在不同的data dictionary calls,而OGG中这部分代码是固定写死的:
“The reason behind these requirements is that the data dictionary calls may be different from one version of the database to the other and our code is database version specific”;
从这个文档来分析,不是因为Oracle各版本所产生的日志文件的格式存在差异而导致OGG无法做extract操作,那么如果在一个10g的数据库中设置了一个较低的compatible兼容性参数也不会导致OGG for Oracle 10g无法正确抽取日志文件了。
上述观点仍需测试,to be continued…….

ORA-00600:[kclchkinteg_2]及[kjmsm_epc]内部错误一例

一套AIX上的9.2.0.6 RAC系统,alert日志中最初报LMS进程(Lock Manager Server process,即锁服务管理进程,仅出现在RAC系统中)遭遇ORA-00600[kclchkinteg_2],继而出现ORA-00600[[kjmsm_epc]内部错误导致实例crash(instance crashed)。相关日志如下:

Tue Dec 1 01:20:25 2009
Errors in file /oracle/admin/crmdb/bdump/crmdb1_lms7_2143190.trc:
ORA-00600: internal error code, arguments: [kclchkinteg_2], [], [], [], [], [], [], []
Tue Dec 1 01:20:25 2009
Errors in file /oracle/admin/crmdb/bdump/crmdb1_lms7_2143190.trc:
ORA-00600: internal error code, arguments: [kclchkinteg_2], [], [], [], [], [], [], []
Tue Dec 1 01:20:26 2009
Trace dumping is performing id=[cdmp_20091201012026]
Tue Dec 1 01:20:40 2009
Errors in file /oracle/admin/crmdb/bdump/crmdb1_lms7_2143190.trc:
ORA-00600: internal error code, arguments: [kjmsm_epc], [], [], [], [], [], [], []
Tue Dec 1 01:20:41 2009
Errors in file /oracle/admin/crmdb/bdump/crmdb1_lms7_2143190.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [kjmsm_epc], [], [], [], [], [], [], []
Tue Dec 1 01:20:43 2009

经过和metalink确认,认为此次实例意外终止时由于”Bug 3671867 – OERI[kclchkinteg_2] possible in RAC environment”:

Hdr: 3671867 10.1.0.2 RDBMS 10.1.0.2 RAC PRODID-5 PORTID-212 ORA-600
Abstract: INSTANCE TERMINATED WITH ORA-600 [KCLCHKINTEG_2]
PROBLEM:
--------
During testing, one instance was terminated with ORA-600 [KCLCHKINTEG_2]

DIAGNOSTIC ANALYSIS:
--------------------
Alert log contains:

Thu Jun  3 12:57:26 2004
Private_strands 0 at log switch
Thread 1 advanced to log sequence 153
  Current log# 1 seq# 153 mem# 0: +EMDBD1/dbotazc0/onlinelog/group_1.267.1
  Current log# 1 seq# 153 mem# 1: +EMDBD1/dbotazc0/onlinelog/group_1.268.1
Thu Jun  3 13:41:45 2004
Private_strands 0 at log switch
Thread 1 advanced to log sequence 154
  Current log# 2 seq# 154 mem# 0: +EMDBD1/dbotazc0/onlinelog/group_2.269.1
  Current log# 2 seq# 154 mem# 1: +EMDBD1/dbotazc0/onlinelog/group_2.270.1
Thu Jun  3 13:55:06 2004
Errors in file /DBA/nest/oracle/DBOTA1/logs/dbota1_dbw0_82244.trc:
ORA-600: internal error code, arguments: [kclchkinteg_2], [], [], [], [],
[], [], []
Thu Jun  3 13:55:08 2004
Errors in file /DBA/nest/oracle/DBOTA1/logs/dbota1_dbw0_82244.trc:
ORA-600: internal error code, arguments: [kclchkinteg_2], [], [], [], [],
[], [], []
Thu Jun  3 13:55:08 2004
DBW0: terminating instance due to error 471
Thu Jun  3 13:55:08 2004
Trace dumping is performing id=[cdmp_20040603135508]
Thu Jun  3 13:55:10 2004
Dump system state for local instance only
Thu Jun  3 13:55:10 2004
Trace dumping is performing id=[cdmp_20040603135510]
The only trace file we have been sent is the dbw trace.
WORKAROUND:
-----------
None
RELATED BUGS:
-------------
There are several which end in KCLCHKINTEG but not KCLCHKINTEG_2
REPRODUCIBILITY:
----------------
Appears only to have happened once so far.
TEST CASE:
----------
N/A
STACK TRACE:
------------
ksedmp ksfdmp kgerinv kgeasnmierr kclassertle kclchkinteg kclfwrite1 kcbbic1
kcbbiop kcbbdrv ksbabs ksbrdp opirip opidrv sou2o main start

SUPPORTING INFORMATION:
-----------------------
Bruce Carter has looked at this and suggested a bug be raised.
Affects:
    Product (Component)	Oracle Server (Rdbms)
    Range of versions believed to be affected	Versions < 10.2
    Versions confirmed as being affected
        * 10.1.0.3
    Platforms affected	Generic (all / most platforms affected)
Fixed:
    This issue is fixed in
        * 9.2.0.7 (Server Patch Set)
        * 10.1.0.4 (Server Patch Set)
        * 10.2.0.1 (Base Release)
Symptoms:
Related To:
    * Internal Error May Occur (ORA-600)
    * Instance May Crash
    * ORA-600 [kclchkinteg_2]
    * RAC (Real Application Clusters) / OPS
Description
    ORA-600 [kclchkinteg_2] possible in RAC environment

Oracle开发部分确认该3671867 bug已在9.2.0.8中得到修复,建议通过应用9.2.0.8补丁集或者打上bug 3671867的one-off patch来修复该问题;该Bug可能导致实例意外终止,因此其Severity也极高,值得手头仍有9i RAC系统需要管理的dba注意。

配置GoldenGate同步DDL语句(1)

在配置Goldengate同步DDL语句前,要求在数据库内完成一系列的预安装先题步骤;其中包括建立合适的Goldengate用户,如:

create user maclean identified by maclean;
grant dba to maclean;

/* 之后maclean用户将用以配置extract和pump */

之后我们需要执行一系列同步DLL语句要用到的GoldenGate脚本,这些脚本存放在GG的安装目录下:

[maclean@rh2 gg]$ pwd
/home/maclean/gg
[maclean@rh2 gg]$ ls -l dd*.sql
-r--r--r-- 1 maclean oinstall   1059 2010-03-12 ddl_cleartrace.sql
-r--r--r-- 1 maclean oinstall   4189 2010-03-12 ddl_ddl2file.sql
-r--r--r-- 1 maclean oinstall    746 2010-03-12 ddl_disable.sql
-r--r--r-- 1 maclean oinstall    692 2010-03-12 ddl_enable.sql
-r--r--r-- 1 maclean oinstall    388 2010-03-12 ddl_nopurgeRecyclebin.sql
-r--r--r-- 1 maclean oinstall  12424 2010-05-21 ddl_ora10.sql
-r--r--r-- 1 maclean oinstall   3863 2010-05-21 ddl_ora10upCommon.sql
-r--r--r-- 1 maclean oinstall  11064 2010-03-12 ddl_ora11.sql
-r--r--r-- 1 maclean oinstall  12365 2010-03-12 ddl_ora9.sql
-r--r--r-- 1 maclean oinstall   1026 2010-03-12 ddl_pin.sql
-r--r--r-- 1 maclean oinstall   1227 2010-03-12 ddl_purgeRecyclebin.sql
-r--r--r-- 1 maclean oinstall   3686 2010-05-13 ddl_remove.sql
-r--r--r-- 1 maclean oinstall    425 2009-06-29 ddl_session1.sql
-r--r--r-- 1 maclean oinstall   1053 2009-06-23 ddl_session.sql
-r-xr-xr-x 1 maclean oinstall 228606 06-29 13:48 ddl_setup.sql
-r--r--r-- 1 maclean oinstall   8872 2010-03-12 ddl_status.sql
-r--r--r-- 1 maclean oinstall   2506 2010-03-12 ddl_staymetadata_off.sql
-r--r--r-- 1 maclean oinstall   2501 2010-03-12 ddl_staymetadata_on.sql
-r--r--r-- 1 maclean oinstall   2955 2010-03-12 ddl_tracelevel.sql
-r--r--r-- 1 maclean oinstall   2543 2010-03-12 ddl_trace_off.sql
-r--r--r-- 1 maclean oinstall   2862 2010-03-12 ddl_trace_on.sql

/* 执行goldengate ddl同步安装脚本要求以SYSDBA身份登录 */

[maclean@rh2 gg]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Mon Dec 6 18:01:46 2010
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> alter system set recyclebin=off;
System altered.

/* 同步DDL要求我们关闭10g中的回收站特性 */

SQL> @marker_setup
Marker setup script
You will be prompted for the name of a schema for the GoldenGate database objects.
NOTE: The schema must be created prior to running this script.
NOTE: Stop all DDL replication before starting this installation.
Enter GoldenGate schema name:maclean
Marker setup table script complete, running verification script...
Please enter the name of a schema for the GoldenGate database objects:
Setting schema name to MACLEAN
MARKER TABLE
-------------------------------
OK

MARKER SEQUENCE
-------------------------------
OK

Script complete.

/* 以SYSDBA 登陆数据库并执行当前GG安装目录下的market_setup脚本,该脚本用以建立一个DDL标记表
   包括以下列:
   seqNo NUMBER NOT NULL, -- sequence number
   fragmentNo NUMBER NOT NULL, -- fragment number (message divided into fragments)
   optime CHAR(19) NOT NULL, -- time of operation
   TYPE VARCHAR2 (100) NOT NULL, -- type of marker
   SUBTYPE VARCHAR2 (100) NOT NULL, -- subtype of marker
   marker_text VARCHAR2 (4000) NOT NULL,
   -- fragment text (message divided into fragments numbered with fragmentNo)
*/

SQL> @ddl_setup

GoldenGate DDL Replication setup script

Verifying that current user has privileges to install DDL Replication...
Checking user sessions...

Check complete.

You will be prompted for the name of a schema for the GoldenGate database objects.
NOTE: For an Oracle 10g source, the system recycle bin must be disabled. For Oracle 11g and later, it can be enabled.
NOTE: The schema must be created prior to running this script.
NOTE: Stop all DDL replication before starting this installation.

Enter GoldenGate schema name:maclean

You will be prompted for the mode of installation.
To install or reinstall DDL replication, enter INITIALSETUP
To upgrade DDL replication, enter NORMAL
Enter mode of installation:INITIALSETUP

Working, please wait ...
Spooling to file ddl_setup_spool.txt
Using MACLEAN as a GoldenGate schema name, INITIALSETUP as a mode of installation.

Working, please wait ...

RECYCLEBIN must be empty.
This installation will purge RECYCLEBIN for all users.
To proceed, enter yes. To stop installation, enter no.

Enter yes or no:yes

DDL replication setup script complete, running verification script...
Please enter the name of a schema for the GoldenGate database objects:
Setting schema name to MACLEAN

DDLORA_GETTABLESPACESIZE STATUS:

Line/pos   Error
---------- -----------------------------------------------------------------
No errors  No errors

CLEAR_TRACE STATUS:

Line/pos   Error
---------- -----------------------------------------------------------------
No errors  No errors

CREATE_TRACE STATUS:

Line/pos   Error
---------- -----------------------------------------------------------------
No errors  No errors

TRACE_PUT_LINE STATUS:

Line/pos   Error
---------- -----------------------------------------------------------------
No errors  No errors

INITIAL_SETUP STATUS:

Line/pos   Error
---------- -----------------------------------------------------------------
No errors  No errors

DDLVERSIONSPECIFIC PACKAGE STATUS:

Line/pos   Error
---------- -----------------------------------------------------------------
No errors  No errors

DDLREPLICATION PACKAGE STATUS:

Line/pos   Error
---------- -----------------------------------------------------------------
No errors  No errors

DDLREPLICATION PACKAGE BODY STATUS:

Line/pos   Error
---------- -----------------------------------------------------------------
No errors  No errors

DDL HISTORY TABLE
-----------------------------------
OK
DDL HISTORY TABLE(1)
-----------------------------------
OK
DDL DUMP TABLES
-----------------------------------
OK
DDL DUMP COLUMNS
-----------------------------------
OK
DDL DUMP LOG GROUPS
-----------------------------------
OK
DDL DUMP PARTITIONS
-----------------------------------
OK
DDL DUMP PRIMARY KEYS
-----------------------------------
OK
DDL SEQUENCE
-----------------------------------
OK
GGS_TEMP_COLS
-----------------------------------
OK
GGS_TEMP_UK
-----------------------------------
OK
DDL TRIGGER CODE STATUS:
Line/pos   Error
---------- -----------------------------------------------------------------
No errors  No errors
DDL TRIGGER INSTALL STATUS
-----------------------------------
OK
DDL TRIGGER RUNNING STATUS
-----------------------------------
ENABLED
STAYMETADATA IN TRIGGER
-----------------------------------
OFF

DDL TRIGGER SQL TRACING
-----------------------------------
0

DDL TRIGGER TRACE LEVEL
-----------------------------------
0

LOCATION OF DDL TRACE FILE
--------------------------------------------------------------------------------
/s01/10gdb/admin/clinica/udump/ggs_ddl_trace.log

Analyzing installation status...

STATUS OF DDL REPLICATION
--------------------------------------------------------------------------------
SUCCESSFUL installation of DDL Replication software components

Script complete.

/* 以INITIALSETUP选项运行ddl_setup.sql 将在数据库中创建捕获DDL语句的Trigger等必要组件 */

SQL> @role_setup

GGS Role setup script

This script will drop and recreate the role GGS_GGSUSER_ROLE
To use a different role name, quit this script and then edit the params.sql script to
change the gg_role parameter to the preferred name. (Do not run the script.)

You will be prompted for the name of a schema for the GoldenGate database objects.
NOTE: The schema must be created prior to running this script.
NOTE: Stop all DDL replication before starting this installation.

Enter GoldenGate schema name:maclean
Wrote file role_setup_set.txt
PL/SQL procedure successfully completed.
Role setup script complete
Grant this role to each user assigned to the Extract, GGSCI, and Manager processes, by using the following SQL command:
GRANT GGS_GGSUSER_ROLE TO
where  is the user assigned to the GoldenGate processes.

/* role_setup脚本用以建立GGS_GGSUSER_ROLE角色 */

我们需要将该GGS_GGSUSER_ROLE授予给extract group参数中定义的userid用户
SQL> grant GGS_GGSUSER_ROLE to maclean;
Grant succeeded.

SQL> @ddl_enable
Trigger altered.

/*ddl_enable.sql将正式enable ddl捕获触发器,即:ALTER TRIGGER sys .&ddl_trigger_name ENABLE; */

SQL> @?/rdbms/admin/dbmspool
Package created.
Grant succeeded.
View created.
Package body created.

/* 执行dbmspool包将在数据库中创建DBMS_SHARED_POOL包,之后需要用到*/

SQL> @ddl_pin
Enter value for 1: maclean
PL/SQL procedure successfully completed.
Enter value for 1: maclean
PL/SQL procedure successfully completed.
PL/SQL procedure successfully completed.

/* ddl_pin.sql通过dbms_shared_pool.keep存储过程将DDLReplication相关的对象keep在共享池中,
   以保证这些对象不要reload,提升性能
*/

以上脚本都运行完成后,DDL语句同步的先题条件就达成了。接着我们要来配置GG部分的extract和replicat:

[maclean@rh2 gg]$ ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 11.1.1.0.0 Build 078
Linux, x64, 64bit (optimized), Oracle 10 on Jul 28 2010 13:21:11

Copyright (C) 1995, 2010, Oracle and/or its affiliates. All rights reserved.

GGSCI (rh2.oracle.com) 1> add extract load1,tranlog,begin now
EXTRACT added.

GGSCI (rh2.oracle.com) 2> add rmttrail /s01/rmt/ma,megabytes 100,extract load1
RMTTRAIL added.

GGSCI (rh2.oracle.com) 5> encrypt password maclean
No key specified, using default key...

Encrypted password:  AACAAAAAAAAAAAHANCUEFHPAVCYDNJVD

GGSCI (rh2.oracle.com) 16> start extract load1

Sending START request to MANAGER ...
EXTRACT LOAD1 starting

GGSCI (rh2.oracle.com) 19> view params load1

extract load1
userid maclean, password AACAAAAAAAAAAAHANCUEFHPAVCYDNJVD,encryptkey default
RMTHOST rh3.oracle.com, MGRPORT 7809
RMTTRAIL /s01/rmt/ma
DDL INCLUDE MAPPED
Table sender.*;

/* 以上为源端的extract load1的配置,采用了DDL INCLUDE MAPPED的DDL同步方式 */

GGSCI (rh3.oracle.com) 59> add replicat rep1,exttrail /s01/rmt/ma,begin now,checkpointtable maclean.checkpoint
REPLICAT added.

GGSCI (rh3.oracle.com) 62> view params rep1

replicat rep1
userid maclean,password maclean
ASSUMETARGETDEFS
discardfile /s01/discard/rep1.log,append,megabytes 10
-- Support DDL here
DDL
map sender.* , target receiver.*;

GGSCI (rh3.oracle.com) 60> start replicat rep1

Sending START request to MANAGER ...
REPLICAT REP1 starting

GGSCI (rh3.oracle.com) 63> info all          

Program     Status      Group       Lag           Time Since Chkpt

MANAGER     RUNNING
REPLICAT    RUNNING     REP1        00:00:00      00:00:07    

/* 以上为目标端的replicat rep1配置,DDL语句将由sender用户映射到receiver用户 */

接下来我们对具体的DDL同步进行测试:

/ * 在源端建立测试用表tddl */

SQL> conn sender/sender
Connected.
SQL> create table tddl (t1 int);
Table created.

REP1.rep报告文件出现以下信息:
Opened trail file /s01/rmt/ma000000 at 2010-12-07 03:50:19
2010-12-07 03:52:13  INFO    OGG-01407  Setting current schema for DDL operation to [RECEIVER].
2010-12-07 03:52:13  INFO    OGG-01408  Restoring current schema for DDL operation to [MACLEAN].

SQL> conn receiver/receiver;
Connected.

SQL> desc tddl
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 T1                                                 NUMBER(38)

 /* create table的DDL语句被正确同步了 */

/* 可以从gg管理者名下的ggs_marker表看到捕获到的建表语句
   似乎可以分为DDL正文和变量2段
*/

SQL> set linesize 65 pagesize 1400;
SQL> select marker_text
  2    from ggs_marker
  3   where seqno =
       (select seqno from ggs_marker where marker_text like '%tddl%');
  4
MARKER_TEXT
-----------------------------------------------------------------
,C1='create table tddl \(t1 int\) ',
,C5='16',,B2='',,G4='',,B3='SENDER',,B4='TDDL',,C12='',,C13='',,B
5='TABLE',,B6='CREATE',,B7='16',,B8='MACLEAN.GGS_DDL_HIST',,B9='S
ENDER',,C7='10.2.0.4.0',,C8='10.2.0.3',,C9='',,C10='1',,C11='clin
ica',,G3='NONUNIQUE',,C14='NO',,C17('1')='NLS_LANGUAGE',,C18('1')
='AMERICAN',,C17('2')='NLS_TERRITORY',,C18('2')='AMERICA',,C17('3
')='NLS_CURRENCY',,C18('3')='$',,C17('4')='NLS_ISO_CURRENCY',,C18
('4')='AMERICA',,C17('5')='NLS_NUMERIC_CHARACTERS',,C18('5')='.\,
',,C17('6')='NLS_CALENDAR',,C18('6')='GREGORIAN',,C17('7')='NLS_D
ATE_FORMAT',,C18('7')='DD-MON-RR',,C17('8')='NLS_DATE_LANGUAGE',,
C18('8')='AMERICAN',,C17('9')='NLS_SORT',,C18('9')='BINARY',,C17(
'10')='NLS_TIME_FORMAT',,C18('10')='HH.MI.SSXFF AM',,C17('11')='N
LS_TIMESTAMP_FORMAT',,C18('11')='DD-MON-RR HH.MI.SSXFF AM',,C17('
12')='NLS_TIME_TZ_FORMAT',,C18('12')='HH.MI.SSXFF AM TZR',,C17('1
3')='NLS_TIMESTAMP_TZ_FORMAT',,C18('13')='DD-MON-RR HH.MI.SSXFF A
M TZR',,C17('14')='NLS_DUAL_CURRENCY',,C18('14')='$',,C17('15')='
NLS_COMP',,C18('15')='BINARY',,C17('16')='NLS_LENGTH_SEMANTICS',,
C18('16')='BYTE',,C17('17')='NLS_NCHAR_CONV_EXCP',,C18('17')='FAL
SE',,C19='17',

SQL> conn sender/sender
Connected.
SQL> insert into tddl values(&a);
Enter value for a: 1
old   1: insert into tddl values(&a)
new   1: insert into tddl values(1)

1 row created.

SQL> /
Enter value for a: 2
old   1: insert into tddl values(&a)
new   1: insert into tddl values(2)

1 row created.

SQL> commit;

Commit complete.

SQL>  alter table tddl add tclob clob default 'I am testing';
Table altered.

/* 我们来看看添加有默认值的clob字段这样的DDL语句是否能够同步成功 */

SQL> conn receiver/receiver
Connected.
SQL> set linesize 100 pagesize 1400;
SQL> select * from tddl;

        T1 TCLOB
---------- --------------------------------------------------------------------------------
         1 I am testing
         2 I am testing
/* 以上测试表明gg成功复制了添加有默认值的clob字段这样的DDL语句 */

我们可以从源端的$ORACLE_BASE/admin/$sid/udump目录下的ggs_ddl_trace.log日志文件中找到以下记录:
SESS 154688-2010-12-06 19:08:42 : DDL : ************************* 
Start of log for DDL sequence [18], v[ $Id: //depot/releases/OpenSys/v11.1.1.0/redist/SQLScripts/ddl_
setup.sql#2 $ ] trace level [0], owner schema of DDL package [MACLEAN], objtype [TABLE] 
name [SENDER.TDDL]
SESS 154688-2010-12-06 19:08:42 : DDLTRACE1 : Before Trigger: point in 
execution = [1.0], objtype [TABLE] name [SENDER.TDDL]
SESS 154688-2010-12-06 19:08:42 : DDL : DDL operation 
[ alter table tddl add tclob clob default 'I am testing' ], sequence [18], 
DDL type [ALTER] TABLE, real object ty
pe [TABLE], validity [VALID], object ID [57772], object [SENDER.TDDL], 
real object [SENDER.TDDL], base object schema [], base object name [], logged as [SENDER]
SESS 154688-2010-12-06 19:08:42 : DDL : Start SCN found [45357217]
SESS 154688-2010-12-06 19:08:42 : DDL : ------------------------- End of log for DDL sequence [18]

之前有客户选用了Quest的shareplex复制软件来achive一个上海到北京的live standby环境,所以有机会参加了几次shareplex的切换演练,虽然我对shareplex十分陌生(直到现在也是)。在实际切换期间也多次听到shareplex的工程师反复强调复制环境中不能有BLOB/CLOB等大对象,也不能出现DDL语句(shareplex不能复制LOB和DDL吗,未经证实);可能该环境中的确很少变更,所以客户方的经理并不concern这一点:”我们不做DDL就是了嘛!” 这个复制环境因为数据库最初设计的时候没有考虑用主键或unique index且个别主用业务表上有巨多列(多于200个列)的原因导致后来add上追加日志后出现重做日志产生量水涨船高的问题。

不过相比起goldengate有点惊人的价格来说(如果你需要data repair功能,记得你还需要另外购买veridata),似乎shareplex对于中国客户还是有着不少的优势。 shareplex还是goldengate, 这是个问题。

TSM配置导致RMAN备份expired

下午协助客户做异机备份恢复测试,平台是AIX 5.3使用tsm备份。在源生产机上备份了数据库和控制文件,准备从磁带库上恢复到测试机时RMAN报找到不备份的控制文件,之后crosscheck了一把:
[Read more…]

Dropping Very Large Table In Oracle

这是一张550G的大表,表上还包括了CLOB和BLOB对象;我们来观察下Oracle drop这样一个大表时的具体表现:

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

SQL> show parameter recyclebin

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
recyclebin                           string      OFF

/* 为了避免被flashback table骚扰,关闭了recyclebin回收站功能 */

SQL> conn maclean/maclean
Connected.

SQL> col segment_name for a20
SQL> select segment_name,bytes/1024/1024/1024 "size in Gbytes"  from user_segments where segment_name='TV';

SEGMENT_NAME         size in Gbytes
-------------------- --------------
TV                           547.25

SQL> select NUM_ROWS,BLOCKS from user_tables where table_name='TV';

  NUM_ROWS     BLOCKS
---------- ----------
 859150100   65649786

SQL> desc tv;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OWNER                                              VARCHAR2(30)
 OBJECT_NAME                                        VARCHAR2(128)
 SUBOBJECT_NAME                                     VARCHAR2(30)
 OBJECT_ID                                          NUMBER
 DATA_OBJECT_ID                                     NUMBER
 OBJECT_TYPE                                        VARCHAR2(19)
 CREATED                                            DATE
 LAST_DDL_TIME                                      DATE
 TIMESTAMP                                          VARCHAR2(19)
 STATUS                                             VARCHAR2(7)
 TEMPORARY                                          VARCHAR2(1)
 GENERATED                                          VARCHAR2(1)
 SECONDARY                                          VARCHAR2(1)
 SPARE1                                             CLOB
 SPARE2                                             CLOB
 SPARE3                                             CLOB
 SPARE4                                             BLOB

/* 该大表包含CLOB、BLOB2种大对象,共859150100行数据,占用65649786个块
    其所在是一个大文件表空间(bigfile tablespace),本地区间管理方式,区间大小统一为128MB 
*/

SQL> col tablespace_name for a2
SQL> select relative_fno,header_block,owner,tablespace_name from dba_segments where segment_name='TV';
RELATIVE_FNO HEADER_BLOCK OWNER                          TA
------------ ------------ ------------------------------ --
        1024           82 MACLEAN                        BF

/* 因为是用bigfile tablespace技术,故数据段所在相对数据文件号为1024 */


SQL> col segment_name for a30
SQL> col owner for a10
SQL> select owner,segment_name,segment_type,header_block from dba_segments where relative_fno=1024;

OWNER      SEGMENT_NAME                   SEGMENT_TYPE       HEADER_BLOCK
---------- ------------------------------ ------------------ ------------
MACLEAN    TV                             TABLE                        82
MACLEAN    SYS_IL0000057409C00014$$       LOBINDEX                  32850
MACLEAN    SYS_IL0000057409C00015$$       LOBINDEX                  65618
MACLEAN    SYS_IL0000057409C00016$$       LOBINDEX                  98386
MACLEAN    SYS_IL0000057409C00017$$       LOBINDEX                 131154
MACLEAN    SYS_LOB0000057409C00014$$      LOBSEGMENT                16466
MACLEAN    SYS_LOB0000057409C00015$$      LOBSEGMENT                49234
MACLEAN    SYS_LOB0000057409C00016$$      LOBSEGMENT                82002
MACLEAN    SYS_LOB0000057409C00017$$      LOBSEGMENT               114770

9 rows selected.

/* 该bigfile数据文件1024上的数据段如上包括TV表本身以及LOB对象和LOB索引 */

SQL> select current_scn from v$database;
CURRENT_SCN
-----------
   44989856

/* 获取当前scn以便闪回数据库 */

SQL> alter session set events '10046 trace name context forever,level 12';
Session altered.

SQL> set timing on;

SQL> drop table tv;
Table dropped.
Elapsed: 00:00:01.21  

/* 虽然是550G的大表,但drop也仅耗时1.21s再次证明了drop仅仅是修改数据字典 */

通过tkprof分析的trace文件信息:
drop table tv

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.12       0.29         28          9      30163           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.12       0.29         28          9      30163           0

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 64

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                        28        0.00          0.00
  rdbms ipc reply                                10        0.01          0.06
  reliable message                                8        0.00          0.00
  enq: RO - fast object reuse                     8        0.00          0.00
  write complete waits                            9        0.04          0.10
  log file sync                                   1        0.00          0.00
  SQL*Net message to client                       1        0.00          0.00
  SQL*Net message from client                     1        7.22          7.22   

以上可以看到少量的等待事件,drop要求所被drop对象先做object对象级别的checkpoint检查点,以便将脏块写出;故而会出现Write complete waits(A user wants to modify a block that is part of DBWRÂ’s current write batch. When DBWR grabs buffers to write, it marks them as ‘being written’. All the collected buffers are then written to disk. The wait ‘write complete waits’ implies we wanted a buffer while this flag was set. The flags are cleared as each buffer is written)。同时要求获取enq: RO – fast object reuse(快速重用对象队列锁)。

整个drop流程还包括以下修改数据字典的操作:

delete from object_usage
 where obj# in (select a.obj#
                  from object_usage a, ind$ b
                 where a.obj# = b.obj#
                   and b.bo# = :1)
/*  删除该表在对象使用情况基表(object_usage)中的纪录 */

delete from sys.cache_stats_1$ where dataobj# = :1
/* Jonathan Lewis的CBO fundamentals对该基本做了寥寥几笔的描述,该基表用于监控缓存统计信息  
    --which is used to monitor lifetime caching statistics 
*/
delete com$ where obj# = :1
/* 删除对象相关的comment,com$ --comment table */

delete from hist_head$ where obj# = :1
delete from histgrm$ where obj# = :1
delete from dependency$ where d_obj# = :1
delete from source$ where obj# = :1

delete from idl_ub1$
  where obj# = :1
    and part = :2

delete from idl_char$
  where obj# = :1
    and part = :2

delete from idl_ub2$
  where obj# = :1
    and part = :2

delete from ncomp_dll$ where obj# = :1 returning dllname into :2

delete from idl_sb4$
 where obj# = :1
   and part = :2

delete from objauth$ where obj# = :1
delete from col$ where obj# = :1
delete from icol$ where bo# = :1
delete from icoldep$ where obj# in (select obj# from ind$ where bo# = :1)

delete from jijoin$
  where obj# in (select obj#
                   from jijoin$
                  where tab1obj# = :1
                     or tab2obj# = :1)

delete from jirefreshsql$
  where iobj# in (select iobj# from jirefreshsql$ where tobj# = :1)

delete from ccol$ where obj# = :1
delete from ind$ where bo# = :1
delete from cdef$ where obj# = :1
delete from tab$ where obj# = :1
delete coltype$ where obj# = :1
delete from subcoltype$ where obj# = :1
delete ntab$ where obj# = :1
delete lob$ where obj# = :1
delete refcon$ where obj# = :1
delete from opqtype$ where obj# = :1

之后将出现多次update seg$(数据段基表)的操作:


 update seg$ set type#=:4,blocks=:5,extents=:6,minexts=:7,maxexts=:8,extsize=
  :9,extpct=:10,user#=:11,iniexts=:12,lists=decode(:13, 65535, NULL, :13),
  groups=decode(:14, 65535, NULL, :14), cachehint=:15, hwmincr=:16, spare1=
  DECODE(:17,0,NULL,:17),scanhint=:18 

/* 通过将trace文件中的bind variable还原可以看到Oracle实际执行的update语句 */

update seg$
   set type#     = 3,
       blocks    = 16384,
       extents   = 1,
       minexts   = 1,
       maxexts   = 2147483645,
       extsize   = 16384,
       extpct    = 0,
       user#     = 64,
       iniexts   = 16384,
       lists     = decode(0, 65535, NULL, :13),
       groups    = decode(0, 6 5535, NULL, :14),
       cachehint = :0,
       hwmincr   = 57411,
       spare1    = DECODE(164161, 0, NULL, :17),
       scanhint  = 0
 where ts# = 12
   and file# = 1024
   and block# = 82

/*  以上update语句被多次执行,每次变化的仅有block#变量,依次为114770,82002,49234...82,
     与上述的1024RFN数据文件上的各数据段的header_block头块对应 
*/

/*  由此可知drop过程中Oracle所需要做的是对段在seg$基表上的纪录做修改,
     将type由原值修改为3,也就是临时段的标记;同时extents重置为1。
*/

sql.bsq文件纪录了seg$基表上type#列的含义:
/*  1 = UNDO, 2 = SAVE UNDO, 3 = TEMPORARY, 4 = CACHE, 5 = DATA, 6 = INDEX 
     7 = SORT  8 = LOB   9 = Space Header 10 = System Managed Undo      */

delete from obj$ where obj# = :1
/*  每次update完成后,都会伴随有以上删除obj$基表中对应对象纪录的语句 */

delete from seg$ where ts#=:1 and file#=:2 and block#=:3
/* 若关闭了recyclebin(回收站,flashback table闪回表特性),则在最后此前台进程还会删除已经
    被置为type#=3的临时段在seg$字典基表上的纪录;如果打开了回收站则不会有此delete from seg$操作;
*/

SQL> select block#,type#,extents from sys.seg$  where file#=1024;
no rows selected

可以看到drop数据表作为一种DDL语句,其所需要完成的主要工作是完整删除数据字典中该表相关信息,并在seg$字典基表上将原有段类型(type)和大小修改为临时段和仅有1个区间,之后在obj$对象字典基表上将该对象的纪录彻底删除,此外还将释放数据文件头的区间位图信息。

SQL> alter system flush buffer_cache;
System altered.

SQL> alter system dump datafile '/g01/bf.dbf' block 3;
System altered.

File Space Bitmap Block:
BitMap Control:
RelFno: 1024, BeginBlock: 17, Flag: 0, First: 0, Free: 63488
0000000000000000 0000000000000000 0000000000000000 0000000000000000

Oracle等待事件:Data file init write

在给某一个大表加有default值的clob列时出现了Data file init write等待事件,这个等待事件是10gR2中新加入的,恰恰10gr2的文档(乃至11g的文档)都没有列出该等待事件。该等待事件一般在Oracle自动扩展数据文件(auto extend datafile)并串行地格式化数据文件的新创建的空间时作为前台(foreground)等待事件出现:

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

SQL> show parameter compatible

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
compatible                           string      10.2.0.3

/* 数据库版本10.2.0.4,compatible为10.2.0.3 */

SQL> alter table tv add spare6 clob default '........';
SQL> col event for a30;

SQL> select event, time_waited, current_file#, current_block#
  2    from V$active_Session_History
  3   where event like 'Data file init write'
  4   order by sample_time desc;

EVENT                          TIME_WAITED CURRENT_FILE# CURRENT_BLOCK#
------------------------------ ----------- ------------- --------------
Data file init write                533810             4          54783
Data file init write                442042             4          54783
Data file init write                 47286             4          54783
Data file init write                 42932             4          54783
Data file init write                413477             4          54783
Data file init write                153503             4          54783
Data file init write                 33520             4          54783
Data file init write                307616             4          54783
Data file init write                214404             4          54783
Data file init write                  3400             4          54783
Data file init write                212386             4          54783
Data file init write                192631             4          54783
..............
Data file init write                611157             4          54775
Data file init write                  1379             4          54775
Data file init write                223541             4          54775

伴随着数据文件扩展(Data file init write等待事件的直接触发原因),Oracle需要通过以下递归dml语句维护相应的数据字典:

1.查询字典表file$确定表空间对应的数据文件号
select file# from file$ where ts#=:1
该操作可能伴随Data file init write等待事件发生:
PARSING IN CURSOR #3 len=36 dep=2 uid=0 oct=3 lid=0 tim=1261083587010014 hv=1570213724 ad='8f7d4210'
select file# from file$ where ts#=:1
END OF STMT
.............
WAIT #14: nam='Data file init write' ela= 1091 count=1 intr=256 timeout=4294967295 obj#=57314 
WAIT #14: nam='Data file init write' ela= 1078 count=1 intr=256 timeout=4294967295 obj#=57314 
WAIT #14: nam='Data file init write' ela= 1102 count=1 intr=256 timeout=4294967295 obj#=57314 
WAIT #14: nam='Data file init write' ela= 1156 count=1 intr=256 timeout=4294967295 obj#=57314 
WAIT #14: nam='Data file init write' ela= 1870 count=1 intr=256 timeout=4294967295 obj#=57314 
WAIT #14: nam='Data file init write' ela= 37 count=1 intr=256 timeout=4294967295 obj#=57314
WAIT #14: nam='Data file init write' ela= 4 count=4294967295 intr=32 timeout=2147483647 obj#=57314 

2.若使用bigfile tablespace则可能出现以下insert seg$流程
insert into seg$ (file#,block#,type#,ts#,blocks,extents,minexts,maxexts,
extsize,extpct,user#,iniexts,lists,groups,cachehint,bitmapranges,hwmincr, 
spare1, scanhint) values (:1,:2,:3,:4,:5,:6,:7,:8,:9,:10,:11,:12,:13,:14,:15,0,:16,
DECODE(:17,0,NULL,:17),:18)
PARSE #15:c=0,e=368,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=1261083586909081
BINDS #15:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=2ba79a0e1330  bln=22  avl=03  flg=05
  value=1024

SQL> select file#,block#,blocks from sys.seg$ where file#=1024;

     FILE#     BLOCK#     BLOCKS
---------- ---------- ----------
      1024         82   13434880
      1024   13434962      16384
      1024   13451346      16384
      1024   13467730      16384
      1024   13484114      16384

/* 会出现奇怪的1024号文件 */

3.维护用户表空间限额字典数据
update tsq$
   set blocks    = :3,
       maxblocks = :4,
       grantor#  = :5,
       priv1     = :6,
       priv2     = :7,
       priv3     = :8
 where ts# = :1
   and user# = :2

4.更新seg$数据字典基表以扩展数据段
update seg$
   set type#     = :4,
       blocks    = :5,
       extents   = :6,
       minexts   = :7,
       maxexts   = :8,
       extsize   = :9,
       extpct    = :10,
       user#     = :11,
       iniexts   = :12,
       lists     = decode(:13, 65535, NULL, :13),
       groups    = decode(:14, 6 5535, NULL, :14),
       cachehint = :15,
       hwmincr   = :16,
       spare1    = DECODE(:17, 0, NULL, :17),
       scanhint  = :18
 where ts# = :1
   and file# = :2
   and block# = :3

以上数据文件空间扩展流程中只有查询语句”select file# from file$ where ts#=:1″伴随有”Data file init write”等待。

Goldengate一个令人郁闷的小细节

晚上有兴趣测试了下Goldengate的initial load功能,通过initial load+change sync可以很容易做到数据迁移data migration;不过发现一个令人很郁闷的细节,就是replicat的params中定义map参数时TARGET到前面一个逗号间要留一个空格space:

MAP table spec, TARGET table spec     
/*TARGET到前面一个逗号间要留一个空格space*/
[, DEF ]
[, TARGETDEF ]
[, COLMAP ()]
[, EVENTACTIONS ()]
[, EXCEPTIONSONLY]
[, EXITPARAM “”]
[, FILTER ()]
[, HANDLECOLLISIONS | NOHANDLECOLLISIONS]
[, INSERTALLRECORDS]
[, INSERTAPPEND | NOINSERTAPPEND]
[, KEYCOLS ()]
[, REPERROR ( , )]
[, SQLEXEC ()]
[, TRIMSPACES | NOTRIMSPACES]
[, WHERE ()]
;

如果map参数中target和前面的逗号间没有空格会出现invaild option for map错误:

GGSCI (rh2.oracle.com) 32> view params init1
extract init1
userid ggate,password ggate
RMTHOST rh3.oracle.com,MGRPORT 7809
RMTTASK replicat,GROUP  init3
table sh.products;

GGSCI (rh3.oracle.com) 40> view params init3

replicat init3
userid ggate,password ggate
ASSUMETARGETDEFS
map sh.products,target sh.products;

GGSCI (rh2.oracle.com) 33> start extract init1

Sending START request to MANAGER ...
EXTRACT INIT1 starting

GGSCI (rh3.oracle.com) 41> view report init3
..............
MAP resolved (entry SH.PRODUCTS):
  map SH.PRODUCTS, sh.products;

Source Context :
  SourceModule            : [er.main]
  SourceID                : [/home/ecloud/workspace/Build_FBO_OpenSys_r11.1.1.0.0_078_[34093]/perforce/src/app/er/rep.c]
  SourceFunction          : [get_map_entry]
  SourceLine              : [8573]
  ThreadBacktrace         : [11] elements
                          : [./replicat(CMessageContext::AddThreadContext()+0x26) [0x5d9516]]
                          : [./replicat(CMessageFactory::CreateMessage(CSourceContext*, unsigned int, ...)+0x7b2) [0x5cffb2]]
                          : [./replicat(_MSG_ERR_STARTUP_PARAMERROR_INVALID_OPTION_VALUE
(CSourceContext*, char const*, char const*, CMessageFactory::MessageDisposition)
+0x9b) [0x59086b]]
                          : [./replicat(get_map_entry(char*, int, __wc*, int)+0x284d) [0x4ee66d]]
                          : [./replicat [0x5357d5]]
                          : [./replicat(WILDCARD_check_table(char const*, char const*, int, 
unsigned int*, int, unsigned int, DBString<777>*, int)+0x15e) [0x536fce]]
                          : [./replicat(REP_find_source_file_wc(char const*, unsigned int, DBString<777>*, int)+0x64f) [0x8a1caf]]
                          : [./replicat [0x8a9b04]]
                          : [./replicat(main+0x227a) [0x4f50da]]
                          : [/lib64/libc.so.6(__libc_start_main+0xf4) [0x340d01d994]]
                          : [./replicat(__gxx_personality_v0+0x1e2) [0x4d86ba]]

2010-12-01 08:58:30  ERROR   OGG-00212  Invalid option for MAP: sh.products.

/*我们来加上空格*/

GGSCI (rh3.oracle.com) 43> view params init3
replicat init3
userid ggate,password ggate
ASSUMETARGETDEFS
map sh.products, target sh.products;

GGSCI (rh3.oracle.com) 43> view report init3
MAP resolved (entry SH.PRODUCTS):
  map SH.PRODUCTS, target sh.products;

2010-12-01 09:00:30  WARNING OGG-00869  No unique key is defined for table PRODUCTS. 
All viable columns will be used to represent the key, but may not guarantee uniqueness.  
KEYCOLS may be used to define the key.
Using following columns in default map by name:
  PROD_ID, PROD_NAME, PROD_DESC, PROD_SUBCATEGORY, PROD_SUBCATEGORY_ID, 
  PROD_SUBCATEGORY_DESC, PROD_CATEGORY, PROD_CATEGORY_ID, 
  PROD_CATEGORY_DESC, PROD_WEIGHT_CLASS, PROD_UNIT_OF_MEASURE, 
  PROD_PACK_SIZE, SUPPLIER_ID, PROD_STATUS, PROD_LIST_PRICE, 
  PROD_MIN_PRICE, PROD_TOTAL, PROD_TOTAL_ID, PROD_SRC_ID, 
  PROD_EFF_FROM, PROD_EFF_TO, PROD_VALID

/* 加上空格后初始化成功*/

ora-00600[kkocxj:pjpCtx]内部错误一例

一套HP-UX上的10.2.0.4系统在运行某条 select查询语句时出现ORA-00600[kkocxj:pjpCtx]内部错误,TRACE文件信息如下:


FILE VERSION
------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /oracle/app/oracle/product/10.2
System name:    HP-UX
Node name:      crmdb1
Release:        B.11.31
Version:        U
Machine:        ia64
Instance name: cbssnm
Redo thread mounted by this instance: 1

TRACE FILE
---------------
Filename = cbssnm_ora_29061.trc

*** ACTION NAME:(SQL 窗口 - 新建) 2010-07-02 15:59:46.238
*** MODULE NAME:(PL/SQL Developer) 2010-07-02 15:59:46.238
*** SERVICE NAME:(SYS$USERS) 2010-07-02 15:59:46.238
*** SESSION ID:(770.4341) 2010-07-02 15:59:46.237
*** 2010-07-02 15:59:46.237
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kkocxj : pjpCtx], [], [], [], [], [], [], []
Current SQL statement for this session:
select p.access_number, aa.name
 from crm.product p,
      (select aa.prod_id, os.name, os.staff_number
         from (select *
                 from (select prod_id,
                              party_id,
                              row_number() over(partition by prod_id order by start_dt desc) num
                         from crm.party_2_prod
                        where end_dt > sysdate
                          and party_product_rela_role_cd = 3)
                where num = 1) aa,
              crm.our_staff os
        where aa.party_id = os.staff_id) aa
where p.prod_id = aa.prod_id(+)
  and p.access_number = '15335581126'
----- Call Stack Trace -----
    ksedst <- ksedmp <- ksfdmp <- kgerinv <- kgeasnmierr        
<- $cold_kkocxj <- kkoiqb <- kkooqb <- kkoqbc <- apakkoqb         
<- apaqbdDescendents <- apaqbd <- kkqctCostTransfQB <- kkqctdrvJP 
<- kkqjpdttr          <- kkqctdrvTD <- kkqjpddrv <- kkqdrv <- kkqctdrvIT 
<- apadrv           <- opitca <- kksFullTypeCheck <- rpiswu2 <- kksLoadChild 
<- kxsGetRuntimeLock            <- kksfbc <- kkspsc0 <- kksParseCursor 
<- opiosq0 <- kpooprx             <- kpoal8 <- opiodr <- ttcpip <- opitsk 
<- opiino              <- opiodr <- opidrv <- sou2o <- opimai_real <- main               
<- main_opd_entry

根据错误代码和stack trace可以在metalink上匹配到如下Bug:

Bug:7014646
Abstract: ORA-600: INTERNAL ERROR CODE, ARGUMENTS: [KKOCXJ : PJPCTX], [], [], [], [], []
Affects:
    Product (Component)	Oracle Server (Rdbms)
    Range of versions believed to be affected	Versions < 11.2
    Versions confirmed as being affected	
        * 10.2.0.4
        * 11.1.0.6 
    Platforms affected	Generic (all / most platforms affected)
Fixed:
    This issue is fixed in	
        * 10.2.0.4 Patch 7 on Windows Platforms
        * 10.2.0.5 (Server Patch Set)
        * 11.1.0.7 (Server Patch Set)
        * 11.2 (Future Release) 
Symptoms:
    * Internal Error May Occur (ORA-600)
    * ORA-600 [kkocxj : pjpCtx] 
Related To:
    * Optimizer
    * _OPTIMIZER_PUSH_PRED_COST_BASED 
Description
    A complex query can fail during parse with 
    ORA-600 [kkocxj : pjpCtx]
    Workaround
     Set  "_optimizer_push_pred_cost_based"=false

该bug可以通过实施one off Patch 7014646修复,也可以尝试通过修改隐式参数_optimizer_push_pred_cost_based禁用基于成本的谓词前置特性(WORKAROUND: disable cost based push predicate)来规避该[KKOCXJ:PJPCTX]内部错误发生,具体的修改方法:

SQL> conn / as sysdba
SQL> alter system set "_optimizer_push_pred_cost_based"=false;
SQL> exit
/* 设置该隐式参数无需重启实例 */

Oracle GCS更推荐通过应用补丁7014646的方法来解决问题,而修改以上隐式参数则不一定百分之百能解决问题。

Goldengate单向大事务复制性能测试

Goldengate单向数据复制,为了图省事没有配pump,只配了source的extract和target的replicat;实际操作发现gg对大事务的支持还是比streams好一些,streams碰上大事务就只看到SPILL MESSAGES了:

/*源端配置信息*/
GGSCI (rh2.oracle.com) 1> view params mgr

PORT 7809


GGSCI (rh2.oracle.com) 2> view params load1


--extract group--
extract load1
--connection to database--
userid ggate, password ggate
--hostname and port for trail--
rmthost rh3.oracle.com, mgrport 7809
--path and name for trail--
rmttrail /s01/gg
ddl include mapped objname sender.*
table sender.*;

GGSCI (rh2.oracle.com) 4> stats extract load1

Sending STATS request to EXTRACT LOAD1 ...

Start of Statistics at 2010-11-29 17:44:41.

DDL replication statistics (for all trails):

*** Total statistics since extract started     ***
        Operations                                   1.00
        Mapped operations                            1.00
        Unmapped operations                          0.00
        Other operations                             0.00
        Excluded operations                          0.00

Output to /s01/gg:

Extracting from GGATE.GGS_MARKER to GGATE.GGS_MARKER:

*** Total statistics since 2010-11-29 16:22:25 ***

        No database operations have been performed.

*** Daily statistics since 2010-11-29 16:22:25 ***

        No database operations have been performed.

*** Hourly statistics since 2010-11-29 17:00:00 ***

        No database operations have been performed.

*** Latest statistics since 2010-11-29 16:22:25 ***

        No database operations have been performed.

Extracting from SENDER.ABC to SENDER.ABC:

*** Total statistics since 2010-11-29 16:22:25 ***
        Total inserts                           2621440.00
        Total updates                                0.00
        Total deletes                                0.00
        Total discards                               0.00
        Total operations                        2621440.00

*** Daily statistics since 2010-11-29 16:22:25 ***
        Total inserts                           2621440.00
        Total updates                                0.00
        Total deletes                                0.00
        Total discards                               0.00
        Total operations                        2621440.00

*** Hourly statistics since 2010-11-29 17:00:00 ***

        No database operations have been performed.

*** Latest statistics since 2010-11-29 16:22:25 ***
        Total inserts                           2621440.00
        Total updates                                0.00
        Total deletes                                0.00
        Total discards                               0.00
        Total operations                        2621440.00

End of Statistics.

GGSCI (rh2.oracle.com) 5> info all

Program     Status      Group       Lag           Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     RUNNING     LOAD1       00:00:00      00:00:08    

/*目标端配置信息*/

GSCI (rh3.oracle.com) 1> view params mgr

PORT 7809
USERID ggate, PASSWORD ggate
PURGEOLDEXTRACTS /s01/gg


GGSCI (rh3.oracle.com) 2> view params rep1

--Replicat group --
replicat rep1
--source and target definitions
ASSUMETARGETDEFS
--target database login --
userid ggate, password ggate
--file for dicarded transaction --
discardfile /s01/discard.txt, append, megabytes 10
--ddl support
DDL
--Specify table mapping ---
map sender.*, target receiver.*;

GGSCI (rh3.oracle.com) 3> stats replicat rep1

Sending STATS request to REPLICAT REP1 ...

Start of Statistics at 2010-11-30 02:44:16.

DDL replication statistics:

*** Total statistics since replicat started     ***
        Operations                                   1.00
        Mapped operations                            1.00
        Unmapped operations                          0.00
        Other operations                             0.00
        Excluded operations                          0.00
        Errors                                       0.00
        Retried errors                               0.00
        Discarded errors                             0.00
        Ignored errors                               0.00

Replicating from SENDER.ABC to RECEIVER.ABC:

*** Total statistics since 2010-11-30 01:21:09 ***
        Total inserts                           2621440.00
        Total updates                                0.00
        Total deletes                                0.00
        Total discards                               0.00
        Total operations                        2621440.00

*** Daily statistics since 2010-11-30 01:21:09 ***
        Total inserts                           2621440.00
        Total updates                                0.00
        Total deletes                                0.00
        Total discards                               0.00
        Total operations                        2621440.00

*** Hourly statistics since 2010-11-30 02:00:00 ***

        No database operations have been performed.

*** Latest statistics since 2010-11-30 01:21:09 ***
        Total inserts                           2621440.00
        Total updates                                0.00
        Total deletes                                0.00
        Total discards                               0.00
        Total operations                        2621440.00

End of Statistics.

GGSCI (rh3.oracle.com) 4> info all

Program     Status      Group       Lag           Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     REP1        00:00:00      00:00:01   

接着尝试在源库端sender模式下产生一个大事务操作,观察目标段数据同步情况*/

SQL> delete sender.abc;
2621440 rows deleted.

SQL> commit;
Commit complete.
/* commit操作完成前,EXTRACT进程并不会抽取日志 */

Goldengate日志ggserror.log的相关记录:
010-11-29 17:49:49  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000029.
2010-11-29 17:49:50  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000030.
2010-11-29 17:49:51  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000031.
2010-11-29 17:49:53  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000032.
2010-11-29 17:49:54  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000033.
2010-11-29 17:49:55  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000034.
2010-11-29 17:49:56  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000035.
2010-11-29 17:49:57  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000036.
2010-11-29 17:49:58  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000037.
2010-11-29 17:49:59  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000038.
2010-11-29 17:50:00  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000039.
2010-11-29 17:50:01  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000040.
2010-11-29 17:50:03  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000041.
2010-11-29 17:50:04  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000042.
2010-11-29 17:50:05  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000043.
2010-11-29 17:50:06  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000044.
2010-11-29 17:50:07  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000045.
2010-11-29 17:50:08  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000046.
2010-11-29 17:50:09  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000047.
2010-11-29 17:50:10  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000048.
2010-11-29 17:50:11  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000049.
2010-11-29 17:50:13  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000050.
2010-11-29 17:50:14  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000051.
2010-11-29 17:50:15  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000052.
2010-11-29 17:50:16  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000053.
2010-11-29 17:50:17  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, load1.prm:  Rolling over remote file /s01/gg000054.

/*产生了26个trail文件*/

/*同时备库端开始陆续应用日志*/
REP1.rpt是rep1 replicat复制进程的文本报告,出现相关内容:
Switching to next trail file /s01/gg000029 at 2010-11-30 02:47:54 due to EOF, with current RBA 9999949
Opened trail file /s01/gg000029 at 2010-11-30 02:47:54

Switching to next trail file /s01/gg000030 at 2010-11-30 02:49:53 due to EOF, with current RBA 9999925
Opened trail file /s01/gg000030 at 2010-11-30 02:49:53

GGSCI (rh3.oracle.com) 6> stats replicat rep1

Sending STATS request to REPLICAT REP1 ...

Start of Statistics at 2010-11-30 02:52:20.

DDL replication statistics:

*** Total statistics since replicat started     ***
        Operations                                   1.00
        Mapped operations                            1.00
        Unmapped operations                          0.00
        Other operations                             0.00
        Excluded operations                          0.00
        Errors                                       0.00
        Retried errors                               0.00
        Discarded errors                             0.00
        Ignored errors                               0.00

Replicating from SENDER.ABC to RECEIVER.ABC:

*** Total statistics since 2010-11-30 01:21:09 ***
        Total inserts                           2621440.00
        Total updates                                0.00
        Total deletes                           208150.00
        Total discards                               0.00
        Total operations                        2829590.00

*** Daily statistics since 2010-11-30 01:21:09 ***
        Total inserts                           2621440.00
        Total updates                                0.00
        Total deletes                           208150.00
        Total discards                               0.00
        Total operations                        2829590.00

*** Hourly statistics since 2010-11-30 02:00:00 ***
        Total inserts                                0.00
        Total updates                                0.00
        Total deletes                           208150.00
        Total discards                               0.00
        Total operations                        208150.00

*** Latest statistics since 2010-11-30 01:21:09 ***
        Total inserts                           2621440.00
        Total updates                                0.00
        Total deletes                           208150.00
        Total discards                               0.00
        Total operations                        2829590.00

End of Statistics.


GGSCI (rh3.oracle.com) 7> !
stats replicat rep1

Sending STATS request to REPLICAT REP1 ...

Start of Statistics at 2010-11-30 02:52:26.

DDL replication statistics:

*** Total statistics since replicat started     ***
        Operations                                   1.00
        Mapped operations                            1.00
        Unmapped operations                          0.00
        Other operations                             0.00
        Excluded operations                          0.00
        Errors                                       0.00
        Retried errors                               0.00
        Discarded errors                             0.00
        Ignored errors                               0.00

Replicating from SENDER.ABC to RECEIVER.ABC:

*** Total statistics since 2010-11-30 01:21:09 ***
        Total inserts                           2621440.00
        Total updates                                0.00
        Total deletes                           210767.00
        Total discards                               0.00
        Total operations                        2832207.00

*** Daily statistics since 2010-11-30 01:21:09 ***
        Total inserts                           2621440.00
        Total updates                                0.00
        Total deletes                           210767.00
        Total discards                               0.00
        Total operations                        2832207.00

*** Hourly statistics since 2010-11-30 02:00:00 ***
        Total inserts                                0.00
        Total updates                                0.00
        Total deletes                           210767.00
        Total discards                               0.00
        Total operations                        210767.00

*** Latest statistics since 2010-11-30 01:21:09 ***
        Total inserts                           2621440.00
        Total updates                                0.00
        Total deletes                           210767.00
        Total discards                               0.00
        Total operations                        2832207.00

End of Statistics.


/*由以上对replicat进程的统计信息可知其正在应用delete操作*/

Switching to next trail file /s01/gg000031 at 2010-11-30 02:53:26 due to EOF, with current RBA 9999925
Opened trail file /s01/gg000031 at 2010-11-30 02:53:26

Switching to next trail file /s01/gg000032 at 2010-11-30 02:58:28 due to EOF, with current RBA 9999925
Opened trail file /s01/gg000032 at 2010-11-30 02:58:28

[maclean@rh3 s01]$ ls -lh gg00003[1-2]
-rw-rw-rw- 1 maclean oinstall 9.6M Nov 30 02:47 gg000031
-rw-rw-rw- 1 maclean oinstall 9.6M Nov 30 02:47 gg000032

/*由上列trace信息推断,以这样一台双核cpu主频为2.2GHZ的pc机为例,gg应用一个大小为9.6M的trail文件耗时也要将近5分钟(当然这并不准确)。


/*从进程argument可以看到replicate和extract进程的本质是调用了多个配置文件的$GGATE/extract和$GGATE/replicat*/
[maclean@rh3 ~]$ ps -ef|grep repl
maclean   7817  7476  3 01:18 ?        00:03:29 
/home/maclean/gg/replicat PARAMFILE /home/maclean/gg/dirprm/rep1.prm 
REPORTFILE /home/maclean/gg/dirrpt/REP1.rpt 
PROCESSID REP1 USESUBDIRS

[maclean@rh2 dirrpt]$ ps -ef|grep ex
maclean    544 32432  1 16:21 ?        00:01:12 
/home/maclean/gg/extract PARAMFILE /home/maclean/gg/dirprm/load1.prm 
REPORTFILE /home/maclean/gg/dirrpt/LOAD1.rpt 
PROCESSID LOAD1 USESUBDIRS

ORA-00600: [qksrcBuildRwo]内部错误一例

一套HP-UX上的11.1.0.7上的系统,在使用11g自带的新特性dictionary health check(数据字典健康检查)功能时发现FILE$基表存在讹误,并且告警日志中伴随出现ORA-00600:[qksrcBuildRwo]内部错误,具体错误信息如下:

RMAN> list failure;

List of Database Failures
=========================

Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
91337 CRITICAL OPEN 30-JUN-09 SQL dictionary health check: file$ pk 202 on object FILE$ failed
91334 CRITICAL OPEN 30-JUN-09 SQL dictionary health check: file$ pk 202 on object FILE$ failed
91331 CRITICAL OPEN 30-JUN-09 SQL dictionary health check: file$ pk 202 on object FILE$ failed
91328 CRITICAL OPEN 30-JUN-09 SQL dictionary health check: file$ pk 42 on object FILE$ failed
91325 CRITICAL OPEN 30-JUN-09 SQL dictionary health check: file$ pk 42 on object FILE$ failed
91322 CRITICAL OPEN 30-JUN-09 SQL dictionary health check: file$ pk 42 on object FILE$ failed

ALERT LOG
-----------------------
Display of database log file :

Tue Jun 30 10:32:27 2009
Errors in file /oracle/11.1.0/diag/rdbms/bect/BECT/trace/BECT_j000_18678.trc (incident=20162):
ORA-00600: sis?inen virhekoodi, argumentit: [qksrcBuildRwo], [], [], [], [], [], [], []
Incident details in: /oracle/11.1.0/diag/rdbms/bect/BECT/incident/incdir_20162/BECT_j000_18678_i20162.trc
Tue Jun 30 10:32:57 2009
Trace dumping is performing id=[cdmp_20090630103257]
Tue Jun 30 10:32:59 2009
Sweep Incident[20162]: completed
Tue Jun 30 10:44:15 2009
ORA-1652: unable to extend temp segment by 128 in tablespace FENIX_SECURE
ORA-1652: unable to extend temp segment by 1024 in tablespace FENIX_SECURE
Errors in file /oracle/11.1.0/diag/rdbms/bect/BECT/trace/BECT_j000_21609.trc:
ORA-12012: virhe ty?n 58702 automaattisen suorituksen yhteydess?
ORA-01652: v?liaikaisen segmentin laajennus 1024:lla taulualueeessa FENIX_SECURE ei onnistu


TRACE FILE
----------------------
Display of relevant trace file :

Dump file /oracle/11.1.0/diag/rdbms/bect/BECT/incident/incdir_20162/BECT_j000_18678_i20162.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, Oracle Label Security and Real Application Testing options
ORACLE_HOME = /oracle/11.1.0
System name: HP-UX
Node name: hellu
Release: B.11.23
Version: U
Machine: ia64
Instance name: BECT
Redo thread mounted by this instance: 1
Oracle process number: 20
Unix process pid: 18678, image: oracle@hellu (J000)


*** 2009-06-30 10:32:27.211
*** SESSION ID:(520.12850) 2009-06-30 10:32:27.211
*** CLIENT ID:(FOOBAR@192.168.60.110@Mozilla/5.0 (Windows; U; Windows NT 5.0; f) 2009-06-30 10:32:27.211
*** SERVICE NAME:(SYS$USERS) 2009-06-30 10:32:27.211
*** MODULE NAME:(DBMS_SCHEDULER) 2009-06-30 10:32:27.211
*** ACTION NAME:(ADV_SQL_TUNING_1246346508268) 2009-06-30 10:32:27.211

Dump continued from file: /oracle/11.1.0/diag/rdbms/bect/BECT/trace/BECT_j000_18678.trc
ORA-00600: sis?inen virhekoodi, argumentit: [qksrcBuildRwo], [], [], [], [], [], [], []

========= Dump for incident 20162 (ORA 600 [qksrcBuildRwo]) ========

*** 2009-06-30 10:32:27.219
----- Current SQL Statement for this session (sql_id=06y1876p6cr8a) -----
/* SQL Analyze(520,1) */
WITH TARGETS AS
 (SELECT COLUMN_VALUE TARGET_GUID
    FROM TABLE(CAST(:B1 AS MGMT_TARGET_GUID_ARRAY)))
SELECT /*+ ORDERED USE_NL(assoc) USE_NL(cfg)
NO_INDEX_FFS(assoc MGMT_POLICY_ASSOC_PK)
INDEX_ASC(assoc MGMT_POLICY_ASSOC_PK)
NO_INDEX_FFS(cfg MGMT_POLICY_ASSOC_CFG_PK_IDX)
INDEX_ASC(cfg MGMT_POLICY_ASSOC_CFG_PK_IDX) */
ASSOC.OBJECT_GUID TARGET_GUID, LEAD(ASSOC.OBJECT_GUID, 1) OVER(
 ORDER BY ASSOC.OBJECT_GUID, POLICY.POLICY_GUID, CFG.EVAL_ORDER) NEXT_TARGET_GUID, 
POLICY.POLICY_GUID POLICY_GUID, LEAD(POLICY.POLICY_GUID, 1) OVER(
 ORDER BY ASSOC.OBJECT_GUID, POLICY.POLICY_GUID, CFG.EVAL_ORDER) NEXT_POLICY_GUID, 
POLICY.POLICY_NAME, POLICY.POLICY_TYPE, DECODE(POLICY.POLICY_TYPE, :B5, 
NVL(CFG.MESSAGE, POLICY.MESSAGE), :B10, CFG.MESSAGE, NULL) MESSAGE, 
DECODE(POLICY.POLICY_TYPE, :B5, NVL(CFG.MESSAGE_NLSID, POLICY.MESSAGE_NLSID), :B10, 
CFG.MESSAGE_NLSID, NULL) MESSAGE_NLSID, DECODE(POLICY.POLICY_TYPE, :B5, 
NVL(CFG.CLEAR_MESSAGE, POLICY.CLEAR_MESSAGE), :B10, CFG.CLEAR_MESSAGE, NULL) 
CLEAR_MESSAGE, DECODE(POLICY.POLICY_TYPE, :B5, NVL(CFG.CLEAR_MESSAGE_NLSID, POLICY.CLEAR_MESSAGE_NLSID), :B10, 
CFG.CLEAR_MESSAGE_NLSID, NULL) 
CLEAR_MESSAGE_NLSID, POLICY.REPO_TIMING_ENABLED, :B4, 
POLICY.VIOLATION_LEVEL, 
DECODE(POLICY.POLICY_TYPE, :B5, :B11, 0) VIOLATION_TYPE, POLICY.CONDITION_TYPE, 
POLICY.CONDITION, DECODE(POLICY.POLICY_TYPE, :B5, NVL(CFG.CONDITION_OPERATOR, POLICY.CONDITION_OPERATOR), :B10, 
CFG.CONDITION_OPERATOR, 0) CONDITION_OPERATOR, 
CFG.KEY_VALUE, CFG.KEY_OPERATOR, CFG.IS_EXCEPTION, CFG.NUM_OCCURRENCES, 
NULL EVALUATION_DATE, CAST(MULTISET(
SELECT MGMT_POLICY_PARAM_VAL(PARAM_NAME, CRIT_THRESHOLD, WARN_THRESHOLD, INFO_THRESHOLD)
  FROM MGMT_POLICY_ASSOC_CFG_PARAMS PARAM
 WHERE PARAM.OBJECT_GUID = CFG.OBJECT_GUID AND PARAM.POLICY_GUID = CFG.POLICY_GUID 
AND PARAM.COLL_NAME = CFG.COLL_NAME AND PARAM.KEY_VALUE = CFG.KEY_VALUE AND 
PARAM.KEY_OPERATOR = CFG.KEY_OPERATOR) AS MGMT_POLICY_PARAM_VAL_ARRAY) PARAMS, DECODE(POLICY.CONDITION_TYPE, :B9, CAST(MULTISET(
SELECT MGMT_NAMEVALUE_OBJ.NEW(BIND_COLUMN_NAME, BIND_COLUMN_TYPE)
  FROM MGMT_POLICY_BIND_VARS BINDS
 WHERE BINDS.POLICY_GUID = POLICY.POLICY_GUID) AS MGMT_NAMEVALUE_ARRAY), 
MGMT_NAMEVALUE_ARRAY()) BINDS, DECODE(:B8, 0, MGMT_MEDIUM_STRING_ARRAY(), 1, MGMT_MEDIUM_STRING_ARRAY(CFG.KEY_VALUE), CAST((
SELECT MGMT_MEDIUM_STRING_ARRAY(KEY_PART1_VALUE, KEY_PART2_VALUE, KEY_PART3_VALUE, 
KEY_PART4_VALUE, KEY_PART5_VALUE)
  FROM MGMT_METRICS_COMPOSITE_KEYS COMP_KEYS
 WHERE COMP_KEYS.COMPOSITE_KEY = CFG.KEY_VALUE AND COMP_KEYS.TARGET_GUID = CFG.OBJECT_GUID) AS MGMT_MEDIUM_STRING_ARRAY)) KEY_VALUES
  FROM TARGETS, MGMT_POLICIES POLICY, MGMT_POLICY_ASSOC ASSOC, MGMT_POLICY_ASSOC_CFG CFG
 WHERE POLICY.METRIC_GUID = :B7 AND ASSOC.OBJECT_GUID = TARGETS.TARGET_GUID 
AND ASSOC.POLICY_GUID = POLICY.POLICY_GUID AND POLICY.POLICY_TYPE != :B6 AND 
(POLICY.POLICY_TYPE = :B5 OR ASSOC.COLL_NAME = :B4) AND ASSOC.OBJECT_TYPE = :B3 
AND ASSOC.IS_ENABLED = :B2 AND CFG.OBJECT_GUID = ASSOC.OBJECT_GUID AND CFG.POLICY_GUID = ASSOC.POLICY_GUID 
AND 
CFG.COLL_NAME = ASSOC.COLL_NAME
ORDER BY ASSOC.OBJECT_GUID, POLICY.POLICY_GUID, CFG.EVAL_ORDER, CFG.KEY_VALUE DESC

----- PL/SQL Stack -----
----- PL/SQL Call Stack -----
object line object
handle number name
c0000000bf9ee2e8 7294 package body SYS.DBMS_SQLTUNE_INTERNAL
c0000000bfbe76d0 8 SYS.WRI$_ADV_SQLTUNE
c0000000d513f948 545 package body SYS.PRVT_ADVISOR
c0000000d513f948 2597 package body SYS.PRVT_ADVISOR
c0000000cef08358 241 package body SYS.DBMS_ADVISOR
c0000000bc1e9ff8 718 package body SYS.DBMS_SQLTUNE
c0000000bc31c3a0 1 anonymous block

----- Call Stack Trace -----

Function List (to Full stack) (to Summary stack)

skdstdst <- ksedst1 <- ksedst <- dbkedDefDump <- ksedmp <- ksfdmp 
<- dbgexPhaseII <- dbgexProcessError <- dbgeExecuteForError <- dbgePostErrorKGE 
<- 1936 <- dbkePostKGE_kgsf <- 128 <- kgeadse <- kgerinv_internal <- kgerinv 
<- kgeasnmierr <- qksrcBuildRwo <- qknrcAllocate <- $cold_qkadrv <- opitca 
<- kksFullTypeCheck <- rpiswu2 <- kksSetBindType <- kksfbc <- opiexe <- kpoal8 
<- opiodr <- kpoodrc <- rpiswu2 <- kpoodr <- upirtrc <- kpurcsc <- kpuexec 
<- OCIStmtExecute <- qksanAnalyzeSql <- 272 <- qksanAnalyzeSegSql <- kestsaInitialRound 
<- kestsaAutoTuneSql <- kestsaAutoTuneDrv <- kestsTuneSqlDrv <- kesaiExecAction 
<- kesaiTuneSqlDrv <- 176 <- spefcifa <- spefmccallstd <- pextproc <- peftrusted 
<- psdexsp <- rpiswu2 <- psdextp <- pefccal <- pefcal <- pevm_FCAL <- pfrinstr_FCAL 
<- pfrrun_no_tool <- pfrrun <- plsql_run <- peicnt <- kkxexe <- opiexe <- kpoal8 <- opiodr 
<- kpoodr <- upirtrc <- kpurcsc <- kpuexec <- OCIStmtExecute <- jslvec_execcb 
<- jslvswu <- jslve_execute0 <- jslve_execute <- rpiswu2 <- kkjex1e <- kkjsexe <- kkjrdp 
<- opirip <- opidrv <- sou2o <- opimai_real <- main <- main_opd_entry

以上trace中值得注意的是stack trace记录:dbgeExecuteForError <- dbgePostErrorKGE <- dbkePostKGE_kgsf,通过匹配该stack trace point在MOS上可以找到2个相关的Bug记录:

Bug 8340928: XF11.2PREPEND_DML - TRC - QKSRCBUILDRWO:
When a column in the select list references to a view column which produces
a temp LOB, such as a TO_CLOB() operator, then an ORA-600 can occur
when using the result-cache.

Workaround
 Disable the result-cache

Bug 7314587: STARETL ORA-00600 INTERNAL ERROR CODE, ARGUMENTS [QERNCROWP1], [0], [2] RDBMS:

Ora-600 [Qksrcbuildrwo]
Applies to:
Oracle Server - Enterprise Edition - Version: 11.1.0.6 to 11.1.0.7 - Release: 11.1 to 11.1
Symptoms
Getting ORA-00600 [qksrcBuildRwo] in the alert log file when trying to execute select statement.

ERROR:
--------

ORA-00600: internal error code, arguments: [qksrcBuildRwo], [], [], [], [], [], [], [], [], [],
[], []

----- Call Stack Trace -----
dbgexProcessError dbgeExecuteForError dbgePostErrorKGE dbkePostKGE_kgsf
 kgeadse kgerinv_internal kgerinv kgeasnmierr qksrcBuildRwo
  qknrcAllocate qkadrv qkadrv qkadrv opitca kksLoadChild
   kxsGetRuntimeLock kksfbc kkspsc0 kksParseCursor opiosq0
    opiall0 opikpr opiodr rpidrus skgmstack rpidru rpiswu2
     kprball kprbprsu kkxs_parse kkxsprsclb pevm_icd_call_common
      pfrinstr_ICAL pfrrun_no_tool pfrrun plsql_run peicnt
       kkxexe opiexe opiall0 opikpr opiodr rpidrus skgmstack
        rpidru rpiswu2 kprball kzftExHandler kzftAuditExe kzftChkAudit

Changes
Result cache is enabled (result cache is a new feature in 11g).

From the alert log file
-----------------
result_cache_mode = "AUTO"

Cause
The ORA-00600 [Qksrcbuildrwo] is caused by
unpublished Bug 8340928 XF11.2PREPEND_DML - TRC - QKSRCBUILDRWO

Solution
1. Disable result_cache_mode in the spfile/pfile (remove it) or set it to MANUAL

At session level
------------
SQL> alter session set result_cache_mode='MANUAL';
-- Or
At system level
------------
SQL> alter system set result_cache_mode='MANUAL';
-- Or

2. Apply one off Patch 8340928 if available on My Oracle Support for your Oracle Version and Platform.
-- Or
3. Upgrade to 11.2 where unpublished Bug 8340928 is fixed.

可以确定该qksrcBuildRwo内部错误与字典表FILE$的讹误无关,而是由于11g release1中result cache的相关bug引起的;MOS建议通过不适用结果集缓存(result cache)特性来workaround这个错误,或者干脆升级到11g release(11.2.0.1以上,目前最新为11.2.0.2)。

沪ICP备14014813号-2

沪公网安备 31010802001379号