11.2.0.3 下尝试使用11g health monitor新特性时出现了ORA-00604、ORA-01427, 查询MOS发现 (Bug 12385172: ORA-01427 WHEN EXECUTING DBMS_HM.RUN_CHECK),当 DB中存在case when then的function index时会触发该BUG:
SQL> select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production PL/SQL Release 11.2.0.3.0 - Production CORE 11.2.0.3.0 Production TNS for Linux: Version 11.2.0.3.0 - Production NLSRTL Version 11.2.0.3.0 - Production SQL> select * from global_name; GLOBAL_NAME -------------------------------------------------------------------------------- www.askmac.cn SQL> exec dbms_hm.run_check('Dictionary Integrity Check','check-2'); BEGIN dbms_hm.run_check('Dictionary Integrity Check','check-2'); END; * ERROR at line 1: ORA-00604: error occurred at recursive SQL level 1 ORA-01427: single-row subquery returns more than one row ORA-06512: at "SYS.DBMS_HM", line 191 ORA-06512: at line 1
可以通过以下脚本找出 DB中case when then类型的函数索引:
-- Determine DDL statements (note: this will take a while to return results!) set long 100000 exec dbms_metadata.set_transform_param(dbms_metadata.session_transform,'PRETTY',true); exec dbms_metadata.set_transform_param(dbms_metadata.session_transform,'TABLESPACE',false); exec dbms_metadata.set_transform_param(dbms_metadata.session_transform,'SEGMENT_ATTRIBUTES',false); exec dbms_metadata.set_transform_param(dbms_metadata.session_transform,'STORAGE',false); -- Checking the DDL statement col DDL form a100 word_wrapped select dbms_metadata.get_ddl(RTRIM(UPPER(object_type)), RTRIM(UPPER(object_name)), RTRIM(UPPER(owner))) DDL from DBA_OBJECTS where object_type='INDEX' and object_id in (select x from (select obj# x, obj#||','||intcol#, count(obj#||','||intcol#) from ICOLDEP$ group by obj#, obj#||','||intcol# having count(*) > 1) );
对于安装了APEX 组件或者在DBCA创建数据库时选择了General Purpose从Seed中clone数据库而非Custom Database的DB ,都会创建有”APEX_030200″.”WWV_FLOW_WORKSHEETS_UNQ_IDX”、”APEX_030200″.”WWV_FLOW_WS_UNQ_ALIAS_IDX”、”APEX_030200″.”WWV_FLOW_WORKSHEET_RPTS_UK” 三个函数索引。
如果没有实际使用APEX组件的话,我们可以直接DROP掉APEX_030200:
SQL> drop user "APEX_030200" cascade; User dropped. SQL> set long 100000 SQL> SQL> exec dbms_metadata.set_transform_param(dbms_metadata.session_transform,'PRETTY',true); PL/SQL procedure successfully completed. SQL> exec dbms_metadata.set_transform_param(dbms_metadata.session_transform,'TABLESPACE',false); PL/SQL procedure successfully completed. SQL> exec dbms_metadata.set_transform_param(dbms_metadata.session_transform,'SEGMENT_ATTRIBUTES',false); PL/SQL procedure successfully completed. SQL> exec dbms_metadata.set_transform_param(dbms_metadata.session_transform,'STORAGE',false); PL/SQL procedure successfully completed. SQL> SQL> -- Checking the DDL statement SQL> col DDL form a100 word_wrapped SQL> select dbms_metadata.get_ddl(RTRIM(UPPER(object_type)), 2 RTRIM(UPPER(object_name)), 3 RTRIM(UPPER(owner))) DDL 4 from DBA_OBJECTS 5 where object_type='INDEX' 6 and object_id 7 in (select x from (select obj# x, obj#||','||intcol#, count(obj#||','||intcol#) 8 from ICOLDEP$ 9 group by obj#, obj#||','||intcol# having count(*) > 1) 10 ); no rows selected
再次尝试测试health check dictionary 发现问题仍存在:
SQL> exec dbms_hm.run_check('Dictionary Integrity Check','check-mac3'); BEGIN dbms_hm.run_check('Dictionary Integrity Check','check-mac3'); END; * ERROR at line 1: ORA-00604: error occurred at recursive SQL level 1 ORA-01427: single-row subquery returns more than one row ORA-06512: at "SYS.DBMS_HM", line 191 ORA-06512: at line 1
到这一步决定自己来诊断这个ORA-01427错误的根源, 因为是递归SQL层出现故障,所以这里我们可以用到ERRORSTACK来深入了解问题:
SQL> oradebug setmypid; Statement processed. SQL> oradebug event 1427 trace name errorstack level 4; Statement processed. /* 以上我们设置当触发1427错误事件时TRACE level 4的错误堆栈ERRORSTACK */ SQL> exec dbms_hm.run_check('Dictionary Integrity Check','check-mac4'); BEGIN dbms_hm.run_check('Dictionary Integrity Check','check-mac4'); END; * ERROR at line 1: ORA-00604: error occurred at recursive SQL level 1 ORA-01427: single-row subquery returns more than one row ORA-06512: at "SYS.DBMS_HM", line 191 ORA-06512: at line 1 /* 触发ORA-01427 错误 将生成相关TRACE 信息*/ SQL> oradebug tracefile_name /s01/orabase/diag/rdbms/vprod/VPROD1/trace/VPROD1_ora_7781.trc
来进一步观察生成的TRACE文件:
*** 2012-04-30 09:20:55.438 dbms_hm: (In run_check) Begin dbkhicd_run_check dbkh_run_check_internal: BEGIN; check_namep=Dictionary Integrity Check, run_namep=check-mac4 dbkh_run_check_internal: BEGIN; timeout=0 dbkh_run_check_internal: AFTER RUN CREATE; run_id=1281 *** 2012-04-30 09:20:55.603 dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=4, mask=0x0) ----- Error Stack Dump ----- ORA-01427: single-row subquery returns more than one row ----- Current SQL Statement for this session (sql_id=gxjzd1s7m8xfj) ----- select 52, rowid, 'ind$.obj#' from IND$ where obj# < 0 union all select 57, rowid, 'ind$.type#' from IND$ where type# not between 1 and 9 union all select 58, rowid, 'ind$.pctfree$' from IND$ where pctfree$ not between 0 and 99 union all select 59, rowid, 'ind$.analyzetime <= SYSDATE' from IND$ where analyzetime > SYSDATE union all select 51, rowid, 'ind$.obj# pk' from IND$ where obj# is null union all select 51, rowid, 'ind$.obj# pk' from IND$ where 1 > (select obj# from IND$ group by obj# having count(*) > 1) union all select 53, rowid, 'ind$.dataobj# range' from IND$ where 1 > (select dataobj# from IND$ group by dataobj# having count(*) > 1) union all select 54, rowid, 'ind$.ts# fk' from IND$ where (ts#) in (select ts# from IND$ where (ts#) not in (select ts# from ts$) and ts# != 2147483647) union all select 55, rowid, 'ind$.ts,file,block fk' from IND$ where (ts#, file#, block#) in (select ts#, file#, block# from IND$ where (ts#, file#, block#) not in (select ts#, file#, block# from seg$) and file# != 0 and block# != 0) union all select 56, rowid, 'ind$.obj# fk_obj$' from IND$ where (obj#) in (select obj# from IND$ where (obj#) not in (select obj# from obj$)) ----- PL/SQL Stack ----- ----- PL/SQL Call Stack ----- object line object handle number name 0xb1269160 191 package body SYS.DBMS_HM 0xb1d9f600 1 anonymous block
实际触发ORA-01427的是一条较长的递归SQL语句,该SQL由多个部分UNION ALL组合而成负责检测IND$基表是否存在逻辑不一致, 实际检测可以发现真真存在问题的是 这一段SQL:
select 53, rowid, 'ind$.dataobj# range' from IND$ where 1 > (select dataobj# from IND$ group by dataobj# having count(*) > 1) ERROR at line 4: ORA-01427: single-row subquery returns more than one row SQL> select dataobj# from IND$ group by dataobj# having count(*) > 1; DATAOBJ# ---------- 75601 75599 75594 75605
IND$ 基表上居然存在多条dataobj#重复的记录,我们来看看是哪些对象:
select /*+ first_rows */ owner, object_name, data_object_id from dba_objects where data_object_id in (select dataobj# from IND$ group by dataobj# having count(*) > 1) order by 3 ; OWNER OBJECT_NAME DATA_OBJECT_ID ------------------------------ ------------------------------ -------------- SYS SYS_C0010990 75594 OE WHS_LOCATION_IX 75594 OE ORD_CUSTOMER_IX 75599 SYS SYS_IOT_TOP_75598 75599 SYS SYS_IOT_TOP_75600 75601 OE CUST_ACCOUNT_MANAGER_IX 75601 OE PROD_SUPPLIER_IX 75605 SYS SYS_IOT_TOP_75603 75605 8 rows selected.
OE这个Sample Schema下的多个索引居然和SYS用户的一些索引的DATA_OBJECT_ID重号; 我们不可能去改动SYS下的对象,而OE这个Schema则无关紧要,删除这些OE下的问题索引:
SQL> drop index oe.WHS_LOCATION_IX; Index dropped. SQL> drop index oe.ORD_CUSTOMER_IX; Index dropped. SQL> drop index oe.CUST_ACCOUNT_MANAGER_IX; Index dropped. SQL> drop index oe.PROD_SUPPLIER_IX; Index dropped. SQL> select dataobj# from IND$ group by dataobj# having count(*) > 1; DATAOBJ# ----------
再次测试后成功执行Dictionary Integrity Check
SQL> exec dbms_hm.run_check('Dictionary Integrity Check','check-mac5'); PL/SQL procedure successfully completed. SQL> set pause on; SQL> spool dic_check SQL> SET LONG 100000 SQL> SET LONGCHUNKSIZE 1000 SQL> SET PAGESIZE 100 SQL> SET LINESIZE 512 SQL> SELECT DBMS_HM.GET_RUN_REPORT('CHECK-MAC5') FROM DUAL; DBMS_HM.GET_RUN_REPORT('CHECK-MAC5') ----------------------------------------------------- Basic Run Information Run Name : check-mac5 Run Id : 1301 Check Name : Dictionary Integrity Check Mode : MANUAL Status : COMPLETED Start Time : 2012-04-30 09:33:28.540140 -04:00 End Time : 2012-04-30 09:33:32.303679 -04:00 Error Encountered : 0 Source Incident Id : 0 Number of Incidents Created : 0 Input Paramters for the Run TABLE_NAME=ALL_CORE_TABLES CHECK_MASK=ALL Run Findings And Recommendations Finding Finding Name : Dictionary Inconsistency Finding ID : 1302 Type : FAILURE Status : OPEN Priority : CRITICAL Message : SQL dictionary health check: syn$.owner fk 95 on object SYN$ failed Message : Damaged rowid is AAAABEAABAAANWgAB7 - description: Synonymn APEX is referenced Finding Finding Name : Dictionary Inconsistency Finding ID : 1305 Type : FAILURE Status : OPEN Priority : CRITICAL Message : SQL dictionary health check: syn$.owner fk 95 on object SYN$ failed Message : Damaged rowid is AAAABEAABAAANWhAAu - description: Synonymn APEXWS is referenced Finding Finding Name : Dictionary Inconsistency Finding ID : 1308 Type : FAILURE Status : OPEN Priority : CRITICAL Message : SQL dictionary health check: syn$.owner fk 95 on object SYN$ failed Message : Damaged rowid is AAAABEAABAAANWgACO - description: Synonymn APEX_ACTIVITY_LOG is referenced Finding Finding Name : Dictionary Inconsistency Finding ID : 1311 Type : FAILURE Status : OPEN Priority : CRITICAL Message : SQL dictionary health check: syn$.owner fk 95 on object SYN$ failed Message : Damaged rowid is AAAABEAABAAANWgABl - description: Synonymn APEX_ADMIN is referenced Finding Finding Name : Dictionary Inconsistency Finding ID : 1314 Type : FAILURE Status : OPEN Priority : CRITICAL Message : SQL dictionary health check: syn$.owner fk 95 on object SYN$ failed Message : Damaged rowid is AAAABEAABAAANWgACB - description: Synonymn APEX_APPLICATION is referenced
这个case希望大家能了解的是对于ORA-00604这类递归SQL层的错误,报错信息本身给出的诊断信息是不完整的,需要我们通过一些工具来深入了解实际引发错误的是哪一条SQL语句,这些Recusive SQL出错的主要原因往往是BUG、或者数据字典存在不一致。如何在脱离MOS和SR帮助的情况下,安全地WorkAround绕过这个错误。