了解你所不知道的SMON功能(三):清理obj$基表

SMON的作用还包括清理obj$数据字典基表(cleanup obj$)

OBJ$字典基表是Oracle Bootstarp启动自举的重要对象之一:

SQL> set linesize 80 ;
SQL> select sql_text from bootstrap$ where sql_text like 'CREATE TABLE OBJ$%';

SQL_TEXT
--------------------------------------------------------------------------------
CREATE TABLE OBJ$("OBJ#" NUMBER NOT NULL,"DATAOBJ#" NUMBER,"OWNER#" NUMBER NOT N
ULL,"NAME" VARCHAR2(30) NOT NULL,"NAMESPACE" NUMBER NOT NULL,"SUBNAME" VARCHAR2(
30),"TYPE#" NUMBER NOT NULL,"CTIME" DATE NOT NULL,"MTIME" DATE NOT NULL,"STIME"
DATE NOT NULL,"STATUS" NUMBER NOT NULL,"REMOTEOWNER" VARCHAR2(30),"LINKNAME" VAR
CHAR2(128),"FLAGS" NUMBER,"OID$" RAW(16),"SPARE1" NUMBER,"SPARE2" NUMBER,"SPARE3
" NUMBER,"SPARE4" VARCHAR2(1000),"SPARE5" VARCHAR2(1000),"SPARE6" DATE) PCTFREE
10 PCTUSED 40 INITRANS 1 MAXTRANS 255 STORAGE (  INITIAL 16K NEXT 1024K MINEXTEN
TS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 OBJNO 18 EXTENTS (FILE 1 BLOCK 121))

触发场景

OBJ$基表是一张低级数据字典表，该表几乎对库中的每个对象(表、索引、包、视图等)都包含有一行记录。很多情况下，这些条目所代表的对象是不存在的对象(non-existent)，引起这种现象的一种可能的原因是对象本身已经被从数据库中删除了，但是对象条目仍被保留下来以满足消极依赖机制(negative dependency)。因为这些条目的存在会导致OBJ$表不断膨胀，这时就需要由SMON进程来删除这些不再需要的行。SMON会在实例启动(after startup of DB is started cleanup function again)时以及启动后的每12个小时执行一次清理任务(the cleanup is scheduled to run after startup and then every 12 hours)。

我们可以通过以下演示来了解SMON清理obj$的过程:

SQL>  BEGIN
  2      FOR i IN 1 .. 5000 LOOP
  3      execute immediate ('create synonym gustav' || i || ' for
  4  perfstat.sometable');
  5      execute immediate ('drop   synonym gustav' || i );
  6      END LOOP;
  7    END;
  8    /

PL/SQL procedure successfully completed.

SQL> startup force;
ORACLE instance started.

Total System Global Area 1065353216 bytes
Fixed Size                  2089336 bytes
Variable Size             486542984 bytes
Database Buffers          570425344 bytes
Redo Buffers                6295552 bytes
Database mounted.
Database opened.

SQL>   select count(*) from user$ u, obj$ o
  2        where u.user# (+)=o.owner# and o.type#=10 and not exists
  3        (select p_obj# from dependency$ where p_obj# = o.obj#);

  COUNT(*)
----------
      5000

SQL> /

  COUNT(*)
----------
      5000

SQL> /

  COUNT(*)
----------
      4951

SQL> oradebug setospid 18457;
Oracle pid: 8, Unix process pid: 18457, image: oracle@rh2.oracle.com (SMON)

SQL> oradebug event 10046 trace name context forever ,level 1;
Statement processed.

SQL> oradebug tracefile_name;
/s01/admin/G10R2/bdump/g10r2_smon_18457.trc

select o.owner#,
       o.obj#,
       decode(o.linkname,
              null,
              decode(u.name, null, 'SYS', u.name),
              o.remoteowner),
       o.name,
       o.linkname,
       o.namespace,
       o.subname
  from user$ u, obj$ o
 where u.use r#(+) = o.owner#
   and o.type# = :1
   and not exists
 (select p_obj# from dependency$ where p_obj# = o.obj#)
 order by o.obj#
   for update

select null
  from obj$
 where obj# = :1
   and type# = :2
   and obj# not in
       (select p_obj# from dependency$ where p_obj# = obj$.obj#)

delete from obj$ where obj# = :1

/* 删除过程其实较为复杂，可能要删除多个字典基表上的记录 */

现象

我们可以通过以下查询来了解obj$基表中NON-EXISTENT对象的条目总数(type#=10)，若这个总数在不断减少说明smon正在执行清理工作

    select trunc(mtime), substr(name, 1, 3) name, count(*)
      from obj$
     where type# = 10
       and not exists (select * from dependency$ where obj# = p_obj#)
     group by trunc(mtime), substr(name, 1, 3);

      select count(*)
        from user$ u, obj$ o
       where u.user#(+) = o.owner#
         and o.type# = 10
         and not exists
       (select p_obj# from dependency$ where p_obj# = o.obj#);

如何禁止SMON清理obj$基表

我们可以通过设置诊断事件event=’10052 trace name context forever’来禁止SMON清理obj$基表，当我们需要避免SMON因cleanup obj$的相关代码而意外终止或spin从而开展进一步的诊断时可以设置该诊断事件。在Oracle并行服务器或RAC环境中，也可以设置该事件来保证只有特定的某个节点来执行清理工作。

10052, 00000, "don't clean up obj$"

alter system set events '10052 trace name context forever, level 65535';

Problem Description: We are receiving the below warning during db startup:
WARNING: kqlclo() has detected the following :
Non-existent object 37336 NOT deleted because an object
of the same name exists already.
Object name: PUBLIC.USER$

This is caused by the SMON trying to cleanup the SYS.OJB$.
SMON cleans all dropped objects which have a SYS.OBJ$.TYPE#=10.
This can happen very often when you create an object that have the same name as a public synonym.

When SMON is trying to remove non-existent objects and fails because there are duplicates,
multiple nonexistent objects with same name.
This query will returned many objects with same name under SYS schema:

select o.name,u.user# from user$ u, obj$ o where u.user# (+)=o.owner# and o.type#=10
and not exists (select p_obj# from dependency$ where p_obj# = o.obj#);

To cleanup this message:

Take a full backup of the database - this is crucial. If anything goes wrong during this procedure,
your only option would be to restore from backup, so make sure you have a good backup before proceeding.
We suggest a COLD backup. If you plan to use a HOT backup, you will have to restore point in time if any problem happens

Normally DML against dictionary objects is unsupported,
but in this case we know exactly what the type of corruption,
also you are instructing to do this under guidance from Support.

Data dictionary patching must be done by an experienced DBA.
This solution is unsupported.
It means that if there were problems after applying this solution, a database backup must be restored.

1. Set event 10052 at parameter file to disable cleanup of OBJ$ by SMON

EVENT="10052 trace name context forever, level 65535"

2. Startup database in restricted mode

3. Delete from OBJ$, COMMIT

SQL> delete from obj$ where (name,owner#) in ( select o.name,u.user# from user$ u, obj$ o
where u.user# (+)=o.owner# and o.type#=10 and not exists (select p_obj# from
dependency$ where p_obj# = o.obj#) );

SQL> commit;

SQL> Shutdown abort.

4. remove event 10052 from init.ora

5. Restart the database and monitor for the message in the ALERT LOG file

Hdr: 3530448 9.2.0.4 RDBMS 9.2.0.4 TXN MGMT LOCAL PRODID-5 PORTID-23 3139157 Abstract: ORA-601 IN KQLCLO FUNCTION CAUSES SMON TO STOP CLEANING UP DELETED OBJ$ OBJECTS PROBLEM: -------- Ct has many deleted rows in obj$ which should be cleaned up by kqlclo smon function. This is scheduled to run after startup and then every 12 hours. During execution of this cleanup function often occours ORA-601 "Cleanup Lock Conflict" error from delete of obj$ row. After this error all cleanup is interupted. No error message is written into alert.log file. There stay thousands of rows in obj$ which should be cleaned up. Next time is function scheduled to execute after 12 hours, but during that time new objects in obj$ are created which should be deleted, so number of non-deleted objects is still increasing. This causes performance problems - after startup of DB is started cleanup function again. When Ct needs then shutdown database, it is hanged even over 40 minutes, till cleanup is completed. Other problem is, that count of non deleted objects in obj$ is increasing. There is no way how to manually remove them - only repeat shutdown/startup or manually by calling kqlclo function from oradebug (which is unsupported). It is also not possible to change 12 hour frequency as it is hard coded in ktmmon. If we get ORA-601 for one objects, at least others should be deleted. DIAGNOSTIC ANALYSIS: -------------------- Producing 10046 event from SMON shows: ===================== PARSING IN CURSOR #3 len=32 dep=1 uid=0 oct=7 lid=0 tim=1793719793264 hv=3821382080 ad='3badbeb0' delete from obj$ where obj# = :1 END OF STMT PARSE #3:c=0,e=20,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1793719793257 EXEC #1:c=290000,e=280180,p=0,cr=6659,cu=579,mis=0,r=0,dep=1,og=4,tim=1793720091068 ERROR #1:err=601 tim=183676937 WAIT #0: nam='log file sync' ela= 3444 p1=4532 p2=0 p3=0 *** 12:56:06.498 WAIT #0: nam='smon timer' ela= 69045262 p1=300 p2=0 p3=0 ======================= After err=601 SMON stops executing cleanup and gets iddle - wait event 'smon timer'. Folowing selects shows not cleaned objects: select trunc (mtime), substr(name,1,3) name, count(*) from obj$ where type#=10 and not exists ( select * from dependency$ where obj#=p_obj# ) group by trunc(mtime), substr(name,1,3); TRUNC(MTIME) NAM COUNT(*) ----------------- --- ---------- 23-mar-2004 00:00 GL_ 5894 24-mar-2004 00:00 GL_ 5947 This is system with Oracle Applications, which is creating such high number of deleted objects - no way to change this. WORKAROUND: ----------- execute kqlclo function using oradebug - not supported RELATED BUGS: ------------- BUG 3001270 REPRODUCIBILITY: ---------------- 9.2.0.4 Solaris SPARC 64bit - Customer system 9.2.0.4 Windows 2000 - inhouse with testcase 9.2.0.4 Linux - inhouse with testcase TEST CASE: ---------- 1. Create type#10 objects using: BEGIN FOR i IN 1 .. 5000 LOOP execute immediate ('create synonym gustav' || i || ' for perfstat.sometable'); execute immediate ('drop synonym gustav' || i ); END LOOP; END; / 2. startup force DB. 3. monitor cleanup progress using: select count(*) from user$ u, obj$ o where u.user# (+)=o.owner# and o.type#=10 and not exists (select p_obj# from dependency$ where p_obj# = o.obj#); 4. when count stops decreasing, check that smon is iddle - 'smon timer' in v$session_wait. There stay objects which are not cleaned up. 10046 trace of smon from this testcase will show err=601 error.

Hdr: 1310142 8.1.6 RDBMS 8.1.6 DICTIONARY PRODID-5 PORTID-610 ORA-600 Abstract: SMON CRASHES INSTANCE WITH ERROR ORA-600 [16224] ========================= PROBLEM: 10-node OPS. On May 17, SMON encounters ORA-600 [16224] and terminates the instance. This had happened once before on 5-May-2000, however when SMON got ORA-600 [16224] on 5-May, it did not terminate the instance. ========================= DIAGNOSTIC ANALYSIS: smon_23888_pyrt.trc.f2n7 trace file shows that smon was executing the following sql at the time it got the error: select o.owner#,o.obj#,decode(o.linkname,null, decode(u.name,null,'SYS',u.name), o.remoteowner), o.name,o.linkname,o.namespace from user$ u, obj$ o where u.user# (+)=o.owner# and o.type#=:1 and not exists (select p_obj# from dependency$ where p_obj# = o.obj#) The value of the bind variable for type# was 10. This looks very similar to bug 1279050. The stack trace below is the same as bug 1279050. Based on the discussion in that bug, I had the customer issue the above query to see how many rows in obj$ were for non-existent objects. The query came back with 9 rows: OWNER# OBJ# DECODE(O.LINKNAME,NULL,DECODE( NAME LINKNAME NAMESPACE ---------- ---------- ------------------------------ ------------------------------ ---------- ---------- 64 9887 SYSWIR MD_TRANSACTION_EXCEPTION_LOG 1 1 9888 PUBLIC MD_TRANSACTION_EXCEPTION_LOG 1 22 9905 PAYMGR CSD_OUT 1 1 9906 PUBLIC CSD_OUT 1 22 9910 PAYMGR L_CSD_PROG 1 1 9911 PUBLIC L_CSD_PROG 1 65 9912 CONDET MD_TRANSACTIONS 1 22 9926 PAYMGR L_REF_SRCE 1 1 9927 PUBLIC L_REF_SRCE 1 ========================= WORKAROUND: None ========================= RELATED BUGS: 1279050 ========================= REPRODUCIBILITY: 1. State if the problem is reproducible; indicate where and predictability Not reproducible at will. 2. List the versions in which the problem has reproduced 8.1.6 ========================= TESTCASE: None ======================== STACK TRACE: *** ID:(7.1) 2000-05-17 01:25:50.595 *** 01:25:50.595 ksedmp: internal or fatal error ORA-600: internal error code, arguments: [16224], [], [], [], [], [], [], [] ----- Call Stack Trace ----- calling call entry argument values in hex location type point (? means dubious value) -------------------- -------- -------------------- ---------------------------- ksedmp+00d0 bl ksedst 119FE2C4 ? ksfdmp+0018 bl ksedmp 20009594 ? kgeriv+00f0 bl _ptrgl kgeasi+00cc bl kgeriv 65041 ? 594D4752 ? 0 ? 0 ? 0 ? kqlclo+05e8 bl kgeasi 20009520 ? 20118264 ? 3F60 ? 2 ? 0 ? 20001F68 ? 0 ? 4C12D663 ? ktmmon+0bd0 bl kqlclo ksbrdp+0244 bl _ptrgl opirip+0224 bl ksbrdp opidrv+0498 bl opirip 10001490 ? 0 ? 0 ? sou2o+0028 bl opidrv 32 ? 0 ? 0 ? main+0248 bl sou2o F02934F0 ? F02934E0 ? 2FF22D00 ? 0 ? __start+0088 bl main 0 ? 0 ? ========================= SUPPORTING INFORMATION: Files on bug$:[bug1310142] ========================= alert_pyrt.log - alert log on 5/17/00, showing ORA-600 [16224] error. ora600_event.log - the above results of the query smon_18710_pyrt.trc - The smon trace file from 5/5/00. smon_23888_pyrt.trc - The smon trace file from 5/17/00 Norman, can you check if these 9 rows are static...that is, are they always returned when the query is run? Had ct. re-run the query. Results are in orcl_supp1.txt. Comparing this to ora600_event.log. No, the obj# numbers are changing. So the rows are not static. However, these two queries are now days apart. They had turned on event 10052, but then commented it out when they had problems with the node crashing. I don't think that event 10052 had anything to do with their node crashing, but it was the only thing that changed, so the event was commented out. Given these circumstances, it is understandable that the results from the query are changing.. Had ct. rerun the query again. The results are in orcl_supp2.log. This shows the output is changing, with obj# 10365 - 10368. orcl_supp1.txt had 16 rows between obj#10337 and 10354. Given that the entries in obj$ of type#=10 are changing, it doesn't seem that there is any corruption in obj$ itself which is causing failure to clean up objects. I have started working on this. I will have more detailed information by tomorrow. In the meanwhile, can you tell me what "CSD_OUT" is? I am particularly interested in the one that resides in the PUBLIC schema. Also, can you tell me how this object is used? Thanks. ]] During cleanup of OBJ$, SMON no longer crashes ]] the instance with error ORA-600[16224]

Comments

maclean says

2011/06/09 at 20:06

admin says

2011/06/09 at 20:08

sqlrush says

2011/12/07 at 14:28

我在11.2.0.2下做了这个测试，收获到的结果有点意思。每次设定10046后，smon开始删除过期信息，如果执行查询的话，会出现DFS lock handle等待，删除停止。

回复
- Maclean Liu says
  
  2011/12/07 at 18:50
  
  Good!
  
  DFS lock handle 一般出现在RAC中，可能是为了获取Cross Instance CI的enqueue资源
  
  回复
loryliu says

2013/08/10 at 16:43

设置10500也可以看到instance在启动的时候在清理obj$SMON: obj$ cleanup beginSMON: obj$ cleanup end. more:0

回复

了解你所不知道的SMON功能(三):清理obj$基表

Comments

Trackbacks

Comment 取消回复