Oracle中的历史公案之翻案：dead transaction到底是谁做recover的？

本文地址：https://www.askmac.cn/archives/ga1.html

作者: Maclean 刘相兵，非授权禁止转载！

很多同学在学习ORACLE OCP的过程中遇到如下的问题：

In the middle of a transaction,a user session was abnormally terminated but the instance is 
still up and the database is open. Which two statements are true in the scenario（方案）? (Choose two).
A. Event viewer gives more details on the failure.
B. The alert log file gives detailed information about the failure.
C. PMON rolls back the transaction and releases the locks.
D. SMON rolls back the transaction and releases the locks.
E. The transaction is rolled backup by the next session that refers to any of the blocks updated by the failed transaction.
F. Data modified by the transaction up to the last commit before the abnormal termination is retained in the database.

标准答案为：C F ，也就是说ocp教材对于会话异常终止锁导致的事务回滚这个情况，认为是由PMON进程做回滚和释放锁的。

但如果我们去实际做个试验看看呢，在这个实验中我们模拟一个会话在做了一个较大的DELETE后被KILL掉，此时DELETE相关的事务需要被回滚，到底是不是PMON出手呢？

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

SQL> select  * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn

SQL>alter system set fast_start_parallel_rollback=false;
System altered.

设置10500，10046事件以跟踪SMON进程的行为

SQL> alter system set events '10500 trace name context forever,level 8';
System altered.

SQL> oradebug setospid 4424
Oracle pid: 8, Unix process pid: 4424, image: oracle@rh2.oracle.com (SMON)

SQL> oradebug event 10046 trace name context forever,level 8;
Statement processed.

在一个新的terminal中执行大批量的删除语句，在执行一段时间后使用操作系统命令将执行该删除操作的
服务进程kill掉，模拟一个大的dead transaction的场景

SQL> delete large_rb;
delete large_rb

[oracle@rh2 bdump]$ kill -9 4535

等待几秒后pmon进程会找出dead process:

[claim lock for dead process][lp 0x7000003c70ceff0][p 0x7000003ca63dad8.1290666][hist x9a514951]

在x$ktube内部视图中出现ktuxecfl(Transaction flags)标记为DEAD的记录:

SQL> select sum(distinct(ktuxesiz)) from x$ktuxe where ktuxecfl = 'DEAD';

SUM(DISTINCT(KTUXESIZ))
-----------------------
                  29386

SQL> /

SUM(DISTINCT(KTUXESIZ))
-----------------------
                  28816

以上KTUXESIZ代表事务所使用的undo块总数(number of undo blocks used by the transaction)

==================smon trace content==================
SMON: system monitor process posted
WAIT #0: nam='log file switch completion' ela= 0 p1=0 p2=0 p3=0 obj#=1 tim=1278243332801935
WAIT #0: nam='log file switch completion' ela= 0 p1=0 p2=0 p3=0 obj#=1 tim=1278243332815568
WAIT #0: nam='latch: row cache objects' ela= 95 address=2979418792 number=200 tries=1 obj#=1 tim=1278243333332734
WAIT #0: nam='latch: row cache objects' ela= 83 address=2979418792 number=200 tries=1 obj#=1 tim=1278243333356173
WAIT #0: nam='latch: undo global data' ela= 104 address=3066991984 number=187 tries=1 obj#=1 tim=1278243347987705
WAIT #0: nam='latch: object queue header operation' ela= 89 address=3094817048 number=131 tries=0 obj#=1 tim=1278243362468042
WAIT #0: nam='log file switch (checkpoint incomplete)' ela= 0 p1=0 p2=0 p3=0 obj#=1 tim=1278243419588202
Dead transaction 0x00c2.008.0000006d recovered by SMON

=====================
PARSING IN CURSOR #3 len=358 dep=1 uid=0 oct=3 lid=0 tim=1278243423594568 hv=3186851936 ad='ae82c1b8'
select smontabv.cnt,
       smontab.time_mp,
       smontab.scn,
       smontab.num_mappings,
       smontab.tim_scn_map,
       smontab.orig_thread
  from smon_scn_time smontab,
       (select max(scn) scnmax,
               count(*) + sum(NVL2(TIM_SCN_MAP, NUM_MAPPINGS, 0)) cnt
          from smon_scn_time
         where thread = 0) smontabv
 where smontab.scn = smontabv.scnmax
   and thread = 0

END OF STMT
PARSE #3:c=0,e=1354526,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=1278243423594556
EXEC #3:c=0,e=106,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1278243423603269
FETCH #3:c=0,e=47065,p=0,cr=319,cu=0,mis=0,r=1,dep=1,og=4,tim=1278243423650375
*** 2011-06-24 21:19:25.899
WAIT #0: nam='smon timer' ela= 299999999 sleep time=300 failed=0 p3=0 obj#=1 tim=1278243716699171
kglScanDependencyHandles4Unpin():
  cumscan=3 cumupin=4 time=776 upinned=0

通过一个10500 trace level 8可以很容易让SMON开口，如上的试验给出了2个最重要的信息：

等待几秒后pmon进程会找出dead process:

[claim lock for dead process][lp 0x7000003c70ceff0][p 0x7000003ca63dad8.1290666][hist x9a514951]

Dead transaction 0x00c2.008.0000006d recovered by SMON

即负责找出dead process异常终止进程的当然是PMON，但实际这个dead transaction 是由SMON做recover的。

如果完全按照试验的结论可以发现,OCP教材上对于此题的答案其实是错误的，即选项C (C. PMON rolls back the transaction and releases the locks.)是错误的，
实际做rollback transaction的是SMON，SMON还会在回滚过程中必要的释放row lock行锁，当然其他内存锁资源还需要PMON去释放。

这是最终的事实吗？ OCP教材误导了学生好多年？
不完全是，其实这里受到隐藏参数_rollback_cleanup_entries 的影响，其大致算法如下：
PMON会负责找到dead transaction 的state object临时对象
如果dead transaction 涉及到的undo记录未超过隐藏参数_rollback_cleanup_entries(默认为100)，则PMON将负责回滚
否则PMON将post SMON，让SMON完成超过_rollback_cleanup_entries的部分
SMON 可以发起并行SLAVE帮助它一起完成rollback

分析ORACLE 源代码Undo事务管理头文件 ktucts.h – Kernel Transaction Undo Compile Time Services 可以看到：


   2904 /* Starting with 8.1.3, cleanup_rollback_entries is an underscore parameter */
   2905 KSPPAR_OBSOLETE("cleanup_rollback_entries", KSPPARM_MADE_UNDERSCORE)
   2906 
   2907 #define ktunud KSPPARDN(ktunud_)
   2908 KSPPARDV("_cleanup_rollback_entries", ktunud_, ktunud_s, ktunud_l, ktunud_p,
   2909          LCCMDINT,
   2910           0, NULLP(const text), 100, KSPLS1(NULLP(text)), KSPLS1(UB4MAXVAL), 0,
   2911           KSPLS1(NULLP(sword)),
   2912           "no. of undo entries to apply per transaction cleanup")

cleanup_rollback_entries参数是从版本8.1.3开始变成隐藏参数的，显然ORACLE当时已经认识到小的dead事务直接让PMON回滚，如果事务较大那么让PMON直接委托SMON来做，同时黄金的比例就是cleanup_rollback_entries=100的经典值。

由于OCP作为入门教材并不过分强调学习者对ORACLE内部原理的理解，所以只需要大众记住由PMON负责做dead transaction recover即可。

但如果我们更进一步地去观察ORACLE就会发现并不是那么回事。所以在过去20多年中对于到底是SMON还是PMON做dead transaction的rollback有着无休止的讨论，因为教材与实际的表现有着较大的差异。

在某些特殊的情况下ORACLE Support会建议你将_cleanup_rollback_entries设置为400或1000，这种情况并不多见，主要是为了调试一些性能问题。

通过这篇文章或许可以为此ORACLE历史公案小小的翻一下案，让幕后功臣SMON的默默功绩不至于彻底埋没。

Oracle中的历史公案之翻案：dead transaction到底是谁做recover的？

Comment 取消回复