今天在 ORACLE ALLSTAR群里讨论了一个关于队列锁死锁检测的问题,原帖子在这里。 有同学指出对于enqueue lock的死锁检测应当是每3 秒钟检测一次,这样说的依据是通过一个简单可重复的实验可以证明在实际出现ora-00060 dead lock错前process等待了3s:
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE 10.2.0.5.0 Production
TNS for Linux: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production
SQL> select * from global_name;
GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn
PROCESS A:
set timing on;
update maclean1 set t1=t1+1;
PROCESS B:
update maclean2 set t1=t1+1;
PROCESS A:
update maclean2 set t1=t1+1;
PROCESS B:
update maclean1 set t1=t1+1;
等待3s后 PROCESS A 会报
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
Elapsed: 00:00:03.02
可以看到Process A在检测到死锁之前确实等待了 3s,而且这是一个可以重复的实验,很具有说服力。
事实真的是这样的吗?
来看下面的演示:
SQL> col name for a30 SQL> col value for a5 SQL> col DESCRIB for a50 SQL> set linesize 140 pagesize 1400 SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ 2 FROM SYS.x$ksppi x, SYS.x$ksppcv y 3 WHERE x.inst_id = USERENV ('Instance') 4 AND y.inst_id = USERENV ('Instance') 5 AND x.indx = y.indx 6 AND x.ksppinm='_enqueue_deadlock_scan_secs'; NAME VALUE DESCRIB ------------------------------ ----- -------------------------------------------------- _enqueue_deadlock_scan_secs 0 deadlock scan interval SQL> alter system set "_enqueue_deadlock_scan_secs"=18 scope=spfile; System altered. Elapsed: 00:00:00.01 SQL> startup force; ORACLE instance started. Total System Global Area 851443712 bytes Fixed Size 2100040 bytes Variable Size 738198712 bytes Database Buffers 104857600 bytes Redo Buffers 6287360 bytes Database mounted. Database opened. PROCESS A: SQL> set timing on; SQL> update maclean1 set t1=t1+1; 1 row updated. Elapsed: 00:00:00.06 Process B SQL> update maclean2 set t1=t1+1; 1 row updated. SQL> update maclean1 set t1=t1+1; Process A: SQL> SQL> alter session set events '10704 trace name context forever,level 10:10046 trace name context forever,level 8'; Session altered. SQL> update maclean2 set t1=t1+1; update maclean2 set t1=t1+1 * ERROR at line 1: ORA-00060: deadlock detected while waiting for resource Elapsed: 00:00:18.05 ksqcmi: TX,90011,4a9 mode=6 timeout=21474836 WAIT #12: nam='enq: TX - row lock contention' ela= 2930070 name|mode=1415053318 usn<<16 | slot=589841 sequence=1193 obj#=56810 tim=1308114759849120 WAIT #12: nam='enq: TX - row lock contention' ela= 2930636 name|mode=1415053318 usn<<16 | slot=589841 sequence=1193 obj#=56810 tim=1308114762779801 WAIT #12: nam='enq: TX - row lock contention' ela= 2930439 name|mode=1415053318 usn<<16 | slot=589841 sequence=1193 obj#=56810 tim=1308114765710430 *** 2012-06-12 09:58:43.089 WAIT #12: nam='enq: TX - row lock contention' ela= 2931698 name|mode=1415053318 usn<<16 | slot=589841 sequence=1193 obj#=56810 tim=1308114768642192 WAIT #12: nam='enq: TX - row lock contention' ela= 2930428 name|mode=1415053318 usn<<16 | slot=589841 sequence=1193 obj#=56810 tim=1308114771572755 WAIT #12: nam='enq: TX - row lock contention' ela= 2931408 name|mode=1415053318 usn<<16 | slot=589841 sequence=1193 obj#=56810 tim=1308114774504207 DEADLOCK DETECTED ( ORA-00060 ) [Transaction Deadlock] The following deadlock is not an ORACLE error. It is a deadlock due to user error in the design of an application or from issuing incorrect ad-hoc SQL. The following information may aid in determining the deadlock:
以上可以看到Process A从’enq: TX – row lock contention’ 等待到触发ORA-00060 deadlock detected的时间从3s 变成了 18s , 恰巧是hidden parameter “_enqueue_deadlock_scan_secs”所指定的值,该隐藏参数默认为0。
我们再来看另一个实验:
SQL> alter system set "_enqueue_deadlock_scan_secs"=4 scope=spfile; System altered. Elapsed: 00:00:00.01 SQL> alter system set "_enqueue_deadlock_time_sec"=9 scope=spfile; System altered. Elapsed: 00:00:00.00 SQL> startup force; ORACLE instance started. Total System Global Area 851443712 bytes Fixed Size 2100040 bytes Variable Size 738198712 bytes Database Buffers 104857600 bytes Redo Buffers 6287360 bytes Database mounted. Database opened. SQL> set linesize 140 pagesize 1400 SQL> show parameter dead NAME TYPE VALUE ------------------------------------ -------------------------------- ------------------------------ _enqueue_deadlock_scan_secs integer 4 _enqueue_deadlock_time_sec integer 9 SQL> set timing on SQL> select * from maclean1 for update wait 8; T1 ---------- 11 Elapsed: 00:00:00.01 PROCESS B SQL> select * from maclean2 for update wait 8; T1 ---------- 3 SQL> select * from maclean1 for update wait 8; select * from maclean1 for update wait 8 PROCESS A SQL> select * from maclean2 for update wait 8; select * from maclean2 for update wait 8 * ERROR at line 1: ORA-30006: resource busy; acquire with WAIT timeout expired Elapsed: 00:00:08.00
以上我们可以看到 当指定 select for update wait的enqueue request timeout 等待超时为8s时 ,虽然我们指定了”_enqueue_deadlock_scan_secs”=4(deadlock scan interval
),即4s做一次deadlock detected,但是实际Process A并没有做deadlock 的检测, 为什么这样说?
因为Process A在这里等待8s后raised的是”ORA-30006: resource busy; acquire with WAIT timeout expired”错误,而非ORA-00060,这就说明了process A没有做死锁检测。
这是因为隐藏参数”_enqueue_deadlock_time_sec”(requests with timeout <= this will not have deadlock detection)的存在,当enqueue request time < “_enqueue_deadlock_time_sec”时Server process就会放弃做dead lock detection,这是因为即然指定了enqueue request 超时的时间且timeout值又不是很大(_enqueue_deadlock_time_sec默认值为5,即timeout<5s),那么不做死锁检测也是可以接受的;而对于指定的timeout>”_enqueue_deadlock_time_sec”的情况,Oracle仍需要定时做死锁检测避免长时间的死锁发生。
我们再来看下面的演示:
SQL> show parameter dead
NAME TYPE VALUE
------------------------------------ -------------------------------- ------------------------------
_enqueue_deadlock_scan_secs integer 4
_enqueue_deadlock_time_sec integer 9
Process A:
SQL> set timing on;
SQL> select * from maclean1 for update wait 10;
T1
----------
11
Process B:
SQL> select * from maclean2 for update wait 10;
T1
----------
3
SQL> select * from maclean1 for update wait 10;
PROCESS A:
SQL> select * from maclean2 for update wait 10;
select * from maclean2 for update wait 10
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
Elapsed: 00:00:06.02
这一次我们指定 select for update wait 10即10s超时, 因为 10s大于当前的_enqueue_deadlock_time_sec设置值(9s),所以Process A就会做死锁检测。 但是可以看到这里死锁检测时间为6s 而不是我们指定_enqueue_deadlock_scan_secs的4s 。
通过反复的测试可以发现,实际的死锁检测时间总是_enqueue_deadlock_scan_secs指定的值向上取整为3的倍数。
总结:
enqueue lock队列锁的死锁检测遵循以下原则
1. 默认情况下死锁检测deadlock detection总是3s发生一次, 但是实际受到参数_enqueue_deadlock_scan_secs(deadlock scan interval)的控制,该参数默认为0,死锁检测时间总是_enqueue_deadlock_scan_secs指定的值向上取整为3的倍数, 当_enqueue_deadlock_scan_secs=0 则为3s一次, 当_enqueue_deadlock_scan_secs=4则为6s一次,依此类推。
2. 死锁检测还受到_enqueue_deadlock_time_sec(requests with timeout <= this will not have deadlock detection)的影响,若enqueue request timeout< _enqueue_deadlock_time_sec(默认值为5),则Server process不会做死锁检测。若enqueue request timeout>_enqueue_deadlock_time_sec 则会参考_enqueue_deadlock_scan_secs定期做死锁检测, 当然request timeout的指定不限于select for update wait [TIMEOUT]一种。
补充:
实际在10.2.0.1中尚未引入以上的这2个hidden parameter , 大约在patchset 10.2.0.3开始引入了 _enqueue_deadlock_time_sec, 在patchset 10.2.0.5中开始引入了_enqueue_deadlock_scan_secs。
有同学指出RAC中的死锁检测默认应当为10s, 这受到隐藏参数_lm_dd_interval(dd time interval in seconds) ,该参数最早在版本8.0.6中引入。 通过实验可以验证这位同学的说法,我也表示支持, 在10g中该参数默认为60s,从11g开始默认减少到10s。 需要注意的是在11g之前_lm_dd_interval这个参数指定值是不可信任的,这是因为在11g中对LMD进程的死锁检测进行了优化,原本在节点数量较多的RAC中LMD的Deadlock Detection可能消耗更多的CPU,以下是11g中Oracle内部研发Team测试的LMD在多节点情况下的CPU使用情况:
我们进一步来了解11g中LMD检测死锁的细节,这里我们会用到11g特有的 UTS TRACE 来获取更多 DD的信息:
SQL> select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production PL/SQL Release 11.2.0.3.0 - Production CORE 11.2.0.3.0 Production TNS for Linux: Version 11.2.0.3.0 - Production NLSRTL Version 11.2.0.3.0 - Production SQL> SQL> select * from global_name 2 ; GLOBAL_NAME -------------------------------------------------------------------------------- www.askmac.cn SQL> alter system set "_lm_dd_interval"=20 scope=spfile; System altered. SQL> startup force; ORACLE instance started. Total System Global Area 1570009088 bytes Fixed Size 2228704 bytes Variable Size 1325403680 bytes Database Buffers 234881024 bytes Redo Buffers 7495680 bytes Database mounted. Database opened. SQL> set linesize 140 pagesize 1400 SQL> show parameter lm_dd NAME TYPE VALUE ------------------------------------ -------------------------------- ------------------------------ _lm_dd_interval integer 20 SQL> select count(*) from gv$instance; COUNT(*) ---------- 2 instance 1: SQL> oradebug setorapid 12 Oracle pid: 12, Unix process pid: 8608, image: oracle@vrh2.oracle.com (LMD0) 对 LMD0进程做 UTS TRACE所有RAC相关的内部操作将无所遁形 SQL> oradebug event 10046 trace name context forever,level 8:10708 trace name context forever,level 103: trace[rac.*] disk high; Statement processed. Elapsed: 00:00:00.00 SQL> update maclean1 set t1=t1+1; 1 row updated. instance 2: SQL> update maclean2 set t1=t1+1; 1 row updated. SQL> update maclean1 set t1=t1+1; Instance 1: SQL> update maclean2 set t1=t1+1; update maclean2 set t1=t1+1 * ERROR at line 1: ORA-00060: deadlock detected while waiting for resource Elapsed: 00:00:20.51 LMD0进程的UTS TRACE 2012-06-12 22:27:00.929284 : [kjmpbmsg:process][type 22][msg 0x7fa620ac85a8][from 1][seq 8148.0][len 192] 2012-06-12 22:27:00.929346 : [kjmxmpm][type 22][seq 0.0][msg 0x7fa620ac85a8][from 1] *** 2012-06-12 22:27:00.929 * kjddind: received DDIND msg with subtype x6 * reqp->dd_master_inst_kjxmddi == 1 * kjddind: dump sgh: 2012-06-12 22:27:00.929346*: kjddind: req->timestamp [0.15], kjddt [0.13] 2012-06-12 22:27:00.929346*: >> DDmsg:KJX_DD_REMOTE,TS[0.15],Inst 1->2,ddxid[id1,id2,inst:2097153,31,1],ddlock[0x95023930,829],ddMasterInst 1 2012-06-12 22:27:00.929346*: lock [0x95023930,829], op = [mast] 2012-06-12 22:27:00.929346*: reqp->timestamp [0.15], kjddt [0.13] 2012-06-12 22:27:00.929346*: kjddind: updated local timestamp [0.15] * kjddind: case KJX_DD_REMOTE 2012-06-12 22:27:00.929346*: ADD IO NODE WFG: 0 frame pointer 2012-06-12 22:27:00.929346*: PUSH: type=res, enqueue(0xffffffff.0xffffffff)=0xbbb9af40, block=KJUSEREX, snode=1 2012-06-12 22:27:00.929346*: PROCESS: type=res, enqueue(0xffffffff.0xffffffff)=0xbbb9af40, block=KJUSEREX, snode=1 2012-06-12 22:27:00.929346*: POP: type=res, enqueue(0xffffffff.0xffffffff)=0xbbb9af40, block=KJUSEREX, snode=1 2012-06-12 22:27:00.929346*: kjddopr[TX 0xe000c.0x32][ext 0x5,0x0]: blocking lock 0xbbb9a800, owner 2097154 of inst 2 2012-06-12 22:27:00.929346*: PUSH: type=txn, enqueue(0xffffffff.0xffffffff)=0xbbb9a800, block=KJUSEREX, snode=1 2012-06-12 22:27:00.929346*: PROCESS: type=txn, enqueue(0xffffffff.0xffffffff)=0xbbb9a800, block=KJUSEREX, snode=1 2012-06-12 22:27:00.929346*: ADD NODE TO WFG: type=txn, enqueue(0xffffffff.0xffffffff)=0xbbb9a800, block=KJUSEREX, snode=1 2012-06-12 22:27:00.929346*: POP: type=txn, enqueue(0xffffffff.0xffffffff)=0xbbb9a800, block=KJUSEREX, snode=1 2012-06-12 22:27:00.929346*: kjddopt: converting lock 0xbbce92f8 on 'TX' 0x80016.0x5d4,txid [2097154,34]of inst 2 2012-06-12 22:27:00.929346*: PUSH: type=res, enqueue(0xffffffff.0xffffffff)=0xbbce92f8, block=KJUSEREX, snode=1 2012-06-12 22:27:00.929346*: PROCESS: type=res, enqueue(0xffffffff.0xffffffff)=0xbbce92f8, block=KJUSEREX, snode=1 2012-06-12 22:27:00.929346*: ADD NODE TO WFG: type=res, enqueue(0xffffffff.0xffffffff)=0xbbce92f8, block=KJUSEREX, snode=1 2012-06-12 22:27:00.929855 : GSIPC:AMBUF: rcv buff 0x7fa620aa8cd8, pool rcvbuf, rqlen 1102 2012-06-12 22:27:00.929878 : GSIPC:GPBMSG: new bmsg 0x7fa620aa8d48 mb 0x7fa620aa8cd8 msg 0x7fa620aa8d68 mlen 192 dest x100 flushsz -1 2012-06-12 22:27:00.929878*: << DDmsg:KJX_DD_REMOTE,TS[0.15],Inst 2->1,ddxid[id1,id2,inst:2097153,31,1],ddlock[0x95023930,829],ddMasterInst 1 2012-06-12 22:27:00.929878*: lock [0xbbce92f8,287], op = [mast] 2012-06-12 22:27:00.929878*: ADD IO NODE WFG: 0 frame pointer 2012-06-12 22:27:00.929923 : [kjmpbmsg:compl][msg 0x7fa620ac8588][typ p][nmsgs 1][qtime 0][ptime 0] 2012-06-12 22:27:00.929947 : GSIPC:PBAT: flush start. flag 0x79 end 0 inc 4.4 2012-06-12 22:27:00.929963 : GSIPC:PBAT: send bmsg 0x7fa620aa8d48 blen 224 dest 1.0 2012-06-12 22:27:00.929979 : GSIPC:SNDQ: enq msg 0x7fa620aa8d48, type 65521 seq 8325, inst 1, receiver 0, queued 1 012-06-12 22:27:00.929979 : GSIPC:SNDQ: enq msg 0x7fa620aa8d48, type 65521 seq 8325, inst 1, receiver 0, queued 1 2012-06-12 22:27:00.929996 : GSIPC:BSEND: flushing sndq 0xb491dd28, id 0, dcx 0xbc517770, inst 1, rcvr 0 qlen 0 1 2012-06-12 22:27:00.930014 : GSIPC:BSEND: no batch1 msg 0x7fa620aa8d48 type 65521 len 224 dest (1:0) 2012-06-12 22:27:00.930088 : kjbsentscn[0x0.3f72dc][to 1] 2012-06-12 22:27:00.930144 : GSIPC:SENDM: send msg 0x7fa620aa8d48 dest x10000 seq 8325 type 65521 tkts x1 mlen xe00110 2012-06-12 22:27:00.930531 : GSIPC:KSXPCB: msg 0x7fa620aa8d48 status 30, type 65521, dest 1, rcvr 0 WAIT #0: nam='ges remote message' ela= 1372 waittime=80 loop=0 p3=74 obj#=-1 tim=1339554420931640 2012-06-12 22:27:00.931728 : GSIPC:RCVD: ksxp msg 0x7fa620af6490 sndr 1 seq 0.8149 type 65521 tkts 1 2012-06-12 22:27:00.931746 : GSIPC:RCVD: watq msg 0x7fa620af6490 sndr 1, seq 8149, type 65521, tkts 1 2012-06-12 22:27:00.931763 : GSIPC:RCVD: seq update (0.8148)->(0.8149) tp -15 fg 0x4 from 1 pbattr 0x0 2012-06-12 22:27:00.931779 : GSIPC:TKT: collect msg 0x7fa620af6490 from 1 for rcvr 0, tickets 1 2012-06-12 22:27:00.931794 : kjbrcvdscn[0x0.3f72dc][from 1][idx 2012-06-12 22:27:00.931810 : kjbrcvdscn[no bscn dd_master_inst_kjxmddi == 1 * kjddind: dump sgh: NXTIN (nil) 0 wq 0 cvtops x0 0x0.0x0(ext 0x0,0x0)[0000-0000-00000000] inst 1 BLOCKER 0xbbb9a800 5 wq 1 cvtops x28 TX 0xe000c.0x32(ext 0x5,0x0)[20000-0002-00000022] inst 2 BLOCKED 0xbbce92f8 5 wq 2 cvtops x1 TX 0x80016.0x5d4(ext 0x2,0x0)[20000-0002-00000022] inst 2 NXTOUT (nil) 0 wq 0 cvtops x0 0x0.0x0(ext 0x0,0x0)[0000-0000-00000000] inst 1 2012-06-12 22:27:00.932058*: kjddind: req->timestamp [0.15], kjddt [0.15] 2012-06-12 22:27:00.932058*: >> DDmsg:KJX_DD_VALIDATE,TS[0.15],Inst 1->2,ddxid[id1,id2,inst:2097153,31,1],ddlock[0x95023930,829],ddMasterInst 1 2012-06-12 22:27:00.932058*: lock [(nil),0], op = [vald_dd] 2012-06-12 22:27:00.932058*: kjddind: updated local timestamp [0.15] * kjddind: case KJX_DD_VALIDATE *** 2012-06-12 22:27:00.932 * kjddvald called: kjxmddi stuff: * cont_lockp (nil) * dd_lockp 0x95023930 * dd_inst 1 * dd_master_inst 1 * sgh graph: NXTIN (nil) 0 wq 0 cvtops x0 0x0.0x0(ext 0x0,0x0)[0000-0000-00000000] inst 1 BLOCKER 0xbbb9a800 5 wq 1 cvtops x28 TX 0xe000c.0x32(ext 0x5,0x0)[20000-0002-00000022] inst 2 BLOCKED 0xbbce92f8 5 wq 2 cvtops x1 TX 0x80016.0x5d4(ext 0x2,0x0)[20000-0002-00000022] inst 2 NXTOUT (nil) 0 wq 0 cvtops x0 0x0.0x0(ext 0x0,0x0)[0000-0000-00000000] inst 1 POP WFG NODE: lock=(nil) * kjddvald: dump the PRQ: BLOCKER 0xbbb9a800 5 wq 1 cvtops x28 TX 0xe000c.0x32(ext 0x5,0x0)[20000-0002-00000022] inst 2 BLOCKED 0xbbce92f8 5 wq 2 cvtops x1 TX 0x80016.0x5d4(ext 0x2,0x0)[20000-0002-00000022] inst 2 * kjddvald: KJDD_NXTONOD ->node_kjddsg.dinst_kjddnd =1 * kjddvald: ... which is not my node, my subgraph is validated but the cycle is not complete Global blockers dump start:--------------------------------- DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x80016][0x5d4],[TX][ext 0x2,0x0] 发现dead lock!!!
可以看到在以上11.2.0.3的测试中 RAC LMD的死锁检测时间确实受到”_lm_dd_interval”参数的影响变成了指定值的20s。 需要注意的是在10g中_lm_dd_interval默认为60s,在节点较少且Processes数不多的环境中这个参数往往是可信的,但是如果节点很多且有大量Server Process则死锁未必能在这60s内检测到。在11g中对该参数(实际优化的是LMD的死锁检测算法)做了充分的优化,一般可以信任其在默认的10s内完成 Enqueue Deadlock Detection。
在11g中控制 RAC中LMD进程检测死锁的hidden parameter 还不仅于”_lm_dd_interval”这一个,RAC的内部机制永远是一个有意思的话题,但是今天我们就聊到这里:
SQL> col name for a50 SQL> col describ for a60 SQL> col value for a20 SQL> set linesize 140 pagesize 1400 SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ 2 FROM SYS.x$ksppi x, SYS.x$ksppcv y 3 WHERE x.inst_id = USERENV ('Instance') 4 AND y.inst_id = USERENV ('Instance') 5 AND x.indx = y.indx 6 AND x.ksppinm like '_lm_dd%'; NAME VALUE DESCRIB -------------------------------------------------- -------------------- ------------------------------------------------------------ _lm_dd_interval 20 dd time interval in seconds _lm_dd_scan_interval 5 dd scan interval in seconds _lm_dd_search_cnt 3 number of dd search per token get _lm_dd_max_search_time 180 max dd search time per token _lm_dd_maxdump 50 max number of locks to be dumped during dd validation _lm_dd_ignore_nodd FALSE if TRUE nodeadlockwait/nodeadlockblock options are ignored 6 rows selected.