问题来源于 网友在Ask Maclean Home 上关于 RAC 中 Past Image PI块的提问, 网友提出了一个演示,我们在11.2.0.3 2 Node RAC的环境中重现这个实验:
SQL> select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production PL/SQL Release 11.2.0.3.0 - Production CORE 11.2.0.3.0 Production TNS for Linux: Version 11.2.0.3.0 - Production NLSRTL Version 11.2.0.3.0 - Production SQL> select * from global_name; GLOBAL_NAME -------------------------------------------------------------------------------- www.askmac.cn SQL> drop table test purge; Table dropped. SQL> alter system flush buffer_cache; System altered. SQL> create table test(id number); insert into test values(1); insert into test values(2); commit; /* 我们利用 rowid定位TEST表仅有的2行数据的数据块位置 */ select dbms_rowid.rowid_block_number(rowid),dbms_rowid.rowid_relative_fno(rowid) from test; DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID) ------------------------------------ ------------------------------------ 89233 1 89233 1 SQL> alter system flush buffer_cache; System altered. Instance 1 Session A 执行UPDATE操作: SQL> update test set id=id+1 where id=1; 1 row updated. Instance 1 Session B 查询x$BH buffer header视图 了解 相关Buffer的状态 SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 1 0 3 1227595
X$BH 视图的 STATE字段代表Buffer的状态, 以下是状态列表:
STATE NUMBER KCBBHFREE 0 buffer free KCBBHEXLCUR 1 buffer current (and if DFS locked X) KCBBHSHRCUR 2 buffer current (and if DFS locked S) KCBBHCR 3 buffer consistant read KCBBHREADING 4 Being read KCBBHMRECOVERY 5 media recovery (current & special) KCBBHIRECOVERY 6 Instance recovery (somewhat special)
这个演示中我们需要用到的是 : state =1 Xcurrent 、 state=2 Scurrent 、 state=3 CR
接着在 Instance 2 更新 同一个数据块内的另一条记录 ,这回引发 gc current block 2 way 并将Current Block 传输到 Instance 2, 同时 Instance 1 的原”Current Block” Convert 成 Past Image:
Instance 2 Session C SQL> update test set id=id+1 where id=2; 1 row updated. Instance 2 Session D SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 1 0 3 1227641 3 1227638
STATE =1 的Xcurrent block已传输到 Instance 2 , 再来看 Instance 1 此时的 GC状态:
Instance 1 Session B SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 3 1227641 3 1227638 8 0 3 1227595
问题出现在这里, 当网友再次在Instance 1上的session A中执行对TEST表的SELECT查询后 ,发现原来的 3个 State=3的CR 块 数量减少到了1个:
Instance 1 session A 即最初执行UPDATE的 session SQL> alter session set events '10046 trace name context forever,level 8:10708 trace name context forever,level 103: trace[rac.*] disk high'; Session altered. SQL> select * from test; ID ---------- 2 2 select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 3 1227716 3 1227713 8 0
网友在原帖中是通过v$BH 视图观察CR块的数量,发现在执行SELECT查询后 CR块数量反而减少了,故而产生了疑问。 我们在以上演示中直接 观察X$BH视图可以发现 , 原本的三个CR块的SCN Version分别为: 1227641、1227638、1227595, 在SELECT查询完成后被 2个不同SCN version的CR块 1227716和 1227713 所替换, Oracle为什么要这样做呢?
所幸我们在实际执行SELECT查询前设置了event 10708和 rac.*的诊断TRACE,我们先来看看TRACE内容:
PARSING IN CURSOR #140444679938584 len=337 dep=1 uid=0 oct=3 lid=0 tim=1335698913632292 hv=3345277572 ad='bc0e68c8' sqlid='baj7tjm3q9sn4' SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0) FROM (SELECT /*+ NO_PARALLEL("TEST") FULL("TEST") NO_PARALLEL_INDEX("TEST") */ 1 AS C1, 1 AS C2 FROM "SYS"."TEST" "TEST") SAMPLESUB END OF STMT PARSE #140444679938584:c=1000,e=27630,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=1950795681,tim=1335698913632252 EXEC #140444679938584:c=0,e=44,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,plh=1950795681,tim=1335698913632390 *** 2012-04-29 07:28:33.632 kclscrs: req=0 block=1/89233 *** 2012-04-29 07:28:33.632 kclscrs: bid=1:3:1:0:7:80:1:0:4:0:0:0:1:2:4:1:26:0:0:0:70:1a:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:3f:0:1c:86:2d:4:0:0:0:0:a2:3c:7c:b:70:1a:0:0:0:0:1:0:7a:f8:76:1d:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:63:e5:0:0:0:0:0:0:10:0:0:0 2012-04-29 07:28:33.632578 : kjbcrc[0x15c91.1 76896.0][9] 2012-04-29 07:28:33.632616 : GSIPC:GMBQ: buff 0xba1e8f90, queue 0xbb79f278, pool 0x60013fa0, freeq 1, nxt 0xbb79f278, prv 0xbb79f278 2012-04-29 07:28:33.632634 : kjbmscrc(0x15c91.1)seq 0x2 reqid=0x1c(shadow 0xb4bb4458,reqid x1c)mas@2(infosz 200)(direct 1) 2012-04-29 07:28:33.632654 : kjbsentscn[0x0.12bbc1][to 2] 2012-04-29 07:28:33.632669 : GSIPC:SENDM: send msg 0xba1e9000 dest x20001 seq 24026 type 32 tkts xff0000 mlen x17001a0 2012-04-29 07:28:33.633385 : GSIPC:KSXPCB: msg 0xba1e9000 status 30, type 32, dest 2, rcvr 1 *** 2012-04-29 07:28:33.633 kclwcrs: wait=0 tm=689 *** 2012-04-29 07:28:33.633 kclwcrs: got 1 blocks from ksxprcv WAIT #140444679938584: nam='gc cr block 2-way' ela= 689 p1=1 p2=89233 p3=1 obj#=76896 tim=1335698913633418 2012-04-29 07:28:33.633490 : kjbcrcomplete[0x15c91.1 76896.0][0] 2012-04-29 07:28:33.633510 : kjbrcvdscn[0x0.12bbc1][from 2][idx 2012-04-29 07:28:33.633527 : kjbrcvdscn[no bscn <= rscn 0x0.12bbc1][from 2] *** 2012-04-29 07:28:33.633 kclwcrs: req=0 typ=cr(2) wtyp=2hop tm=689
通过TRACE不难发现 因为之前没有收集过TEST表的统计信息, 所以这里出发了Dynamic Sampling的动态采样,这本身会引发对TEST表的 CR读请求,实际产生了一次’gc cr block 2-way’ 等待:
2012-04-29 07:28:33.632654 : kjbsentscn[0x0.12bbc1][to 2]
12bbc1= 1227713 与上述X$BH中的一个CR块对应,kjbsentscn[0x0.12bbc1][to 2] 可以理解为 向 Instance 2 发送了SCN=12bbc1=1227713 DBA=0x15c91.1 76896.0 的 CR Request(obj#=76896)
之后kjbrcvdscn函数确认了 [no bscn <= rscn 0x0.12bbc1][from 2] ,即没有 比已receive的 SCN Version =12bbc1 更好的Best Version
CR Server Arch
动态采样完成后才真正执行了用户发出的SELECT语句:
PARSING IN CURSOR #140444682869592 len=18 dep=0 uid=0 oct=3 lid=0 tim=1335698913635874 hv=1689401402 ad='b1a188f0' sqlid='c99yw1xkb4f1u' select * from test END OF STMT PARSE #140444682869592:c=4999,e=34017,p=0,cr=7,cu=0,mis=1,r=0,dep=0,og=1,plh=1357081020,tim=1335698913635870 EXEC #140444682869592:c=0,e=23,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335698913635939 WAIT #140444682869592: nam='SQL*Net message to client' ela= 7 driver id=1650815232 #bytes=1 p3=0 obj#=0 tim=1335698913636071 *** 2012-04-29 07:28:33.636 kclscrs: req=0 block=1/89233 *** 2012-04-29 07:28:33.636 kclscrs: bid=1:3:1:0:7:83:1:0:4:0:0:0:1:2:4:1:26:0:0:0:70:1a:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:2:0:1c:86:2d:4:0:0:0:0:a2:3c:7c:b:70:1a:0:0:0:0:1:0:7d:f8:76:1d:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:63:e5:0:0:0:0:0:0:10:0:0:0 2012-04-29 07:28:33.636209 : kjbcrc[0x15c91.1 76896.0][9] 2012-04-29 07:28:33.636228 : GSIPC:GMBQ: buff 0xba0e5d50, queue 0xbb79f278, pool 0x60013fa0, freeq 1, nxt 0xbb79f278, prv 0xbb79f278 2012-04-29 07:28:33.636244 : kjbmscrc(0x15c91.1)seq 0x3 reqid=0x1d(shadow 0xb4bb4458,reqid x1d)mas@2(infosz 200)(direct 1) 2012-04-29 07:28:33.636252 : kjbsentscn[0x0.12bbc4][to 2] 2012-04-29 07:28:33.636358 : GSIPC:SENDM: send msg 0xba0e5dc0 dest x20001 seq 24029 type 32 tkts xff0000 mlen x17001a0 2012-04-29 07:28:33.636861 : GSIPC:KSXPCB: msg 0xba0e5dc0 status 30, type 32, dest 2, rcvr 1 *** 2012-04-29 07:28:33.637 kclwcrs: wait=0 tm=865 *** 2012-04-29 07:28:33.637 kclwcrs: got 1 blocks from ksxprcv WAIT #140444682869592: nam='gc cr block 2-way' ela= 865 p1=1 p2=89233 p3=1 obj#=76896 tim=1335698913637294 2012-04-29 07:28:33.637356 : kjbcrcomplete[0x15c91.1 76896.0][0] 2012-04-29 07:28:33.637374 : kjbrcvdscn[0x0.12bbc4][from 2][idx 2012-04-29 07:28:33.637389 : kjbrcvdscn[no bscn <= rscn 0x0.12bbc4][from 2] *** 2012-04-29 07:28:33.637 kclwcrs: req=0 typ=cr(2) wtyp=2hop tm=865
类似的, “SELECT * FROM TEST”也引发了一次’gc cr block 2-way’等待:
2012-04-29 07:28:33.637374 : kjbrcvdscn[0x0.12bbc4][from 2][idx 2012-04-29 07:28:33.637389 : kjbrcvdscn[no bscn
最后Foreground Process从 Remote LMS哪里got的是 SCN=1227716 Version的CR, 同样与 之前我们从X$BH 视图查到的scn对应。
这样就可以解释为什么Instance 1上出现了2个SCN更大的CR块, 但仍无法解释原来存在于Instance 1 Buffer Cache中的 三个SCN Version 较小的CR 块消失的原因。
我们来看下面的演示:
SQL> alter system set "_enable_minscn_cr"=false scope=spfile; System altered. SQL> alter system set "_db_block_max_cr_dba"=20 scope=spfile; System altered. SQL> startup force; ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance ORACLE instance started. Total System Global Area 1570009088 bytes Fixed Size 2228704 bytes Variable Size 989859360 bytes Database Buffers 570425344 bytes Redo Buffers 7495680 bytes Database mounted. Database opened.
设置以上 “_enable_minscn_cr”=false 和 “_db_block_max_cr_dba”=20 并重启RAC所有实例, 重现上述演示:
在Instance 2 Session C 中update更新一次数据块 就对应地在Instance 1 中查询一次 ,以反复在Instance 1中Request CR SQL> update test set id=id+1 where id=2; -- Instance 2 1 row updated. SQL> select * From test; -- Instance 1 ID ---------- 1 2 下面为 Instance 1的 X$BH记录 select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 3 1273080 3 1273071 3 1273041 3 1273039 8 0 SQL> update test set id=id+1 where id=3; 1 row updated. SQL> select * From test; ID ---------- 1 2 SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 3 1273091 3 1273080 3 1273071 3 1273041 3 1273039 8 0 ................... SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 3 1273793 3 1273782 3 1273780 3 1273769 3 1273734 3 1273715 3 1273691 3 1273679 3 1273670 3 1273643 3 1273635 3 1273623 3 1273106 3 1273091 3 1273080 3 1273071 3 1273041 3 1273039 3 1273033 19 rows selected. SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 3 1274993
如上述演示 在设置了 “_enable_minscn_cr”(enable/disable minscn optimization for CR)=false 和 “_db_block_max_cr_dba”=20 (Maximum Allowed Number of CR buffers per dba) 2个 参数后 最多的时候 Instance 1 中缓存了同一个数据块的 多达 19个版本的CR块。
“_enable_minscn_cr”是11g以后出现的新隐藏参数,它控制Oracle是否计算CR块的最小SCN,当Foreground Process Receive接收到同一个数据块的更新(SCN更大)的SCN Version CR Block时可能会清除CBC上的 SCN较小的、旧的CR块 , 这样做的目的是减少Buffer Cache中同一个数据块 不同版本SCN Version的CR块的数量,注意不管是 语句级别或者事务级别 其所要求的Snap_Scn 快照 SCN总是 语句或事务开始时的Current SCN, 保留一些旧的CR块虽然可能对一些持续时间长的查询或者游标有益, 但是实例Buffer Cache中 同一个数据块的多版本 CR块的总数量是有限的, 这个总数受到 “_db_block_max_cr_dba” 隐藏参数的控制, 如我们上述演示中设置为20 ,则最多可以在Buffer Cache中缓存多大19个版本的CR块; 注意该”_db_block_max_cr_dba” 参数的默认值为6 , 即一个实例Buffer cache中同一个数据块的CR 版本同时不多于6个。
引入”_enable_minscn_cr” 优化CR的最小MINSCN 是有其理由的, 即便那些版本较旧的CR块被新陈代谢了, 但只要需要 Foreground Process还是可以通过CR Request , 要求 Holder Instance LMS 去build一个 BEST CR 来用, 这不消我们去担心。