在之前的文章《latch free:cache buffer handles造成的SQL性能问题》中我介绍了cache buffer handle latch的一些知识,在这里我们复习一下:
“当会话需要pin住buffer header时它首先要获去buffer handle,得到buffer handle的过程中首先要抢占cache buffer handles栓,为了避免对于cache buffer handles栓的过度争用,每个会话被允许cache一小撮buffer handles,也叫保留集(reserved set)。该保留集的上限由隐式参数_db_handles_cached(默认为5)所控制,在此基础上会话在执行不是十分复杂的SQL时不必反复申请栓。
同时存在一些十分复杂的语句,需要同时pin住大量的缓存,此时若存在空闲的handle,会话被允许索要手头5个cached buffer handles以外的handle。也为了限制单个会话的资源占用量,Oracle通过内部算法总是尽量让每个进程能够pin住”合理份额“的buffer,由此单个会话同一时间所能持有的总handle数等于db_block_buffers/processes,隐式参数_cursor_db_buffers_pinned指出了该值。另cache buffer handles并没有子栓,仅有父栓本身,所以如果系统中有大量复杂SQL,会话所能cache的buffer handles远不足以满足需求时,该栓就成为整个系统性能的瓶颈,甚至可能出现潜在的死锁/饿死问题。”
应网友在T.askmac.cn上的提问,我们介绍更多cache Buffer handle相关的知识:
cache buffer handle 结构:
------------------------------ | Buffer state object | ------------------------------ | Place to hang the buffer | ------------------------------ | Consistent Get? | ------------------------------ | Proc Owning SO | ------------------------------ | Flags(RIR) | ------------------------------
来看一个 cache buffer handle
SO: 70000046fdfe530, type: 24, owner: 70000041b018630, flag: INIT/-/-/0x00
(buffer) (CR) PR: 70000048e92d148 FLG: 0x500000
lock rls: 0, class bit: 0
kcbbfbp: [BH: 7000001c7f069b0, LINK: 70000046fdfe570]
where: kdswh02: kdsgrp, why: 0
BH (7000001c7f069b0) file#: 12 rdba: 0x03061612 (12/398866) class: 1 ba: 7000001c70ee000
set: 75 blksize: 8192 bsi: 0 set-flg: 0 pwbcnt: 0
dbwrid: 2 obj: 66209 objn: 48710 tsn: 6 afn: 12
hash: [700000485f12138,700000485f12138] lru: [70000025af67790,700000132f69ee0]
lru-flags: hot_buffer
ckptq: [NULL] fileq: [NULL] objq: [700000114f5dd10,70000028bf5d620]
use: [70000046fdfe570,70000046fdfe570] wait: [NULL]
st: SCURRENT md: SHR tch: 0
flags: affinity_lock
LRBA: [0x0.0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
where: kdswh02: kdsgrp, why: 0
# Example:
# (buffer) (CR) PR: 37290 FLG: 0
# kcbbfbp : [BH: befd8, LINK: 7836c] (WAITING)
Buffer handle (X$KCBBF) kernel cache, buffer buffer_handles
Query x$kcbbf – lists all the buffer handles
相关参数
_db_handles System-wide simultaneous buffer operations ,no of buffer handles
_db_handles_cached Buffer handles cached each process , no of processes default 5
_cursor_db_buffers_pinned additional number of buffers a cursor can pin at once
_session_kept_cursor_pins Number of cursors pins to keep in a session
When a buffer is pinned it is attached to buffer state object.
接下来 我们实际体验一下 cache buffer handles latch 和 buffer pin的影响:
SESSION A : SQL> select * from v$version; BANNER ---------------------------------------------------------------- Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi PL/SQL Release 10.2.0.5.0 - Production CORE 10.2.0.5.0 Production TNS for Linux: Version 10.2.0.5.0 - Production NLSRTL Version 10.2.0.5.0 - Production SQL> create table test_cbc_handle(t1 int); Table created. SQL> insert into test_cbc_handle values(1); 1 row created. SQL> commit; Commit complete. SQL> select rowid from test_cbc_handle; ROWID ------------------ AAANO6AABAAAQZSAAA SQL> select * from test_cbc_handle where rowid='AAANO6AABAAAQZSAAA'; T1 ---------- 1 SQL> select addr,name from v$latch_parent where name='cache buffer handles'; ADDR NAME ---------------- -------------------------------------------------- 00000000600140A8 cache buffer handles SQL> select to_number('00000000600140A8','xxxxxxxxxxxxxxxxxxxx') from dual; TO_NUMBER('00000000600140A8','XXXXXXXXXXXXXXXXXXXX') ---------------------------------------------------- 1610694824 注意cache buffer handles只有一个parent latch 而没有 child latch 我们让SESSION A hold 住唯一的一个cache buffer handles parent latch 这会用到 oradebug call kslgetl 命令, kslgetl是oracle内部get latch的函数 SQL> oradebug setmypid; Statement processed. SQL> oradebug call kslgetl 1610694824 1; Function returned 1 另外开一个SESSION B 来观察: SQL> select * from v$latchholder; PID SID LADDR NAME GETS ---------- ---------- ---------------- ---------------------------------------------------------------- ---------- 15 141 00000000600140A8 cache buffer handles 119 cache buffer handles latch 确实被session A hold住了,这将阻塞其他进程acquire cache buffer handle latch SQL> select * from test_cbc_handle where rowid='AAANO6AABAAAQZSAAA'; T1 ---------- 1 但此时其他Server Process还是可以正常 read buffer, 这是因为隐藏参数"_db_handles_cached", 默认process会cache 5个 cache buffer handle 修改"_db_handles_cached"=0,让process不再拥有5个缓存的cache buffer handle , 这将导致 process 若需要pin buffer,则必须hold cache buffer handle latch以便分配一个cache buffer handle SQL> alter system set "_db_handles_cached"=0 scope=spfile; System altered. 并重启实例 shutdown immediate; startup; session A: SQL> oradebug setmypid; Statement processed. SQL> oradebug call kslgetl 1610694824 1; Function returned 1 session B: select * from test_cbc_handle where rowid='AAANO6AABAAAQZSAAA'; session B hang!! WHY? SQL> oradebug setmypid; Statement processed. SQL> oradebug dump systemstate 266; Statement processed. SO: 0x11b30b7b0, type: 2, owner: (nil), flag: INIT/-/-/0x00 (process) Oracle pid=22, calls cur/top: (nil)/0x11b453c38, flag: (0) - int error: 0, call error: 0, sess error: 0, txn error 0 (post info) last post received: 0 0 0 last post received-location: No post last process to post me: none last post sent: 0 0 0 last post sent-location: No post last process posted by me: none (latch info) wait_event=0 bits=8 holding (efd=4) 600140a8 cache buffer handles level=3 SO: 0x11b305810, type: 2, owner: (nil), flag: INIT/-/-/0x00 (process) Oracle pid=10, calls cur/top: 0x11b455ac0/0x11b450a58, flag: (0) - int error: 0, call error: 0, sess error: 0, txn error 0 (post info) last post received: 0 0 0 last post received-location: No post last process to post me: none last post sent: 0 0 0 last post sent-location: No post last process posted by me: none (latch info) wait_event=0 bits=2 Location from where call was made: kcbzgs: waiting for 600140a8 cache buffer handles level=3 FBD93353:000019F0 10 162 10005 1 KSL WAIT BEG [latch: cache buffer handles] 1610694824/0x600140a8 125/0x7d 0/0x0 FF936584:00002761 10 144 10005 1 KSL WAIT BEG [latch: cache buffer handles] 1610694824/0x600140a8 125/0x7d 0/0x0 PID=22 holding 持有cache buffer handles latch PID=10 等待 cache buffer handles latch, 这是因为"_db_handles_cached"=0 导致 process自身没有缓存cache buffer handles 这个systemstate中找不到 kcbbfbp cache buffer handle对象, 因为 "_db_handles_cached"=0 且 cache buffer handles latch被hold 住了 我们释放cache buffer handles latch , 来观察 buffer被pin住而不是释放时的情境 session A exit session B: SQL> select * from v$latchholder; no rows selected SQL> insert into test_cbc_handle values(2); 1 row created. SQL> commit; Commit complete. SQL> SQL> select t1,rowid from test_cbc_handle; T1 ROWID ---------- ------------------ 1 AAANPAAABAAAQZSAAA 2 AAANPAAABAAAQZSAAB SQL> select spid,pid from v$process where addr = ( select paddr from v$session where sid=(select distinct sid from v$mystat)); SPID PID ------------ ---------- 19251 10 用 GDB 对 SPID=19215 做debug , 使用 kcbrls 函数作为breakpoint 这会让 进程无法release buffer [oracle@vrh8 ~]$ gdb $ORACLE_HOME/bin/oracle 19251 GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-37.el5) Copyright (C) 2009 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu". For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>... Reading symbols from /s01/oracle/product/10.2.0.5/db_1/bin/oracle...(no debugging symbols found)...done. Attaching to program: /s01/oracle/product/10.2.0.5/db_1/bin/oracle, process 19251 Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libskgxp10.so...(no debugging symbols found)...done. Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libskgxp10.so Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libhasgen10.so...(no debugging symbols found)...done. Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libhasgen10.so Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libskgxn2.so...(no debugging symbols found)...done. Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libskgxn2.so Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libocr10.so...(no debugging symbols found)...done. Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libocr10.so Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libocrb10.so...(no debugging symbols found)...done. Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libocrb10.so Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libocrutl10.so...(no debugging symbols found)...done. Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libocrutl10.so Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libjox10.so...(no debugging symbols found)...done. Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libjox10.so Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libclsra10.so...(no debugging symbols found)...done. Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libclsra10.so Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libdbcfg10.so...(no debugging symbols found)...done. Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libdbcfg10.so Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libnnz10.so...(no debugging symbols found)...done. Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libnnz10.so Reading symbols from /usr/lib64/libaio.so.1...(no debugging symbols found)...done. Loaded symbols for /usr/lib64/libaio.so.1 Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done. Loaded symbols for /lib64/libdl.so.2 Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done. Loaded symbols for /lib64/libm.so.6 Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done. [Thread debugging using libthread_db enabled] Loaded symbols for /lib64/libpthread.so.0 Reading symbols from /lib64/libnsl.so.1...(no debugging symbols found)...done. Loaded symbols for /lib64/libnsl.so.1 Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done. Loaded symbols for /lib64/libc.so.6 Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done. Loaded symbols for /lib64/libnss_files.so.2 0x00000035c000d940 in __read_nocancel () from /lib64/libpthread.so.0 (gdb) break kcbrls Breakpoint 1 at 0x10e5d24 session B: select * from test_cbc_handle where rowid='AAANPAAABAAAQZSAAA'; select hang !! GDB (gdb) c Continuing. Breakpoint 1, 0x00000000010e5d24 in kcbrls () (gdb) bt #0 0x00000000010e5d24 in kcbrls () #1 0x0000000002e87d25 in qertbFetchByUserRowID () #2 0x00000000030c62b8 in opifch2 () #3 0x00000000032327f0 in kpoal8 () #4 0x00000000013b7c10 in opiodr () #5 0x0000000003c3c9da in ttcpip () #6 0x00000000013b3144 in opitsk () #7 0x00000000013b60ec in opiino () #8 0x00000000013b7c10 in opiodr () #9 0x00000000013a92f8 in opidrv () #10 0x0000000001fa3936 in sou2o () #11 0x000000000072d40b in opimai_real () #12 0x000000000072d35c in main () SQL> oradebug setmypid; Statement processed. SQL> oradebug dump systemstate 266; Statement processed. 此时可以看到 kcbbfbp buffer cache handle 将 SO state object 和 BH BUFFER HEADER link在一起 ---------------------------------------- SO: 0x11b452348, type: 3, owner: 0x11b305810, flag: INIT/-/-/0x00 (call) sess: cur 11b41bd18, rec 0, usr 11b41bd18; depth: 0 ---------------------------------------- SO: 0x1182dc750, type: 24, owner: 0x11b452348, flag: INIT/-/-/0x00 (buffer) (CR) PR: 0x11b305810 FLG: 0x108000 class bit: (nil) kcbbfbp: [BH: 0xf2fc69f8, LINK: 0x1182dc790] where: kdswh05: kdsgrp, why: 0 BH (0xf2fc69f8) file#: 1 rdba: 0x00410652 (1/67154) class: 1 ba: 0xf297c000 set: 3 blksize: 8192 bsi: 0 set-flg: 2 pwbcnt: 272 dbwrid: 0 obj: 54208 objn: 54202 tsn: 0 afn: 1 hash: [f2fc47f8,1181f3038] lru: [f2fc6b88,f2fc6968] obj-flags: object_ckpt_list ckptq: [1182ecf38,1182ecf38] fileq: [1182ecf58,1182ecf58] objq: [108712a28,108712a28] use: [1182dc790,1182dc790] wait: [NULL] st: XCURRENT md: SHR tch: 12 flags: buffer_dirty gotten_in_current_mode block_written_once redo_since_read LRBA: [0xc7.73b.0] HSCN: [0x0.1cbe52] HSUB: [1] Using State Objects ---------------------------------------- SO: 0x1182dc750, type: 24, owner: 0x11b452348, flag: INIT/-/-/0x00 (buffer) (CR) PR: 0x11b305810 FLG: 0x108000 class bit: (nil) kcbbfbp: [BH: 0xf2fc69f8, LINK: 0x1182dc790] where: kdswh05: kdsgrp, why: 0 buffer tsn: 0 rdba: 0x00410652 (1/67154) scn: 0x0000.001cbe52 seq: 0x01 flg: 0x02 tail: 0xbe520601 frmt: 0x02 chkval: 0x0000 type: 0x06=trans data tab 0, row 0, @0x1f9a tl: 6 fb: --H-FL-- lb: 0x0 cc: 1 col 0: [ 2] c1 02 tab 0, row 1, @0x1f94 tl: 6 fb: --H-FL-- lb: 0x2 cc: 1 col 0: [ 2] c1 15 end_of_block_dump (buffer) (CR) PR: 0x11b305810 FLG: 0x108000 st: XCURRENT md: SHR tch: 12 该 buffer header的status= XCURRENT mode=KCBMSHARE KCBMSHR current share 也可以通过 x$kcbbf 找到相关的 cache buffer handle SQL> select distinct KCBBPBH from x$kcbbf ; KCBBPBH ---------------- 00 00000000F2FC69F8 ==>0xf2fc69f8 SQL> select * from x$kcbbf where kcbbpbh='00000000F2FC69F8'; ADDR INDX INST_ID KCBBFSO_TYP KCBBFSO_FLG KCBBFSO_OWN ---------------- ---------- ---------- ----------- ----------- ---------------- KCBBFFLG KCBBFCR KCBBFCM KCBBFMBR KCBBPBH ---------- ---------- ---------- ---------------- ---------------- KCBBPBF X0KCBBPBH X0KCBBPBF X1KCBBPBH ---------------- ---------------- ---------------- ---------------- X1KCBBPBF KCBBFBH KCBBFWHR KCBBFWHY ---------------- ---------------- ---------- ---------- 00000001182DC750 748 1 24 1 000000011B452348 1081344 1 0 00 00000000F2FC69F8 00000001182DC750 00 00000001182DC750 00 00000001182DC7F8 00 583 0 SQL> desc x$kcbbf; Name Null? Type ----------------------------------------- -------- ---------------------------- ADDR RAW(8) INDX NUMBER INST_ID NUMBER KCBBFSO_TYP NUMBER KCBBFSO_FLG NUMBER KCBBFSO_OWN RAW(8) KCBBFFLG NUMBER KCBBFCR NUMBER KCBBFCM NUMBER KCBBFMBR RAW(8) KCBBPBH RAW(8) KCBBPBF RAW(8) X0KCBBPBH RAW(8) X0KCBBPBF RAW(8) X1KCBBPBH RAW(8) X1KCBBPBF RAW(8) KCBBFBH RAW(8) KCBBFWHR NUMBER KCBBFWHY NUMBER gdb 退出 让process能够顺利完成kcbrls release buffer后 相关的cache buffer handle被清理 SQL> select distinct KCBBPBH from x$kcbbf ; KCBBPBH ---------------- 00