还原真实的cache recovery

我们在学习Oracle基础知识的时候会了解到实例恢复(Instance Recovery)或者说崩溃恢复(Crash recovery)的概念,有时候甚至于这2个名词在我们日常的语言中表达同样的意思。实际上Instance Recovery与Crash Recovery是存在区别的:针对单实例(single instance)或者RAC中所有节点全部崩溃后的恢复,我们称之为Crash Recovery。而对于RAC中的某一个节点失败,存活节点(surviving instance)试图对失败节点线程上redo做应用的情况,我们称之为Instance Recovery。

不管是Instance Recovery还是Crash Recovery,都由2个部分组成:cache recovery继以transaction recovery。

根据官方文档的介绍,Cache Recovery也叫Rolling Forward,就是我们常说的前滚;而Transaction Recovery也叫Rolling Back,就是我们常说的回滚。前滚和回滚贯穿Oracle恢复的基本概念,是我们入门必要学习的知识,在次不多介绍。

有文事者,必济之以武略。理论学得再好,不实践也无用。所幸Crash Recovery是很容易做成的实验,我们不妨来看一看:

SQL> shutdown abort;
ORACLE instance shut down.

SQL> startup mount;
ORACLE instance started.

Total System Global Area 1065353216 bytes
Fixed Size                  2089336 bytes
Variable Size             486542984 bytes
Database Buffers          570425344 bytes
Redo Buffers                6295552 bytes
Database mounted.

SQL> alter database open;

Crash Recovery将从alter database open开始,我们来观察其日志

==================alert.log====================
alter database open
Tue Jun 14 18:19:53 2011
Beginning crash recovery of 1 threads
 parallel recovery started with 2 processes
Tue Jun 14 18:19:53 2011
Started redo scan
Tue Jun 14 18:19:53 2011
Completed redo scan
 0 redo blocks read, 0 data blocks need recovery
Tue Jun 14 18:19:53 2011
Started redo application at
 Thread 1: logseq 1004, block 1124, scn 17136185
Tue Jun 14 18:19:53 2011
Recovery of Online Redo Log: Thread 1 Group 2 Seq 1004 Reading mem 0
  Mem# 0: /flashcard/oradata/G10R2/onlinelog/o1_mf_2_6v34jokt_.log
  Mem# 1: /s01/flash_recovery_area/G10R2/onlinelog/o1_mf_2_6v34jotq_.log
Tue Jun 14 18:19:53 2011
Completed redo application
Tue Jun 14 18:19:53 2011
Completed crash recovery at
 Thread 1: logseq 1004, block 1124, scn 17156186
 0 data blocks read, 0 data blocks written, 0 redo blocks read
Tue Jun 14 18:19:53 2011
LGWR: STARTING ARCH PROCESSES
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0 started with pid=16, OS id=7829
Tue Jun 14 18:19:53 2011
Thread 1 advanced to log sequence 1005 (thread open)
Thread 1 opened at log sequence 1005
  Current log# 3 seq# 1005 mem# 0: /flashcard/oradata/G10R2/onlinelog/o1_mf_3_6v34jpmp_.log
  Current log# 3 seq# 1005 mem# 1: /s01/flash_recovery_area/G10R2/onlinelog/o1_mf_3_6v34jpyn_.log
Successful open of redo thread 1
Tue Jun 14 18:19:53 2011
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
ARC0: Becoming the heartbeat ARCH
Tue Jun 14 18:19:53 2011
SMON: enabling cache recovery
Tue Jun 14 18:19:53 2011
db_recovery_file_dest_size of 204800 MB is 6.81% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Tue Jun 14 18:19:54 2011
Successfully onlined Undo Tablespace 1.
Tue Jun 14 18:19:54 2011
SMON: enabling tx recovery
Database Characterset is UTF8
Opening with internal Resource Manager plan
where NUMA PG = 1, CPUs = 2
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=17, OS id=7831
Tue Jun 14 18:19:55 2011
Completed: alter database open

注意上述单实例Crash Recovery到数据库打开的整个过程:

  • alter database open
  • Beginning crash recovery of 1 threads
  • Started redo scan
  • Completed redo scan
  • Started redo application at
  • Completed redo application
  • Completed crash recovery at
  • SMON: enabling cache recovery
  • Successfully onlined Undo Tablespace 1
  • SMON: enabling tx recovery
  • Completed: alter database open

从上述步骤中我们可以看到三种恢复名词,即:

  • crash recovery
  • cache recovery
  • tx recovery

这和官方文档所描述的Crash Recovery概念是不一致的,我们现在来理清这几种recovery。

crash recovery包含对redo的scan和application,显然其完成的是Rolling Forward前滚的工作,告警日志中出现的crash recovery等同于官方文档中介绍的”cache recovery”,我们可以将” Completed crash recovery”看做前滚完成的标志。而tx recovery从字面就可以看出实际上是Transaction Recovery,tx recovery发生在Undo Tablespace online之后(回滚事务的前提是Undo可用),数据完成打开操作之前(“Completed: alter database open”)。注意tx recovery并不要求数据库打开前完成,仅仅是在数据库打开之前由smon启动(“SMON: enabling tx recovery”)。

剩下的唯一的问题是,这里的cache recovery是什么?显然它不是官方文档中所描述的”cache recovery”,几乎没有任何文档介绍存在这样一个recovery操作,这也是本文重点要介绍的。

我们来看另一个演示,这个演示用以说明cache recovery还存在于最普通的不包含Crash Recovery的数据库打开过程中:

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL> startup mount;
ORACLE instance started.
Total System Global Area 1065353216 bytes
Fixed Size                  2089336 bytes
Variable Size             486542984 bytes
Database Buffers          570425344 bytes
Redo Buffers                6295552 bytes
Database mounted.

SQL> alter database open;
Database altered.

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn

==================alert.log====================
alter database open
Tue Jun 14 18:43:52 2011
LGWR: STARTING ARCH PROCESSES
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0 started with pid=14, OS id=8133
Tue Jun 14 18:43:52 2011
Thread 1 opened at log sequence 1005
Current log# 3 seq# 1005 mem# 0: /flashcard/oradata/G10R2/onlinelog/o1_mf_3_6v34jpmp_.log
Current log# 3 seq# 1005 mem# 1: /s01/flash_recovery_area/G10R2/onlinelog/o1_mf_3_6v34jpyn_.log
Successful open of redo thread 1
Tue Jun 14 18:43:52 2011
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
ARC0: Becoming the heartbeat ARCH
Tue Jun 14 18:43:52 2011
SMON: enabling cache recovery
Tue Jun 14 18:43:53 2011
Successfully onlined Undo Tablespace 1.
Tue Jun 14 18:43:53 2011
SMON: enabling tx recovery
Database Characterset is UTF8
Opening with internal Resource Manager plan
where NUMA PG = 1, CPUs = 2
replication_dependency_tracking turned off (no async multimaster replication found)
Tue Jun 14 18:43:53 2011
Incremental checkpoint up to RBA [0x3ed.624.0], current log tail at RBA [0x3ed.944.0]
Tue Jun 14 18:43:53 2011
Starting background process QMNC
QMNC started with pid=15, OS id=8135
Tue Jun 14 18:43:53 2011
Completed: alter database open

因为是clean shutdown,所以这里不存在crash recovery。但这里同样出现了”SMON: enabling cache recovery”,可见cache recovery是每次实例启动instance startup必要执行的一种恢复操作。但问题是,这个恢复操作到底针对何种对象?

实际上cache recovery所要恢复的是rowcache,也就是我们常说的字典缓存(dictionary cache)。关于这个结论,肯定有很多人要问我这样说的依据是什么,对应于这个”cache recovery”的问题,我们很难从google中得到一些启示,因为它和官方文档所描述的”cache recovery-rolling forward”存在重名的关系。

为了证明cache recovery所恢复的是rowcache,我们需要一个实证,从正式的系统中得到验证。要做到这一点是比较困难的,我们需要Oracle愿意把整个database open的过程变成慢动作来供我们参考,验证要用到一些调试工具,例如gdb或者dbx。

我们首先将实例启动到mount状态,并对执行startup的LOCAL进程做gdb的breakpoint断点调试:

SQL> shutdown abort;
ORACLE instance shut down.

SQL> startup mount;
ORACLE instance started.
Total System Global Area 1065353216 bytes
Fixed Size                  2089336 bytes
Variable Size             486542984 bytes
Database Buffers          570425344 bytes
Redo Buffers                6295552 bytes
Database mounted.

找出LOCAL进程的系统进程号SPID

SQL> select spid from v$process
2  where addr in (
3  select paddr from v$session
4  where sid=(select distinct sid from v$mystat))
5  /

SPID
------------
8326

在实例startup nomount/mount后共享池的library cache就是可用的

SQL> select namespace from v$librarycache where gets!=0;

NAMESPACE
---------------
SQL AREA
TABLE/PROCEDURE

而rowcache则尚未被填充,因为字典缓存来源于自举对象(bootstrap$)和字典基表

SQL> select parameter,count,gets from v$rowcache where count!=0;
no rows selected

另开一个terminal窗口,并执行对LOCAL进程8326的gdb breakpoint调试

[oracle@rh2 ~]$ gdb $ORACLE_HOME/bin/oracle 8326
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-23.el5)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /s01/db_1/bin/oracle...(no debugging symbols found)...done.
Attaching to program: /s01/db_1/bin/oracle, process 8326
Reading symbols from /s01/db_1/lib/libskgxp10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libskgxp10.so
Reading symbols from /s01/db_1/lib/libhasgen10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libhasgen10.so
Reading symbols from /s01/db_1/lib/libskgxn2.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libskgxn2.so
Reading symbols from /s01/db_1/lib/libocr10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libocr10.so
Reading symbols from /s01/db_1/lib/libocrb10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libocrb10.so
Reading symbols from /s01/db_1/lib/libocrutl10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libocrutl10.so
Reading symbols from /s01/db_1/lib/libjox10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libjox10.so
Reading symbols from /s01/db_1/lib/libclsra10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libclsra10.so
Reading symbols from /s01/db_1/lib/libdbcfg10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libdbcfg10.so
Reading symbols from /s01/db_1/lib/libnnz10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libnnz10.so
Reading symbols from /usr/lib64/libaio.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libaio.so.1
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libnsl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnsl.so.1
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_files.so.2
0x0000003181a0d8e0 in __read_nocancel () from /lib64/libpthread.so.0

输入断点kcrf_commit_force和kqlobjlod
(gdb) break kcrf_commit_force
Breakpoint 1 at 0x2724b6c

(gdb) break kqlobjlod
Breakpoint 2 at 0x1ac5e8c

在之前的terminal中执行数据库打开操作,因为breakpoint的关系这个open操作会hang住,
这时我们通过观察告警日志来了解恢复进度

SQL> alter database open;                              --这里会hang住

在gdb下输入continue,

(gdb) c
Continuing.

Breakpoint 1, 0x0000000002724b6c in kcrf_commit_force ()

观察告警日志可以发现redo application已经完成,但还未进入cache recovery阶段

alter database open
Tue Jun 14 19:14:33 2011
Beginning crash recovery of 1 threads
parallel recovery started with 2 processes
Tue Jun 14 19:14:33 2011
Started redo scan
Tue Jun 14 19:14:33 2011
Completed redo scan
39 redo blocks read, 74 data blocks need recovery
Tue Jun 14 19:14:33 2011
Started redo application at
Thread 1: logseq 1006, block 1155
Tue Jun 14 19:14:33 2011
Recovery of Online Redo Log: Thread 1 Group 1 Seq 1006 Reading mem 0
Mem# 0: /flashcard/oradata/G10R2/onlinelog/o1_mf_1_6v34jnkn_.log
Mem# 1: /s01/flash_recovery_area/G10R2/onlinelog/o1_mf_1_6v34jnst_.log
Tue Jun 14 19:14:33 2011
Completed redo application
Tue Jun 14 19:14:33 2011
Completed crash recovery at
Thread 1: logseq 1006, block 1194, scn 17200193
74 data blocks read, 74 data blocks written, 39 redo blocks read
Tue Jun 14 19:14:33 2011
LGWR: STARTING ARCH PROCESSES
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0 started with pid=17, OS id=8656
Tue Jun 14 19:14:33 2011
Thread 1 advanced to log sequence 1007 (thread open)
Thread 1 opened at log sequence 1007
Current log# 2 seq# 1007 mem# 0: /flashcard/oradata/G10R2/onlinelog/o1_mf_2_6v34jokt_.log
Current log# 2 seq# 1007 mem# 1: /s01/flash_recovery_area/G10R2/onlinelog/o1_mf_2_6v34jotq_.log
Successful open of redo thread 1
Tue Jun 14 19:14:33 2011
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
ARC0: Becoming the heartbeat ARCH
db_recovery_file_dest_size of 204800 MB is 6.81% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Tue Jun 14 19:14:37 2011
Incremental checkpoint up to RBA [0x3ef.3.0], current log tail at RBA [0x3ef.3.0]

且此时rowcache仍未被填充

SQL> select parameter,count,gets from v$rowcache where count!=0;
no rows selected

在gdb界面下再次执行continue 2次

(gdb) c
Continuing.

Breakpoint 1, 0x0000000002724b6c in kcrf_commit_force ()
(gdb) c
Continuing.

Breakpoint 2, 0x0000000001ac5e8c in kqlobjlod ()

观察告警日志可以发现已开始cache recovery,但也卡陷在cache recovery上,这保证我们的演示不受骚扰

Tue Jun 14 19:16:44 2011
SMON: enabling cache recovery

此时rowcache中出现唯一的一个dc_objects对象

select parameter,count,gets from v$rowcache where count!=0;

PARAMETER                             COUNT       GETS
-------------------------------- ---------- ----------
dc_objects                                1          1

这个对象是什么呢?也许你已经猜到了,我们做一个rowcache dump来看一下:

SQL> ALTER SESSION SET EVENTS 'immediate trace name row_cache level 10';

================row_cache trace===================

BUCKET 43170:
row cache parent object: address=0x92326060 cid=8(dc_objects)
hash=f3d1a8a1 typ=11 transaction=(nil) flags=00000001
own=0x92326130[0x9230f628,0x9230f628] wat=0x92326140[0x92326140,0x92326140] mode=S
status=EMPTY/-/-/-/-/-/-/-/-
set=0, complete=FALSE
data=
00000000 4f42000a 5453544f 24504152 00000000 00000000 00000000 00000000
00000000 00000001 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000
BUCKET 43170 total object count=1

可以看到上述dc_objects尚未完成加载(status=EMPTY & complete=FALSE ),那么这是一个什么object呢?

4f42000a 5453544f 24504152 => 转换为文本即:OB TSTO$PAR也就是BOOTSTRAP$

换而言之在cache recovery时第一个恢复的字典缓存对象是BOOTSTRAP$,这并不出乎我们的意料。

启动实例的LOCAL进程的等待事件为instance state change,这是常规情况下我们观察不到得

SQL> select event,p1text,p1 from v$session where wait_class!='Idle';

EVENT                                    P1TEXT                                           P1
---------------------------------------- ---------------------------------------- ----------
instance state change                    layer                                             2

在gdb界面下再次continue,将载入更多的rowcache对象

(gdb) c
Continuing.

Breakpoint 2, 0x0000000001ac5e8c in kqlobjlod ()

BUCKET 37:
row cache parent object: address=0x916cd980 cid=3(dc_rollback_segments)
hash=5fed2a24 typ=9 transaction=(nil) flags=000000a6
own=0x916cda50[0x916cda50,0x916cda50] wat=0x916cda60[0x916cda60,0x916cda60] mode=N
status=VALID/INSERT/-/FIXED/-/-/-/-/-
data=
00000000 00000000 00000001 00000009 59530006 4d455453 00000000 00000000
00000000 00000000 00000000 00000000 00000003 00000000 00000000 00000000
00000000 00000000 00000000 00000000
BUCKET 37 total object count=1

595300064d455453 -> SYSTEM 属于dc_rollback_segments 也就是著名的system回滚段

BUCKET 55556:
row cache parent object: address=0x916d8cd0 cid=8(dc_objects)
hash=ce89d903 typ=11 transaction=(nil) flags=00000001
own=0x916d8da0[0x9230f628,0x9230f628] wat=0x916d8db0[0x916d8db0,0x916d8db0] mode=S
status=EMPTY/-/-/-/-/-/-/-/-
set=0, complete=FALSE
data=
00000000 5f430006 234a424f 00000000 00000000 00000000 00000000 00000000
00000000 00000005 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000
BUCKET 55556 total object count=1

5f430006 234a424f -> C_OBJ# 是著名的bootstrap对象之一,可以从$ORACLE_HOME/rdbms/admin/sql.bsq中找到它

create rollback segment SYSTEM tablespace SYSTEM
storage (initial 50K next 50K)
/
create cluster c_obj# (obj# number)
pctfree 5 size 800                           /* don't waste too much space */
/* A table of 32 cols, 2 index, 2 col per index requires about 2K.
* A table of 10 cols, 2 index, 2 col per index requires about 750.
*/
storage (initial 130K next 200k maxextents unlimited pctincrease 0)
/* avoid space management during IOR I */
/

我们还可以通过v$rowcache_parent视图来了解dictionary cache的情况

SQL> col cache_name for a20
SQL> col keystr for a31
SQL> set linesize 200
SQL> select address,cache_name,existent,lock_mode,saddr,substr(key,1,30) keystr from v$rowcache_parent;

ADDRESS          CACHE_NAME           E  LOCK_MODE SADDR            KEYSTR
---------------- -------------------- - ---------- ---------------- -------------------------------
00000000916CCE20 dc_tablespaces       N          0 00               000000000000000000000000000000
00000000916CD980 dc_rollback_segments Y          0 00               000000000000000000000000000000
0000000092326060 dc_objects           Y          0 00               000000000A00424F4F545354524150
00000000916D8CD0 dc_objects           N          3 000000009BD91328 000000000600435F4F424A23000000
00000000916DA830 dc_object_ids        Y          0 00               380000000000000000000000000000

可以看到持有row cache lock的会话是'000000009BD91328',
且该dc_objects对象还处于non-existent状态,
换而言之真正装载rowcache的是启动实例的LOCAL服务进程

SQL> select sid,program,event,p1,p2,p3 from v$session where saddr='000000009BD91328';

SID PROGRAM                                          EVENT                                    P1   P2 P3
----- ------------------------------------------------ ---------------------------------------- -- ---- --
3294 sqlplus@rh2.oracle.com (TNS V1-V3)               db file scattered read                    1  378  3

该进程正在等待db file scattered read,fileid->1,block-378,这些块属于BOOTSTRAP$表

BOOTSTRAP$对象已从rowcache被载入到library cache中

SQL> select kglhdadr,kglnaobj from x$kglob where kglobtyp=2 and kglnaobj not like 'X$%';

KGLHDADR             KGLNAOBJ
-------------------- --------------------
0000000092326990     BOOTSTRAP$

SQL> select owner||'.'||Name from v$db_object_cache where type='TABLE' and name not like 'X$%';

OWNER||'.'||NAME
--------------------------------------------------------------------------------
SYS.BOOTSTRAP$

初步总结:

  1. 在数据库正式open前需要恢复字典缓存,这个步骤被称为cache recovery,其实是row cache recovery。与官方文档中描述的”cache recovery”不同,row cache recovery应当是Oracle Internal的叫法。
  2. 实际执行row cache recovery的不是SMON进程,而是启动实例的服务进程

Comments

  1. 整个过程思路很清晰

  2. 收获很大,

    有个问题想请教下,
    断点kcrf_commit_force和kqlobjlod
    怎么想到设置的啊,有相关文档参考吗?

    • admin says

      这2个internal function的出现规律是在以前探测startup启动内部机制的时候发现的,

      kcrf_commit_force — Kernel Cache Redo File Commit Force ,ksupop,kcvghbredo,krvgmrm_GenMiscRedoMarker-> kcb_sync_redo ->kcrf_commit_force

      kqlobjlod — Kernel SQL Library manager: database object load

      Possible Stack call
      (gdb) bt
      #0 0x00000000007a0624 in kslgetl ()
      #1 0x00000000022b7558 in kqreqd ()
      #2 0x00000000022b7402 in kqrprl ()
      #3 0x0000000001ac43b7 in kqllod ()
      #4 0x0000000003e88a32 in kglobld ()
      #5 0x0000000003e87b51 in kglobpn ()
      #6 0x0000000003e70105 in kglpim ()
      #7 0x0000000003e6e89d in kglpin ()
      #8 0x0000000003e6aa0c in kglgob ()
      #9 0x0000000001aeab96 in kqlbebs ()
      #10 0x0000000001aea5b8 in kqlblfc ()
      #11 0x000000000178f3d2 in adbdrv ()
      #12 0x0000000002f7ff19 in opiexe ()
      #13 0x0000000003518374 in opiosq0 ()
      #14 0x00000000030c4bcf in kpooprx ()
      #15 0x00000000030c2def in kpoal8 ()
      #16 0x000000000133a434 in opiodr ()
      #17 0x0000000003a9e6a0 in ttcpip ()
      #18 0x0000000001335b04 in opitsk ()
      #19 0x0000000001338a24 in opiino ()
      #20 0x000000000133a434 in opiodr ()
      #21 0x000000000132be53 in opidrv ()
      #22 0x0000000001eb4052 in sou2o ()
      #23 0x000000000072318b in opimai_real ()
      #24 0x00000000007230dc in main ()

  3. kevin.zhang says

    Nice Job!

  4. yyp2009 says

    测试过程很清爽,学习了!
    不过结论我还是有些疑虑的,固然gdb和类似sola下的trace等工具是能探察oracle内核每个模块中的极个别c函数的部分功能,可是往往是内核实现一个调度和去支应相应功能是,其中调动的函数过多,函数到函数跳跃很大,尽此的探察似乎很精确很多时候却是很粗旷的,往往就是一些较大功能体现的个别函数会浮出水面,而其实过程仍然可能是很复杂的!罢了不经意间说废话了。
    所以过程很好,结论不能下的太急,这类的测试不过是学习积累的巩固和拓展,至于结论还真不敢说—-当然鄙人愚见!

    比如这段文字描述的崩溃恢复和实力恢复描述:
    Two potential problems can result if an instance failure occurs:

    Data blocks modified by a transaction might not be written to the datafiles at commit time and might only appear in the redo log. Therefore, the redo log contains changes that must be reapplied to the database during recovery.
    After the roll forward phase, the datafiles may contain changes that had not been committed at the time of the failure. These uncommitted changes must be rolled back to ensure transactional consistency. These changes were either saved to the datafiles before the failure, or introduced during the roll forward phase.

    To solve this dilemma, two separate steps are generally used by Oracle for a successful recovery of a system failure: rolling forward with the redo log (cache recovery) and rolling back with the rollback segments (transaction recovery).

    The Redo Log and Rolling Forward

    The redo log is a set of operating system files that record all changes made to any database buffer, including data, index, and rollback segments, whether the changes are committed or uncommitted. Each redo entry is a group of change vectors describing a single atomic change to the database. The redo log protects changes made to database buffers in memory that have not been written to the datafiles.

    The first step of recovery from an instance or disk failure is to roll forward, or reapply all of the changes recorded in the redo log to the datafiles. Because rollback data is also recorded in the redo log, rolling forward also regenerates the corresponding rollback segments. This is called cache recovery.

    Rolling forward proceeds through as many redo log files as necessary to bring the database forward in time. Rolling forward usually includes online redo log files and may include archived redo log files.

    After roll forward, the data blocks contain all committed changes. They may also contain uncommitted changes that were either saved to the datafiles before the failure, or were recorded in the redo log and introduced during roll forward.

    Rollback Segments and Rolling Back

    Rollback segments record database actions that should be undone during certain database operations. In database recovery, rollback segments undo the effects of uncommitted transactions previously applied by the rolling forward phase.

    After the roll forward, any changes that were not committed must be undone. After redo log files have reapplied all changes made to the database, then the corresponding rollback segments are used. Rollback segments are used to identify and undo transactions that were never committed, yet were either saved to the datafiles before the failure, or were applied to the database during the roll forward. This process is called rolling back or transaction recovery.


    Crash and instance recovery involves the following two-step procedure:
    1. Roll-forward step : During this step, formally called cache recovery, the database applies the
    committed and uncommitted data in the current online redo log files to the current online
    data files.
    2. Rollback step : During this step, formally called transaction recovery , the database removes
    the uncommitted transactions applied in the previous step, using the undo data in the undo
    segments.
    As you know, when the database suddenly crashes, not all the committed transactions will have
    been written to disk. If your database is large, and the redo log files are also large, it can take a long
    time for the roll-forward and rollback to complete. By using Oracle’s Fast-Start Fault Recovery func-tionality, you can substantially reduce the downtime resulting from system-related outages.


    cache recovery

    The part of instance recovery where Oracle applies all committed and
    uncommitted changes in the redo log files to the affected data blocks.
    Also known as the rolling forward phase of instance recovery.

    The first step of recovery from an instance or disk failure is called
    cache recovery or rolling forward, and involves reapplying all of the
    changes recorded in the redo log to the datafiles. Because rollback data
    is also recorded in the redo log, rolling forward also regenerates the
    corresponding rollback segments

    Rolling forward proceeds through as many redo log files as necessary to
    bring the database forward in time. Rolling forward usually includes
    online redo log files (instance recovery or media recovery) and could
    include archived redo log files (media recovery only).

    After rolling forward, the data blocks contain all committed changes.
    They could also contain uncommitted changes that were either saved to
    the datafiles before the failure, or were recorded in the redo log and
    introduced during cache recovery

    你的测试中abort时,如果没有block修改在online redo中或者没有修改的block的活动时abort
    那么你在mount后gdb去continue时
    跟不到rolling forward involves reapplying all of the
    changes recorded in the redo log
    而跟到row cache中数据字典的load过程时,此时结论不敢说了

    我心血来潮时的个人愚见
    是一个学习过程,个人毫无其他意思,请万无误解!

  5. yyp2009 says

    个人觉得 abort 后的 由mount到open过程中row cache中数据字典object的load过程不能完全看错Crash and Instance Recovery过程。
    个人愚见,不过我还目前没有水平去更精细探察这个过程。

  6. yyp2009 says

    row cache中数据字典object的load过程不能完全看做Crash and Instance Recovery过程—刚才敲错字了

Comment

*

沪ICP备14014813号-2

沪公网安备 31010802001379号