undo

了解你所不知道的SMON功能(十二):Shrink UNDO(rollback) SEGMENT

2013/02/05 by admin Leave a Comment

SMON对于Undo(Rollback)segment的日常管理还不止于OFFLINE UNDO SEGMENT ，在AUM(automatic undo management或称SMU)模式下SMON还定期地收缩Shrink Rollback/undo segment。

触发场景

这种AUM下rollback/undo segment的undo extents被shrink的现象可能被多种条件触发：

当另一个回滚段的transaction table急需undo空间时
当SMON定期执行undo/rollback管理时(每12个小时一次):
- SMON会从空闲的undo segment中回收undo space，以便保证其他tranaction table需要空间时可用。另一个好处是undo datafile的身材不会急速膨胀导致用户要去resize
- 当处于undo space空间压力时，特别是在发生UNDO STEAL的条件下; SGA中会记录前台进程因为undo space压力而做的undo steal的次数(v$undostat UNXPSTEALCNT EXPSTEALCNT)；若这种UNDO STEAL的次数超过特定的阀值，则SMON会尝试shrink transaction table

若smon shrink rollback/undo真的发生时，会这样处理：

计算平均的undo retention大小，按照下列公式：

retention size=(undo_retention * undo_rate)/(#online_transaction_table_segment 在线回滚段的个数)

对于每一个undo segment

若是offline的undo segment，则回收其所有的已过期expired undo extents，保持最小2个extents的空间
若是online的undo segment，则回收其所有的已过期expired undo extents，但是保持其segment所占空间不小于平均retention对应的大小。

注意SMON的定期Shrink，每12个小时才发生一次，具体发生时可以参考SMON进程的TRACE。

若系统中存在大事务，则rollback/undo segment可能扩展到很大的尺寸；视乎事务的大小，则undo tablespace上的undo/rollback segment会呈现出不规则的空间占用分布。

SMON的定期清理undo/rollback segment就是要像一个大锤敲击钢铁那样，把这些大小不规则的online segment清理成大小统一的回滚段，以便今后使用。

当然这种定期的shrink也可能造成一些阻碍，毕竟在shrink过程中会将undo segment header锁住，则事务极低概率可能遇到ORA-1551错误:

[oracle@vmac1 ~]$ oerr ora 1551
01551, 00000, "extended rollback segment, pinned blocks released"
// *Cause: Doing recursive extent of rollback segment, trapped internally
//        by the system
// *Action: None

如何禁止SMON SHRINK UNDO SEGMENT?

可以通过设置诊断事件event=’10512 trace name context forever, level 1’来禁用SMON OFFLINE UNDO SEGS;

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn

SQL> alter system set events '10512 trace name context forever,level 1';

System altered.

相关BUG

这些BUG主要集中在9.2.0.8之前，10.2.0.3以后几乎绝迹了：

Bug 1955307 – SMON may self-deadlock (ORA-60) shrinking a rollback segment in SMU mode [ID 1955307.8]
Bug 3476871 : SMON ORA-60 ORA-474 ORA-601 AND DATABASE CRASHED
Bug 5902053 : SMON WAITING ON ‘UNDO SEGMENT TX SLOT’ HANGS DATABASE
Bug 6084112 : INSTANCE SLOW SHOW SEVERAL LONGTIME RUNNING WAIT EVENTS

Filed Under: Oracle, Oracle Internal Research内部原理研究 Tagged With: rollback, smon, space management, undo

DML UPDATE/DELETE与CR一致性读的秘密

2012/10/30 by admin Leave a Comment

这个问题源于OTN中文论坛的一个帖子<大事务中的更新丢失问题>：

环境为Oracle 10.2.0.4 on Linux x64
有一个大表，百万级，col1字段全为0
t1 事务A启动，把所有记录col2全更新为1
t2 事务B启动，根据主键，把一条记录更新为2，然后commit
t3 事务A执行完成，并COMMIT
t4 查询此表，发现col1全部为1，事务B的更新丢失了。
这是为什么呢，其中逻辑是怎样的
谢谢！

对于这个问题我想说明的是对于事务transaction 而言Oracle同样提供读一致性，称为transaction-level read consistency：

The database can also provide read consistency to all queries in a transaction, known as transaction-level read consistency. In this case, each statement in a transaction sees data from the same point in time, which is the time at which the transaction began.

Oracle Database Concepts
11g Release 2 (11.2)
Part Number E10713-04

为了证明和演示该事务级别的读一致性，我设计了以下演示：

有一张2个列的大表(T1,T2)，其中分别有 T1=600000，T2=’U2′ 和 T1=900000和 T2=’U1’的 2行数据，T1为主键。

在A)时刻，Session A使用SELECT .. FOR UPDATE锁住T1=600000的这一行

在之后的B)时刻，Session B尝试update TAB set t2=’U3′ where t2=’U2′ 即更新所有T2=’U2’的数据行，但是因为 T1=600000，T2=’U2’这一行正好被Session A锁住了，所以Session B会一直等待’enq: TX – row lock contention’；T1=900000和 T2=’U1’的数据行位于Session B处理 T1=600000，T2=’U2’行等待之后才能处理到的数据块中。

在之后的C)时刻，Session C更新update TAB set t2=’U2′ where t1=900000;并commit，即将T1=900000和 T2=’U1’更新为 T1=900000和 T2=’U2’，这样就符合Session B 更新Update的条件t2=’U2’了。

在D)时刻， Session A执行commit释放锁，Session B得以继续工作，当他处理到T1=900000的记录时存在以下分歧：

1)若update DML满足transaction-level read consistency，则它应当看到的是session B事务开始环境SCN(env SCN)时的块的前镜像，即虽然session C更新了t2=’U2’满足其条件，但是为了一致性，session B仍需要对该行所在数据块做APPLY UNDO，直到满足该session B事务开始时间点的Best CR块，而CR一致镜像中t2=’U1’，不满足Session B的更新条件，那么session B在整个事务中仅更新一行数据 T1=600000，T2=’U2’，session B only Update One Rows。

2) 若update DML不满足transaction-level read consistency，则session B看到的是当前read commited的镜像,即是Session C已更新并提交的块镜像，此时的记录为T1=900000和 T2=’U2’符合session B的更新条件，则session B要更新2行数据。

我们来看一下实际实验的结果：

构建实验环境：

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

create table update_cr tablespace users as select rownum t1, 'MACLEAN     ' T2 
from dual connect by level  update update_cr set t2='U2' where t1=600000;

1 row updated.

SQL> update update_cr set t2='U1' where t1=900000;

1 row updated.

SQL> commit;

Commit complete.

SQL> select * from update_cr where t1=600000 or t1=900000;

        T1 T2
---------- ------------
    600000 U2
    900000 U1

在A)时刻，Session A使用SELECT .. FOR UPDATE锁住T1=600000的这一行:

session A:

SQL> select * from update_cr where t1=600000 for update;

        T1 T2
---------- ------------
    600000 U2

session B:

SQL> alter system flush buffer_cache;

System altered.

SQL> /

System altered.

SQL> alter session set events '10046 trace name context forever,level 8 : 10201 trace name context forever,level 10';

Session altered.

SQL> update update_cr set t2='U3' where t2='U2';

session C:

SQL> select * from update_cr where t1=900000;

        T1 T2
---------- ------------
    900000 U1

SQL> update update_cr set t2='U2' where t1=900000;

1 row updated.

SQL> commit;

Commit complete.

SQL> alter system flush buffer_cache;

System altered.

之后session A 执行 commit;session B得以继续update，得到实验的结果：

session A 执行 commit;

session B;

SQL> update update_cr set t2='U3' where t2='U2';

1 row updated.

以看到以上实验的结果 update仅更新了一行数据，证明了观点1″若update DML满足transaction-level read consistency，则它应当看到的是session B事务开始环境SCN(env SCN)时的块的前镜像，即虽然session C更新了t2=’U2’满足其条件，但是为了一致性，session B仍需要对该行所在数据块做APPLY UNDO，直到满足该session B事务开始时间点的Best CR块，而CR一致镜像中t2=’U1’，不满足Session B的更新条件，那么session B在整个事务中仅更新一行数据 T1=600000，T2=’U2’，session B only Update One Rows。”的正确性。

即update/delete之类的DML在Oracle中满足transaction-level read consistency，保证其所”看到的”是满足事务开始时间点读一致性的Consistent Read，这也是为什么DML会产生Consistent Read的原因之一。

我们回过头来看一下上面Session B产生的10201 undo apply和10046 trace的内容，可以发现更多内容：

SQL> select dbms_rowid.rowid_block_number(rowid),dbms_rowid.rowid_relative_fno(rowid) from update_cr where t1=600000 or t1=900000;

DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID)
------------------------------------ ------------------------------------
                                2589                                    4
                                3558                                    4

WAIT #139924171154784: nam='db file sequential read' ela= 19504 file#=3 block#=128 blocks=1 obj#=0 tim=1258427129863987
Applying CR undo to block 4 : 1000a1d itl entry 03:
xid: 0x0001.010.00000277 uba: 0x00c0300e.007c.1e
flg: ---- lkc: 1 fsc: 0x0000.00000000
WAIT #139924171154784: nam='db file sequential read' ela= 12957 file#=3 block#=12302 blocks=1 obj#=0 tim=1258427129877291
CRS upd rd env [0x7f42a284a704]: (scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00223e78 flg: 0x00000060) undo env [0x7fffe3fe7f60]: (
scn: 0x0000.00223eb9 xid: 0x0001.010.00000277 uba: 0x00c0300e.007c.1e statement num=2 parent xid: 0x0000.000.00000000 st-scn: 0xe404.00000000 hi-scn: 0x2100.00000000 ma-scn: 0x0000.00000000 flg: 0x62515e33)
CRS upd (before): 0xa5fe9198 cr-scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 cl-scn: 0x0000.00223eba sfl: 0
CRS upd (after) : 0xa5fe9198 cr-scn: 0x0000.00223eb9 xid: 0x0001.010.00000277 uba: 0x00c0300e.007c.1e cl-scn: 0x0000.00223eba sfl: 0

*** 2009-11-16 22:06:01.253
WAIT #139924171154784: nam='enq: TX - row lock contention' ela= 31358812 name|mode=1415053318 usn<<16 | slot=65552 sequence=631 obj#=76969 tim=1258427161253791
WAIT #139924171154784: nam='db file sequential read' ela= 36051 file#=4 block#=2589 blocks=1 obj#=76969 tim=1258427161290180
WAIT #139924171154784: nam='db file scattered read' ela= 14718 file#=4 block#=2590 blocks=98 obj#=76969 tim=1258427161305684
WAIT #139924171154784: nam='db file scattered read' ela= 26432 file#=4 block#=2690 blocks=126 obj#=76969 tim=1258427161338364
WAIT #139924171154784: nam='db file scattered read' ela= 38289 file#=4 block#=2818 blocks=126 obj#=76969 tim=1258427161384445
WAIT #139924171154784: nam='db file scattered read' ela= 24265 file#=4 block#=2946 blocks=126 obj#=76969 tim=1258427161416302
WAIT #139924171154784: nam='db file scattered read' ela= 21288 file#=4 block#=3074 blocks=126 obj#=76969 tim=1258427161444684
WAIT #139924171154784: nam='db file scattered read' ela= 23840 file#=4 block#=3202 blocks=126 obj#=76969 tim=1258427161476063
WAIT #139924171154784: nam='db file scattered read' ela= 27439 file#=4 block#=3330 blocks=126 obj#=76969 tim=1258427161511429
WAIT #139924171154784: nam='db file scattered read' ela= 24424 file#=4 block#=3458 blocks=126 obj#=76969 tim=1258427161543429
Applying CR undo to block 4 : 1000de6 itl entry 03:
xid: 0x0003.01f.00000345 uba: 0x00c03092.008f.34
flg: --U- lkc: 1 fsc: 0x0000.00223ef9
WAIT #139924171154784: nam='db file sequential read' ela= 8033 file#=3 block#=12434 blocks=1 obj#=0 tim=1258427161557534
CRS upd rd env [0x7f42a284a704]: (scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00223e78 flg: 0x00000060) undo env [0x7fffe3fe7f60]: (
WAIT #139924171154784: nam='db file sequential read' ela= 8033 file#=3 block#=12434 blocks=1 obj#=0 tim=1258427161557534
CRS upd rd env [0x7f42a284a704]: (scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00223e78 flg: 0x00000060) undo env [0x7fffe3fe7f60]: (
scn: 0x0000.00223ef8 xid: 0x0003.01f.00000345 uba: 0x00c03092.008f.34 statement num=8192 parent xid: 0x0000.000.0baf3fa0 st-scn: 0x3fa0.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00000000 flg: 0x00000000)
CRS upd (before): 0xa4fdaa08 cr-scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 cl-scn: 0x0000.00223eff sfl: 0
CRS upd (after) : 0xa4fdaa08 cr-scn: 0x0000.00223eb9 xid: 0x0003.01f.00000345 uba: 0x00c03092.008f.34 cl-scn: 0x0000.00223eff sfl: 0
WAIT #139924171154784: nam='db file scattered read' ela= 13029 file#=4 block#=3586 blocks=126 obj#=76969 tim=1258427161572511
WAIT #139924171154784: nam='db file scattered read' ela= 22282 file#=4 block#=3714 blocks=126 obj#=76969 tim=1258427161602324
WAIT #139924171154784: nam='db file sequential read' ela= 6127 file#=4 block#=522 blocks=1 obj#=76969 tim=1258427161615461
WAIT #139924171154784: nam='db file scattered read' ela= 13503 file#=4 block#=3842 blocks=42 obj#=76969 tim=1258427161629323

以上可以看到对T1=600000所在的数据块1000a1d=》datafile 4 259 apply了UNDO，其环境SCN 为scn: 0x0000.00223eb9:

Applying CR undo to block 4 : 1000a1d itl entry 03:
xid: 0x0001.010.00000277 uba: 0x00c0300e.007c.1e
flg: —- lkc: 1 fsc: 0x0000.00000000
WAIT #139924171154784: nam=’db file sequential read’ ela= 12957 file#=3 block#=12302 blocks=1 obj#=0 tim=1258427129877291
CRS upd rd env [0x7f42a284a704]: (scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00223e78 flg: 0x00000060) undo env [0x7fffe3fe7f60]: (
scn: 0x0000.00223eb9 xid: 0x0001.010.00000277 uba: 0x00c0300e.007c.1e statement num=2 parent xid: 0x0000.000.00000000 st-scn: 0xe404.00000000 hi-scn: 0x2100.00000000 ma-scn: 0x0000.00000000 flg: 0x62515e33)
CRS upd (before): 0xa5fe9198 cr-scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 cl-scn: 0x0000.00223eba sfl: 0
CRS upd (after) : 0xa5fe9198 cr-scn: 0x0000.00223eb9 xid: 0x0001.010.00000277 uba: 0x00c0300e.007c.1e cl-scn: 0x0000.00223eba sfl: 0

对于T1=900000的数据行所在块1000de6=》 datafile 4 3558 block同样apply了UNDO，其环境SCN均为00223eb9

Applying CR undo to block 4 : 1000de6 itl entry 03:
xid: 0x0003.01f.00000345 uba: 0x00c03092.008f.34
flg: –U- lkc: 1 fsc: 0x0000.00223ef9
WAIT #139924171154784: nam=’db file sequential read’ ela= 8033 file#=3 block#=12434 blocks=1 obj#=0 tim=1258427161557534
CRS upd rd env [0x7f42a284a704]: (scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00223e78 flg: 0x00000060) undo env [0x7fffe3fe7f60]: (
scn: 0x0000.00223ef8 xid: 0x0003.01f.00000345 uba: 0x00c03092.008f.34 statement num=8192 parent xid: 0x0000.000.0baf3fa0 st-scn: 0x3fa0.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00000000 flg: 0x00000000)
CRS upd (before): 0xa4fdaa08 cr-scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 cl-scn: 0x0000.00223eff sfl: 0
CRS upd (after) : 0xa4fdaa08 cr-scn: 0x0000.00223eb9 xid: 0x0003.01f.00000345 uba: 0x00c03092.008f.34 cl-scn: 0x0000.00223eff sfl: 0

Filed Under: Oracle, Oracle Internal Research内部原理研究 Tagged With: CR, undo

在Oracle中如何让SELECT查询绕过UNDO

2012/08/06 by admin Leave a Comment

是否有想过如何在Oracle中实现脏读(dirty read)，在Oracle官方文档或者Asktom的时候显然会提到Oracle是不实现脏读的，总是有undo来提供数据块的前镜像(before image)以维护一致性Consistent，通过正常途径我们几乎不可能破坏Oracle中查询的一致性来实现脏读。

如果自己搞不定可以找诗檀软件专业ORACLE数据库修复团队成员帮您恢复!

诗檀软件专业数据库修复团队

服务热线： 13764045638 QQ号:47079569 邮箱：service@parnassusdata.com

是否真的无计可施？非也，非也，Oracle作为一个高度复杂而又可控的RDBMS，仍是有很多空子可以钻的。

我们先来介绍以下的2个隐藏参数：

_offline_rollback_segments or _corrupted_rollback_segments 这2个隐藏参数对于熟悉Oracle数据库异常恢复或者解决ORA-600[4XXX]问题有经验的同学来说肯定不陌生，因为这2个参数是针对Undo存在Corruption讹误时忽略问题的有力工具，而大家对这2个参数实际起到的作用有多少认识呢？

我们可能在以下几种场景中用到_offline_rollback_segments 和 _corrupted_rollback_segments 这2个隐藏参数：

强制打开数据库(FORCE OPEN DATABASE)
控制一致性读和延迟块清除(consistent read & delayed block cleanout)
强制删除某个rollback segment回滚段

但是请注意：千万不要在没有Oracle专家建议的情况下，在产品环境设置以上这2个隐藏参数，可能造成数据逻辑讹误等影响！！

_offline_rollback_segments 和 _corrupted_rollback_segments 均会造成的实例行为变化：

以上2个参数所列出的Undo Segments(撤销段/回滚段)将不会被在线使用online
在UNDO$数据字典基表中将体现为OFFLINE的记录
在实例instance的生命周期中将不会再给新的事务分配使用
参数所列出的Undo Segments列表上的活跃事务active transaction将即不被回滚亦不被标记为dead以便SMON去回滚(了解你所不知道的SMON功能(五):Recover Dead transaction)

_OFFLINE_ROLLBACK_SEGMENTS(offline undo segment list)隐藏参数(hidden parameter)的独有作用：

在实例startup启动并open database的阶段仍将读取_OFFLINE_ROLLBACK_SEGMENTS所列出的Undo segments(撤销段/回滚段)，若访问这些undo segments出现了问题则将在alert.log和其他TRACE中体现出来，但不影响实际的startup进程
- 若查询数据块发现活跃的事务，并ITL指向对应的undo segments则：
- 若读取undo segments的transaction table事务表发现事务已提交则做数据块的清除
- 若读取发现事务仍活动未commit，则生成一个CR块拷贝
- 若读取该undo segments存在问题(可能是corrupted讹误，可能是missed丢失)则产生一个错误并写出到alert.log，查询将异常终止
若DML更新相关的数据块会导致服务进程为了恢复活跃事务而进入死循环消耗大量CPU，解决方法是通过可以进行的查询工作重建相关表

_CORRUPTED_ROLLBACK_SEGMENTS(corrupted undo segment list)隐藏参数所独有的功能:

在实例启动startup并open database的阶段_CORRUPTED_ROLLBACK_SEGMENTS所列出的undo segments(撤销段/回滚段)将不会被访问读取
所有指向这些被_CORRUPTED_ROLLBACK_SEGMENTS列出的undo segments的事务都被认为已经提交了commit，和这个undo segments已经被drop时类似
- 这将导致严重的逻辑讹误
- 如果数据字典上有活跃事务那么将更糟糕，数据字典逻辑讹误会造成数据库管理问题
- 如果bootstrap自举核心对象有活跃事务，那么将无法忽略错误ORA-00704: bootstrap process failure错误，导致无法强制打开数据库(见拙作Oracle数据恢复:解决ORA-00600:[4000] ORA-00704: bootstrap process failure错误一例)
衷心地建议用_CORRUPTED_ROLLBACK_SEGMENTS这个参数打开数据库后导出数据并重建数据库，这个参数使用的后遗症可能很顽固
Oracle公司内部有叫做TXChecker的工具可以检查问题事务

好了了解了以上2个参数之后，你将不难明白下面演示如何通过_CORRUPTED_ROLLBACK_SEGMENTS隐藏参数让SELECT查询绕过UNDO实现脏读的过程：

SQL> alter system set event= '10513 trace name context forever, level 2' scope=spfile;

System altered.

SQL> alter system set "_in_memory_undo"=false scope=spfile;

System altered.

10513 level 2 event可以禁止SMON 回滚rollback 死事务 dead transaction
_in_memory_undo 禁用 in memory undo 特性

SQL> startup force;
ORACLE instance started.

Total System Global Area 3140026368 bytes
Fixed Size                  2232472 bytes
Variable Size            1795166056 bytes
Database Buffers         1325400064 bytes
Redo Buffers               17227776 bytes
Database mounted.
Database opened.

session A:

SQL> conn maclean/maclean
Connected.

SQL> create table maclean tablespace users as select 1 t1 from dual connect by level exec dbms_stats.gather_table_stats('','MACLEAN');

PL/SQL procedure successfully completed.

SQL> set autotrace on;   

SQL> select sum(t1) from maclean;

   SUM(T1)
----------
       501

Execution Plan
----------------------------------------------------------
Plan hash value: 1679547536

------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |     1 |     3 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |         |     1 |     3 |            |          |
|   2 |   TABLE ACCESS FULL| MACLEAN |   501 |  1503 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        515  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processe

在没有活跃事务的情况下，直接读取current block，
全表扫描一致性读，consistent gets只要3次                  

SQL> update maclean set t1=0;

501 rows updated.

SQL> alter system checkpoint;

System altered.

这里session A不commit;

另开一个 session:

SQL> conn maclean/maclean
Connected.
SQL> 
SQL> set autotrace on;
SQL>  select sum(t1) from maclean;

   SUM(T1)
----------
       501

Execution Plan
----------------------------------------------------------
Plan hash value: 1679547536

------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |     1 |     3 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |         |     1 |     3 |            |          |
|   2 |   TABLE ACCESS FULL| MACLEAN |   501 |  1503 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        505  consistent gets
          0  physical reads
        108  redo size
        515  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

为了一致性读 上面的查询需要通过undo构造CR块，这导致consistent gets上升到 505

[oracle@vrh8 ~]$ ps -ef|grep LOCAL=YES |grep -v grep
oracle    5841  5839  0 09:17 ?        00:00:00 oracleG10R25 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

[oracle@vrh8 ~]$ kill -9 5841

杀掉session A对应的Server Process服务进程，这导致dead transaction 但是不被smon回滚

     select ktuxeusn,
               to_char(sysdate, 'DD-MON-YYYY HH24:MI:SS') "Time",
               ktuxesiz,
               ktuxesta
          from x$ktuxe
         where ktuxecfl = 'DEAD';

  KTUXEUSN Time                   KTUXESIZ KTUXESTA
---------- -------------------- ---------- ----------------
         2 06-AUG-2012 09:20:45          7 ACTIVE

此时有1个active rollback segment 

SQL> conn maclean/maclean
Connected.

SQL> set autotrace on;

SQL> select sum(t1) from maclean;

   SUM(T1)
----------
       501

Execution Plan
----------------------------------------------------------
Plan hash value: 1679547536

------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |     1 |     3 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |         |     1 |     3 |            |          |
|   2 |   TABLE ACCESS FULL| MACLEAN |   501 |  1503 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        411  consistent gets
          0  physical reads
        108  redo size
        515  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

到上面为止 虽然通过kill进程 和禁止smon 回滚dead transaction ，
形成了一个不回滚的死事务 但是仍通过undo实现了一致性读

找出当前active的rollback segment的名字

SQL> select segment_name from dba_rollback_segs where segment_id=2;

SEGMENT_NAME
------------------------------
_SYSSMU2$

SQL> alter system set "_corrupted_rollback_segments"='_SYSSMU2$'  scope=spfile;

System altered.

用 _corrupted_rollback_segments 废掉 上面的2个rollback segment， 这将导致无法提供undo 


SQL> startup force;
ORACLE instance started.

Total System Global Area 3140026368 bytes
Fixed Size                  2232472 bytes
Variable Size            1795166056 bytes
Database Buffers         1325400064 bytes
Redo Buffers               17227776 bytes
Database mounted.
Database opened.

SQL> conn maclean/maclean
Connected.

SQL> set autotrace on;

SQL> select sum(t1) from maclean;

   SUM(T1)
----------
        94

Execution Plan
----------------------------------------------------------
Plan hash value: 1679547536

------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |     1 |     3 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |         |     1 |     3 |            |          |
|   2 |   TABLE ACCESS FULL| MACLEAN |   501 |  1503 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
        228  recursive calls
          0  db block gets
         29  consistent gets
          5  physical reads
        116  redo size
        514  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          4  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> /

   SUM(T1)
----------
        94

Execution Plan
----------------------------------------------------------
Plan hash value: 1679547536

------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |     1 |     3 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |         |     1 |     3 |            |          |
|   2 |   TABLE ACCESS FULL| MACLEAN |   501 |  1503 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        514  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

以上可以看到 consistent gets下降到3，服务进程读取数据块发现存在活跃事务，但是ITL指向的UNDO SEGMENTS在_corrupted_rollback_segments的列表中，所以直接认为该事务已经COMMIT提交，以便绕过UNDO。

这里实现了脏读，虽然通过上述方法去实现脏读在产品环境中没有实际收益(有部分数据库软件允许脏读来做统计信息收集)，破坏了一致性且需要设置需重启实例的隐藏参数，仅供参考。

Filed Under: Oracle, Oracle Internal Research内部原理研究 Tagged With: _CORRUPTED_ROLLBACK_SEGMENTS, _OFFLINE_ROLLBACK_SEGMENTS, dirty read, undo

【数据恢复】解决ORA-600[4xxx]错误并打开数据库

2012/02/21 by admin 1 Comment

如果自己搞不定可以找诗檀软件专业ORACLE数据库修复团队成员帮您恢复!

诗檀软件专业数据库修复团队

服务热线： 13764045638 QQ号:47079569 邮箱：service@parnassusdata.com

Recovering a Dropped Table from a Full Database Backup using manual backup and recovery procedures (Doc ID 96197.1)

You cannot follow the note exactly as this note is for a valid full backup where you have all logs needed for point in time recovery – the changes you need to make are in the summary steps above.
Once you have done the recovery we can attempt a force open (you will have to do the restore/recovery again for sys, sysaux and undo because you have done an open resetlogs on the restored database):

YOU ARE ADVISED TO TAKE A FULL COLD OS BACKUP OF THE DATABASE BEFORE PROCEEDING WITH THE FORCE OPEN.

To force open a database:

1. if you are using AUM you need to identify ALL rollback segments listed in your dictionary (system dbfs):

On unix, you can use the unix strings command to extract the rollback segments:

$ strings system01.dbf | grep _SYSSMU | cut -d $ -f 1 | sort -u > listSMU

where system01.dbf is the name of the datafile for the SYSTEM tablespace – if you have > 1 system dbf then you need to do this for each file, writing to a different listSMUn file each time.

2. Mount the database and create a pfile (if you dont already have one)

SQL>create pfile=’init<sid>.ora’ from spfile;

Edit the pfile – add :

_allow_resetlogs_corruption=true
undo_management=MANUAL
_corrupted_rollback_segments=<list>

Where <list> is the list of rollback segment names extracted as above with ‘$’ appended to each name eg

_corrupted_rollback_segments=(_SYSSMU1$, _SYSSMU2$………_SYSSMU10$)

3. Remount the instance using the amended pfile and then do the following:

SQL>recover database using backup controlfile until cancel;
SQL>cancel
SQL>alter database open resetlogs;

If this works then you can EXPORT the database contents – the database will have to be completely rebuilt and the data re-imported.
IF this fails then I need to see the failure (ORA-1555, ORA-600[2662], ORA-600[4194], Ora-600[4000] are common errors raised during force open), along with the alert log extract showing the start up after setting these hidden parameters. Please remember to REMOVE the hidden parameters after you have exported – the should never be used without advice from Oracle Support.

Filed Under: Oracle, Oracle数据恢复、数据库恢复、灾难恢复 Tagged With: _CORRUPTED_ROLLBACK_SEGMENTS, AUM, undo

Oracle Internal Event:10201 consistent read undo application诊断事件

2011/10/11 by admin Leave a Comment

之前我介绍了<Oracle Internal Event:10200 Consistent Read诊断事件>一致性逻辑读诊断事件的用法和trace含义，10201 “consistent read undo application”是另一个十分有用的内部诊断事件，该事件可以用于诊断一致性读取时的UNDO应用情况。

10201 event可以用于探测为了创建CR(consistent read) block块以满足要求的SCN需要应用多少undo，该10201 event还可以配和10200 event使用。利用该10201 event，我们可以验证一些内部问题，例如何时会发生块上的cleanup。

注意启用10201 event可能导致在短期内产生大量的trace文件，所以不要随意在生产系统中使用。
10201 Internal Event主要会被ktrgcm( CR-rollback (ktrgcm() ) 、 ktrrbkblk 、 ktrcrf 这三个Oracle内核函数触发，这三个Internal Function的主要作用：

ktrgcm – common CR read code
CR Requestor-Side Algorithm
The following statistics are incremented by ktrgcm:
“cleanouts and rollbacks – consistent read” is incremented if UNDO is applied to BUFFER and CLEANOUT is performed.
“rollbacks only – consistent read gets” is incremented if UNDO is applied to BUFFER and no CLEANOUT is performed.
“cleanouts only – consistent read gets” is incremented if no UNDO is applied and CLEANOUT is performed.
“no work – consistent read gets” is incremented if no UNDO is applied and no CLEANOUT is performed.
When UNDO is applied to produce a CR BUFFER, other UNDO blocks should be read.
When CLEANOUT is performed, the TX transaction table must be read.

ktrrbkblk retrieves previous row version with ktundo,When all rows checked, calls ktrrbkblk to rollback block (calls ktundo) 常见的stack call : ktrviupk kdiulk kcoubk ktundo kturbk ktrrbkblk ktrvfxs qerixFetch qertbFetchByRowID

ktrcrf (rdbms/kernel/knl/ktr.c kcbchg1 ==> ktrcrf)

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn & www.askmac.cn

SQL> create table maclean (t1 int);

Table created.

SQL> insert into maclean values(1);

1 row created.

SQL> commit;

Commit complete.

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
    1213588

SQL> delete maclean;

1 row deleted.

SQL> commit;

Commit complete.

SQL> alter system flush buffer_cache;

System altered.

SQL> alter system flush buffer_cache;

System altered.

SQL>  alter session set events '10201 trace name context forever,level 10';

Session altered.

SQL> select * from maclean as of scn 1213588;

        T1
----------
         1

trace content=====================================

Applying CR undo to block 0 : 408a81 itl entry 02:
          xid:  0x0009.00b.0000017d uba: 0x00c00212.0092.2d
          flg: --U-    lkc:  1     fsc: 0x0007.0012849c
CRS upd rd env [0x2ac2ec7b5660]: (scn: 0x0000.00128494  xid: 0x0000.000.00000000
uba: 0x00000000.0000.00  statement num=0  parent xid: 0x0000.000.00000000  st-scn: 0x
0000.00000000  hi-scn: 0x0000.00000000  ma-scn: 0x0000.00000000  flg: 0x00000800)
undo env [0x7fffe8b74d10]: (
scn: 0x0000.0012849b  xid: 0x0009.00b.0000017d  uba: 0x00c00212.0092.2d  statement num=151548811
parent xid: 0x0000.000.00000000  st-scn: 0x0000.00000000  hi-scn: 0x9
f58.00000000  ma-scn: 0x2ac2.ec7b9c88  flg: 0x00002ac2)
CRS upd (before): 0x69bdfe68  cr-scn: 0x0000.00128494  xid: 0x0000.000.00000000
uba: 0x00000000.0000.00  cl-scn: 0x0000.001287c1  sfl: 0
CRS upd (after) : 0x69bdfe68  cr-scn: 0x0000.00128494  xid: 0x0009.00b.0000017d
uba: 0x00c00212.0092.2d  cl-scn: 0x0000.001287c1  sfl: 0

以上trace中各代码的含义如下：

Applying CR undo to block 0 : 408a81 itl entry 02:
这里的0是 tablespace number, 408a81 是 DBA
而 itl entry 02 是被回滚的事务槽记录

CRS upd rd env [0x2ac2ec7b5660]: (scn: 0x0000.00128494 …..undo env [0x7fffe8b74d10]: (
以上为当前读取的环境信息，包括env_scn等

CRS upd (before): 0x69bdfe68 cr-scn: 0x0000.00128494
xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 cl-scn: 0x0000.001287c1 sfl: 0
CRS upd (before)为回滚完成前的Buffer descriptor

CRS upd (after) : 0x69bdfe68 cr-scn: 0x0000.00128494 CR-SCN为1213588 如查询语句所要求的
xid: 0x0009.00b.0000017d uba: 0x00c00212.0092.2d cl-scn: 0x0000.001287c1 sfl: 0

CRS upd (after) 为回滚完成后的Buffer descriptor

Filed Under: Oracle, Oracle Internal Research内部原理研究 Tagged With: consistent read, CR, Oracle Internal Event, undo

Automatic Tuning of Undo Retention

2011/08/08 by admin Leave a Comment

Oracle 10gR2 and higher automatically tunes the undo retention period based on how the undo tablespace is configured.

If the undo tablespace is configured with the AUTOEXTEND option, the database dynamically tunes the undo retention period to be somewhat longer than the longest-running active query on the system. However, this retention period may be insufficient to accommodate Oracle Flashback operations. Oracle Flashback operations resulting in “snapshot too old” errors are the indicator that you must intervene to ensure that sufficient undo data is retained to support these operations. To better accommodate Oracle Flashback features, you can either set the UNDO_RETENTION parameter to a value equal to the longest expected Oracle Flashback operation, or you can change the undo tablespace to fixed size.
If the undo tablespace is fixed size, the database dynamically tunes the undo retention period for the best possible retention for that tablespace size and the current system load. This best possible retention time is typically significantly greater than the duration of the longest-running active query.
To guarantee the success of long-running queries or Oracle Flashback operations, you can enable retention guarantee. If retention guarantee is enabled, the specified minimum undo retention is guaranteed; the database never overwrites unexpired undo data even if it means that transactions fail due to lack of space in the undo tablespace. If retention guarantee is not enabled, the database can overwrite unexpired undo when space is low, thus lowering the undo retention for the system. This option is disabled by default.

Filed Under: Oracle, Oracle Internal Research内部原理研究 Tagged With: undo

Know more about commit

2011/07/13 by admin Leave a Comment

COMMIT操作是RDBMS中事务结束的标志，在Oracle中与commit紧密相关的是SCN(System Change Number)。

引入SCN的最根本目的在于：

为读一致性所用
为redolog中的记录排序，以及恢复

SCN由SCN Base和Scn Wrap组成，是一种6个字节的结构(structure)。其中SCN Base占用4个字节，而SCN wrap占用2个字节。但在实际存储时SCN-like的stucture常会占用8个字节。

 ub4 kscnbas
 ub2 kscnwrp

struct kcvfhcrs, 8 bytes                 @100                              Creation Checkpointed at scn
      ub4 kscnbas                        @100      0x000a8849
      ub2 kscnwrp                        @104      0x0000

在Oracle中一个事务的开始包含以下操作:

绑定一个可用的rollback segment
在事务表(transaction table)上分配一个必要的槽位
从rollback segment中分配undo block

注意system rollback segment是一种特殊的回滚段，在10g以后普通回滚段的类型都变成了”TYPE2 UNDO”，而唯有system rollback segment的类型仍为”ROLLBACK”，这是由其特殊性造就的:

SQL> col segment_name for a20
SQL> col rollback for a20
SQL> select segment_name,segment_type from dba_segments where segment_type='ROLLBACK';

SEGMENT_NAME         SEGMENT_TYPE
-------------------- ------------------
SYSTEM               ROLLBACK

System rollback segment面向的是SYSTEM表空间上数据字典对象相关事务的数据，以及由对用户数据产生的递归SQL调用所产生的数据。

Oracle不使用基于内存锁管理器的行锁，Oracle中的row lock是基于数据块的。数据块中的Interested Transaction List(ITL)是行锁的重要标志。
ITL的分配遵循以下的原则:

找出未被使用的ITL
找出最老的已经事务提交的ITL
做部分的块清理，直到有可用的ITL
扩展ITL区域，一条ITL占用24字节

当事务提交COMMIT时，需要完成以下步骤的操作:

得到一个SCN值
使用得到的SCN更新事务表中的槽位
在redo log buffer中创建一条commit记录
将redo log buffer刷新到磁盘上的在线日志文件
释放表和行上的锁(may cause delayed block cleanout)

Filed Under: Oracle, Oracle Internal Research内部原理研究 Tagged With: commit, ITL, Oracle Lock锁机制, redo, rollback, transaction, undo

Oracle内部错误ORA-00600:[2667]一例

2011/04/06 by admin Leave a Comment

一套Power AIX上的9.2.0.1系统在数据库打开过程中遇到ORA-00600:[2667]内部错误，详细日志如下:

如果自己搞不定可以找诗檀软件专业ORACLE数据库修复团队成员帮您恢复!

诗檀软件专业数据库修复团队

服务热线： 13764045638 QQ号:47079569 邮箱：service@parnassusdata.com

Wed Mar 9 19:03:38 2011
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 117699122479
Resetting resetlogs activation ID 2197911857 (0x83017931)
Wed Mar 9 19:03:47 2011
LGWR: Primary database is in CLUSTER CONSISTENT mode
Assigning activation ID 2284878888 (0x88307c28)
Thread 1 opened at log sequence 1
Current log# 3 seq# 1 mem# 0: /s01/maclean/oradata/PROD/redo03.log
Successful open of redo thread 1.
Wed Mar 9 19:03:47 2011
SMON: enabling cache recovery
Wed Mar 9 19:03:47 2011
Errors in file /s01/maclean/admin/PROD/udump/PROD_ora_585914.trc:
ORA-07445: exception encountered: core dump [] [] [] [] [] []
Wed Mar 9 19:03:48 2011
Errors in file /s01/maclean/admin/PROD/bdump/PROD_pmon_602286.trc:
ORA-07445: exception encountered: core dump [] [] [] [] [] []
Wed Mar 9 19:03:50 2011
Errors in file /s01/maclean/admin/PROD/bdump/PROD_lgwr_548884.trc:
ORA-00600: internal error code, arguments: [2667], [1], [1], [3], [1739273391], [1739273391], [1739273391], [7267]
LGWR: terminating instance due to error 600

相关的初始化参数
fast_start_mttr_target = 300
_allow_resetlogs_corruption= TRUE
undo_management = AUTO
undo_tablespace = UNDOTBS1

以上可以看到lgwr关键进程在数据库open后几秒后遭遇了ORA-00600:[2667]内部错误后终止了实例。
该数据库在之前因为丢失当前日志文件进行了已经实施了一系列的非常规恢复操作，包括设置一系列的underscore参数:

Before I provide the steps to fix the ora-00600 error, I want to tell you that this database is opened with the unsupported parameter
"allow_resetlogs_corruption".

*************************************************************************
* By forcing open the database using this parameter, there is a strong *
* likelihood of logical corruption, possibly affecting the data *
* dictionary. Oracle does not guarantee that all of the data will be *
* accessible nor will it support a database that has been opened by *
* this method and that the database users will be allowed to continue *
* work. All this does is provide a way to get at the contents of the *
* database for extraction, usually by export. It is up to you to *
* determine the amount of lost data and to correct any logical *
* corruption issues. *
* *
*************************************************************************

2) The steps to get rid of the ora-00600 are as follows:

+ Change UNDO_MANAGEMENT=AUTO to

UNDO_MANAGEMENT=MANUAL

+ Remove or comment out UNDO_TABLESPACE and UNDO_RETENTION.

+ Add
_CORRUPTED_ROLLBACK_SEGMENTS =(comma separated list of Automatic Undo segments)

Example:

_CORRUPTED_ROLLBACK_SEGMENTS = (_SYSSMU1$, _SYSSMU2$, _SYSSMU3$, _SYSSMU4$,
_SYSSMU5$, _SYSSMU6$, _SYSSMU7$, _SYSSMU8$, _SYSSMU9$, _SYSSMU10$)

Note, sometimes the alert log will tell you what Automatic Undo segments are in use. 
Search the alert log for SYSS. If the alert log does not contain that information then use _SYSSMU1$ 
through _SYSSMU10$ as shown in the example above.

In UNIX you can issue this command to get the undo segment names:

$ strings system01.dbf | grep _SYSSMU | cut -d $ -f 1 | sort -u

From the output of the strings command above, add a $ to end of each _SYSSMU undo segment name.

++ Startup mount the database as follows:

SQL > startup mount
SQL > recover database;
SQl > alter database open;

*._corrupted_rollback_segments= (_SYSSMU730$, _SYSSMU731$, _SYSSMU732$, _SYSSMU733$, _SYSSMU734$, 
_SYSSMU735$, _SYSSMU736$, _SYSSMU737$, _SYSSMU738$, _SYSSMU739$, _SYSSMU744$, _SYSSMU740$, _SYSSMU741$, 
_SYSSMU742$, _SYSSMU743$, _SYSSMU744$, _SYSSMU745$, _SYSSMU746$, _SYSSMU747$, _SYSSMU748$, _SYSSMU749$, _SYSSMU74t$, 
_SYSSMU75$, _SYSSMU750$, _SYSSMU751$, _SYSSMU752$, _SYSSMU753$, _SYSSMU754$, _SYSSMU755$, _SYSSMU756$, _SYSSMU757$, 
_SYSSMU758$, _SYSSMU759$, _SYSSMU76$, _SYSSMU760$, _SYSSMU761$, _SYSSMU762$, _SYSSMU763$, _SYSSMU764$, _SYSSMU765$, 
_SYSSMU766$, _SYSSMU767$, _SYSSMU768$)

但在完成以上设置后仍不能避免ora-00600[2667]的发生，下位errpt日志中磁盘阵列损坏信息:

errpt -a

---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR6
IDENTIFIER: B9735AF4

Date/Time: Thu Mar 10 19:01:51 THAIST 2011
Sequence Number: 106800
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: PERM
Resource Name: hdisk5
Resource Class: disk
Resource Type: array
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21-L1000000000000

Description
SUBSYSTEM COMPONENT FAILURE

Probable Causes
ARRAY DASD MEDIA
POWER OR FAN COMPONENT

Failure Causes
ARRAY DASD MEDIA
POWER OR FAN COMPONENT

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0600 0308 0000 FF00 0000 0004 0000 0000 0000 0000 0000 0000 0000 0000 7000 0600
0000 0098 0000 0000 3FC6 0600 0000 0000 0000 0000 0000 D544 0000 0000 0000 0000
0008 5000 0000 0000 0000 0000 0000 0000 0000 5347 3830 3730 3033 3339 2020 2020
2020 0660 2200 0001 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0005 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5B8C 32C1 3033 3130 3131 2F30 3630 3533 3000 0000 0000 0000 0000 0000
0000 0000 526D 6000 F205 3703 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR6
IDENTIFIER: B9735AF4

Date/Time: Thu Mar 10 19:01:51 THAIST 2011
Sequence Number: 106799
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: PERM
Resource Name: hdisk5
Resource Class: disk
Resource Type: array
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21-L1000000000000

Description
SUBSYSTEM COMPONENT FAILURE

Probable Causes
ARRAY DASD MEDIA
POWER OR FAN COMPONENT

Failure Causes
ARRAY DASD MEDIA
POWER OR FAN COMPONENT

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0600 0308 0000 FF00 0000 0004 0000 0000 0000 0000 0000 0000 0000 0000 7000 0600
0000 0098 0000 0000 3FC6 0600 0000 0000 0000 0000 0000 D524 0000 0000 0000 0000
0008 5000 0000 0000 0000 0000 0000 0000 0000 5347 3830 3730 3033 3339 2020 2020
2020 0660 2200 0001 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0005 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5B8C 32C0 3033 3130 3131 2F30 3630 3533 3000 0000 0000 0000 0000 0000
0000 0000 526D 6000 F205 3703 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR10
IDENTIFIER: C86ACB7E

Date/Time: Thu Mar 10 19:01:51 THAIST 2011
Sequence Number: 106798
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: INFO
Resource Name: dac0
Resource Class: array
Resource Type: ibm-dac-V4
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21
VPD:
Manufacturer................IBM
Machine Type and Model......1814 FAStT
Part Number.................24288-00
ROS Level and ID............0916

Description
ARRAY CONFIGURATION CHANGED

Probable Causes
ARRAY CONTROLLER
CABLES AND CONNECTIONS

Failure Causes
ARRAY CONTROLLER
CABLES AND CONNECTIONS

Recommended Actions
NO ACTION NECESSARY

Detail Data
SENSE DATA
0600 0308 0000 FF00 0000 0004 0000 0000 0000 0000 0000 0000 0000 0000 7000 0600
0000 0098 0000 0000 9502 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0008 1800 0000 0000 0000 0000 0000 0000 0000 5347 3830 3730 3033 3339 2020 2020
2020 0660 2200 0001 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0005 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5B8C 32BF 3033 3130 3131 2F30 3630 3533 3000 0000 0000 0000 0000 0000
0000 0000 526D 6000 F205 3703 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR6
IDENTIFIER: B9735AF4

Date/Time: Thu Mar 10 19:01:47 THAIST 2011
Sequence Number: 106797
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: PERM
Resource Name: hdisk4
Resource Class: disk
Resource Type: array
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21-L0

Description
SUBSYSTEM COMPONENT FAILURE

Probable Causes
ARRAY DASD MEDIA
POWER OR FAN COMPONENT

Failure Causes
ARRAY DASD MEDIA
POWER OR FAN COMPONENT

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0600 0308 0000 FF00 0000 0004 0000 0000 0000 0000 0000 0000 0000 0000 7000 0600
0000 0098 0000 0000 3FC6 0600 0000 0000 0000 0000 0000 D544 0000 0000 0000 0000
0008 5000 0000 0000 0000 0000 0000 0000 0000 5347 3830 3730 3033 3339 2020 2020
2020 0660 2200 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0005 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5B8C 3215 3033 3130 3131 2F30 3630 3532 3600 0000 0000 0000 0000 0000
0000 0000 526D 6000 F205 3703 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR6
IDENTIFIER: B9735AF4

Date/Time: Thu Mar 10 19:01:47 THAIST 2011
Sequence Number: 106796
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: PERM
Resource Name: hdisk4
Resource Class: disk
Resource Type: array
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21-L0

Description
SUBSYSTEM COMPONENT FAILURE

Probable Causes
ARRAY DASD MEDIA
POWER OR FAN COMPONENT

Failure Causes
ARRAY DASD MEDIA
POWER OR FAN COMPONENT

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0600 0308 0000 FF00 0000 0004 0000 0000 0000 0000 0000 0000 0000 0000 7000 0600
0000 0098 0000 0000 3FC6 0600 0000 0000 0000 0000 0000 D524 0000 0000 0000 0000
0008 5000 0000 0000 0000 0000 0000 0000 0000 5347 3830 3730 3033 3339 2020 2020
2020 0660 2200 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0005 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5B8C 3214 3033 3130 3131 2F30 3630 3532 3600 0000 0000 0000 0000 0000
0000 0000 526D 6000 F205 3703 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR10
IDENTIFIER: C86ACB7E

Date/Time: Thu Mar 10 19:01:47 THAIST 2011
Sequence Number: 106795
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: INFO
Resource Name: dac0
Resource Class: array
Resource Type: ibm-dac-V4
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21
VPD:
Manufacturer................IBM
Machine Type and Model......1814 FAStT
Part Number.................24288-00
ROS Level and ID............0916

Description
ARRAY CONFIGURATION CHANGED

Probable Causes
ARRAY CONTROLLER
CABLES AND CONNECTIONS

Failure Causes
ARRAY CONTROLLER
CABLES AND CONNECTIONS

Recommended Actions
NO ACTION NECESSARY

Detail Data
SENSE DATA
0600 0308 0000 FF00 0000 0004 0000 0000 0000 0000 0000 0000 0000 0000 7000 0600
0000 0098 0000 0000 9502 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0008 1800 0000 0000 0000 0000 0000 0000 0000 5347 3830 3730 3033 3339 2020 2020
2020 0660 2200 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0005 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5B8C 3213 3033 3130 3131 2F30 3630 3532 3600 0000 0000 0000 0000 0000
0000 0000 526D 6000 F205 3703 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR4
IDENTIFIER: D5385D18

Date/Time: Thu Mar 10 18:49:23 THAIST 2011
Sequence Number: 106794
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: hdisk5
Resource Class: disk
Resource Type: array
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21-L1000000000000

Description
ARRAY OPERATION ERROR

Probable Causes
ARRAY DASD MEDIA
ARRAY DASD DEVICE

Failure Causes
DASD MEDIA
DISK DRIVE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0A00 2800 4644 2F00 0003 0004 0000 0000 0000 0000 0007 7497 0200 0B00 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5262 E000 F205 3703 0000 0200 0000 0000 0000 0000 0000 0002 0000 0010
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR4
IDENTIFIER: D5385D18

Date/Time: Thu Mar 10 18:49:23 THAIST 2011
Sequence Number: 106793
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: hdisk5
Resource Class: disk
Resource Type: array
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21-L1000000000000

Description
ARRAY OPERATION ERROR

Probable Causes
ARRAY DASD MEDIA
ARRAY DASD DEVICE

Failure Causes
DASD MEDIA
DISK DRIVE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0A00 2A00 132E C948 0000 0804 0000 0000 0000 0000 0007 7497 0200 0B00 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5262 E000 F205 3703 0000 0200 0000 0000 0000 0000 0000 0002 0000 0010
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR4
IDENTIFIER: D5385D18

Date/Time: Thu Mar 10 18:49:23 THAIST 2011
Sequence Number: 106792
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: hdisk5
Resource Class: disk
Resource Type: array
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21-L1000000000000

Description
ARRAY OPERATION ERROR

Probable Causes
ARRAY DASD MEDIA
ARRAY DASD DEVICE

Failure Causes
DASD MEDIA
DISK DRIVE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0A00 2A00 1335 55F0 0000 0804 0000 0000 0000 0000 0007 7497 0200 0B00 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5262 E000 F205 3703 0000 0200 0000 0000 0000 0000 0000 0002 0000 0010
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR4
IDENTIFIER: D5385D18

Date/Time: Thu Mar 10 18:49:23 THAIST 2011
Sequence Number: 106791
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: hdisk5
Resource Class: disk
Resource Type: array
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21-L1000000000000

Description
ARRAY OPERATION ERROR

Probable Causes
ARRAY DASD MEDIA
ARRAY DASD DEVICE

Failure Causes
DASD MEDIA
DISK DRIVE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0A00 2A00 136C 0F98 0000 0804 0000 0000 0000 0000 0007 7497 0200 0B00 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5262 E000 F205 3703 0000 0200 0000 0000 0000 0000 0000 0002 0000 0010
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR4
IDENTIFIER: D5385D18

Date/Time: Thu Mar 10 18:49:23 THAIST 2011
Sequence Number: 106790
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: hdisk5
Resource Class: disk
Resource Type: array
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21-L1000000000000

Description
ARRAY OPERATION ERROR

Probable Causes
ARRAY DASD MEDIA
ARRAY DASD DEVICE

Failure Causes
DASD MEDIA
DISK DRIVE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0A00 2A00 13D7 29B0 0000 0804 0000 0000 0000 0000 0007 7497 0200 0B00 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5262 E000 F205 3703 0000 0200 0000 0000 0000 0000 0000 0002 0000 0010
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR4
IDENTIFIER: D5385D18

Date/Time: Thu Mar 10 18:49:23 THAIST 2011
Sequence Number: 106789
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: hdisk5
Resource Class: disk
Resource Type: array
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21-L1000000000000

Description
ARRAY OPERATION ERROR

Probable Causes
ARRAY DASD MEDIA
ARRAY DASD DEVICE

Failure Causes
DASD MEDIA
DISK DRIVE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0A00 2A00 1337 FD88 0000 0804 0000 0000 0000 0000 0007 7497 0200 0B00 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5262 E000 F205 3703 0000 0200 0000 0000 0000 0000 0000 0002 0000 0010
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR4
IDENTIFIER: D5385D18

Date/Time: Thu Mar 10 18:49:23 THAIST 2011
Sequence Number: 106788
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: hdisk5
Resource Class: disk
Resource Type: array
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21-L1000000000000

Description
ARRAY OPERATION ERROR

Probable Causes
ARRAY DASD MEDIA
ARRAY DASD DEVICE

Failure Causes
DASD MEDIA
DISK DRIVE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0A00 2A00 136C 1F08 0000 0804 0000 0000 0000 0000 0007 7497 0200 0B00 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5262 E000 F205 3703 0000 0200 0000 0000 0000 0000 0000 0002 0000 0010
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR4
IDENTIFIER: D5385D18

Date/Time: Thu Mar 10 18:49:23 THAIST 2011
Sequence Number: 106787
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: hdisk5
Resource Class: disk
Resource Type: array
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21-L1000000000000

Description
ARRAY OPERATION ERROR

Probable Causes
ARRAY DASD MEDIA
ARRAY DASD DEVICE

Failure Causes
DASD MEDIA
DISK DRIVE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0A00 2A00 2578 E838 0000 0804 0000 0000 0000 0000 0007 7497 0200 0B00 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5262 E000 F205 3703 0000 0200 0000 0000 0000 0000 0000 0002 0000 0010
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR4
IDENTIFIER: D5385D18

Date/Time: Thu Mar 10 18:49:23 THAIST 2011
Sequence Number: 106786
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: hdisk5
Resource Class: disk
Resource Type: array
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21-L1000000000000

Description
ARRAY OPERATION ERROR

Probable Causes
ARRAY DASD MEDIA
ARRAY DASD DEVICE

Failure Causes
DASD MEDIA
DISK DRIVE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0A00 2800 4644 3300 0001 0004 0000 0000 0000 0000 0007 7497 0200 0400 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5262 E000 F205 3703 0000 0200 0000 0000 0000 0000 0000 0002 0000 0010
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR4
IDENTIFIER: D5385D18

Date/Time: Thu Mar 10 18:49:23 THAIST 2011
Sequence Number: 106785
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: hdisk5
Resource Class: disk
Resource Type: array
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21-L1000000000000

Description
ARRAY OPERATION ERROR

Probable Causes
ARRAY DASD MEDIA
ARRAY DASD DEVICE

Failure Causes
DASD MEDIA
DISK DRIVE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0A00 2800 4643 BA00 0004 0004 0000 0000 0000 0000 0007 7497 0200 0B00 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 5262 E000 F205 3703 0000 0200 0000 0000 0000 0000 0000 0002 0000 0010
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR8
IDENTIFIER: 483C9D10

Date/Time: Thu Mar 10 18:49:23 THAIST 2011
Sequence Number: 106784
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: INFO
Resource Name: dac0
Resource Class: array
Resource Type: ibm-dac-V4
Location: U787B.001.DNWGK9Y-P1-C4-T1-W200500A0B8484C21
VPD:
Manufacturer................IBM
Machine Type and Model......1814 FAStT
Part Number.................24288-00
ROS Level and ID............0916

Description
ARRAY ACTIVE CONTROLLER SWITCH

Probable Causes
ARRAY CONTROLLER
CABLES AND CONNECTIONS

Failure Causes
ARRAY CONTROLLER
CABLES AND CONNECTIONS

Recommended Actions
NO ACTION NECESSARY

Detail Data
SENSE DATA
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0400 00EE 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 526D 6000 F205 3703 0000 0000 0000 0000 0000 0000 0000 0002 0000 0001
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR3
IDENTIFIER: D9770360

Date/Time: Thu Mar 10 18:48:47 THAIST 2011
Sequence Number: 106783
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: PERM
Resource Name: dac0utm
Resource Class: NONE
Resource Type: NONE
Location:

Description
ARRAY OPERATION ERROR

Probable Causes
ARRAY DASD DEVICE
STORAGE DEVICE CABLE
EB4F

Failure Causes
DISK DRIVE
DISK DRIVE ELECTRONICS
STORAGE DEVICE CABLE
ARRAY CONTROLLER

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0A00 2800 0000 0200 0000 0104 0000 0000 0000 0000 0000 0013 0200 0B00 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 1103 8000 F205 3703 0000 0200 0000 0000 0000 0000 0040 0002 0000 0000
0000 0000
---------------------------------------------------------------------------
LABEL: FSCSI_ERR4
IDENTIFIER: 3074FEB7

Date/Time: Thu Mar 10 18:48:47 THAIST 2011
Sequence Number: 106782
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: fscsi0
Resource Class: driver
Resource Type: efscsi
Location: U787B.001.DNWGK9Y-P1-C4-T1

Description
ADAPTER ERROR

Probable Causes
ADAPTER HARDWARE OR CABLE
ADAPTER MICROCODE
FIBRE CHANNEL SWITCH OR FC-AL HUB

Failure Causes
ADAPTER
CABLES AND CONNECTIONS
DEVICE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
CHECK CABLES AND THEIR CONNECTIONS
VERIFY DEVICE CONFIGURATION

Detail Data
SENSE DATA
0000 0000 0000 00B1 0000 0045 0200 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001 0500 0000 0000
0001 0400 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 422F 0000 0812 0002 0000 0100 0000 0000 0001 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0100 2005 00A0
B848 4C21 2004 00A0 B848 4C20 0200 0000 0000 0000 0000 0000 0000 0000 0000 0000
1105 C000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR9
IDENTIFIER: 8B79A4BD

Date/Time: Thu Mar 10 18:48:45 THAIST 2011
Sequence Number: 106781
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: PERM
Resource Name: dac1
Resource Class: array
Resource Type: ibm-dac-V4
Location: U787B.001.DNWGK9Y-P1-C1-T1-W200400A0B8484C21
VPD:
Manufacturer................IBM
Machine Type and Model......1814 FAStT
Part Number.................24288-00
ROS Level and ID............0916

Description
ARRAY CONTROLLER SWITCH FAILURE

Probable Causes
ARRAY CONTROLLER
CABLES AND CONNECTIONS

Failure Causes
ARRAY CONTROLLER
CABLES AND CONNECTIONS

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0A00 5A00 2C01 0000 0002 0004 0000 0000 0000 0000 0000 0000 0200 0B00 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 526D 6000 F205 3703 0000 0000 0000 0010 0000 0000 0040 0002 0000 0000
0000 0000
---------------------------------------------------------------------------
LABEL: FSCSI_ERR4
IDENTIFIER: 3074FEB7

Date/Time: Thu Mar 10 18:48:45 THAIST 2011
Sequence Number: 106780
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: fscsi1
Resource Class: driver
Resource Type: efscsi
Location: U787B.001.DNWGK9Y-P1-C1-T1

Description
ADAPTER ERROR

Probable Causes
ADAPTER HARDWARE OR CABLE
ADAPTER MICROCODE
FIBRE CHANNEL SWITCH OR FC-AL HUB

Failure Causes
ADAPTER
CABLES AND CONNECTIONS
DEVICE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
CHECK CABLES AND THEIR CONNECTIONS
VERIFY DEVICE CONFIGURATION

Detail Data
SENSE DATA
0000 0000 0000 00B1 0000 0045 0200 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001 0100 0000 0000
0001 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 422F 0000 1212 0002 0000 0100 0000 0000 0001 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0100 2004 00A0
B848 4C21 2004 00A0 B848 4C20 0200 0000 0000 0000 0000 0000 0000 0000 0000 0000
0FFF A000
---------------------------------------------------------------------------
LABEL: FSCSI_ERR4
IDENTIFIER: 3074FEB7

Date/Time: Thu Mar 10 18:48:43 THAIST 2011
Sequence Number: 106779
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: fscsi1
Resource Class: driver
Resource Type: efscsi
Location: U787B.001.DNWGK9Y-P1-C1-T1

Description
ADAPTER ERROR

Probable Causes
ADAPTER HARDWARE OR CABLE
ADAPTER MICROCODE
FIBRE CHANNEL SWITCH OR FC-AL HUB

Failure Causes
ADAPTER
CABLES AND CONNECTIONS
DEVICE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
CHECK CABLES AND THEIR CONNECTIONS
VERIFY DEVICE CONFIGURATION

Detail Data
SENSE DATA
0000 0000 0000 00B1 0000 0045 0200 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001 0100 0000 0000
0001 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 422F 0000 1112 0002 0000 0100 0000 0000 0001 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0100 2004 00A0
B848 4C21 2004 00A0 B848 4C20 0200 0000 0000 0000 0000 0000 0000 0000 0000 0000
0FFF A000
---------------------------------------------------------------------------
LABEL: FSCSI_ERR4
IDENTIFIER: 3074FEB7

Date/Time: Thu Mar 10 18:48:42 THAIST 2011
Sequence Number: 106778
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: fscsi0
Resource Class: driver
Resource Type: efscsi
Location: U787B.001.DNWGK9Y-P1-C4-T1

Description
ADAPTER ERROR

Probable Causes
ADAPTER HARDWARE OR CABLE
ADAPTER MICROCODE
FIBRE CHANNEL SWITCH OR FC-AL HUB

Failure Causes
ADAPTER
CABLES AND CONNECTIONS
DEVICE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
CHECK CABLES AND THEIR CONNECTIONS
VERIFY DEVICE CONFIGURATION

Detail Data
SENSE DATA
0000 0000 0000 00B1 0000 0045 0200 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001 0500 0000 0000
0001 0400 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 422F 0000 0712 0002 0000 0100 0000 0000 0001 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0100 2005 00A0
B848 4C21 2004 00A0 B848 4C20 0200 0000 0000 0000 0000 0000 0000 0000 0000 0000
1105 C000
---------------------------------------------------------------------------
LABEL: FSCSI_ERR4
IDENTIFIER: 3074FEB7

Date/Time: Thu Mar 10 18:48:41 THAIST 2011
Sequence Number: 106777
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: fscsi1
Resource Class: driver
Resource Type: efscsi
Location: U787B.001.DNWGK9Y-P1-C1-T1

Description
ADAPTER ERROR

Probable Causes
ADAPTER HARDWARE OR CABLE
ADAPTER MICROCODE
FIBRE CHANNEL SWITCH OR FC-AL HUB

Failure Causes
ADAPTER
CABLES AND CONNECTIONS
DEVICE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
CHECK CABLES AND THEIR CONNECTIONS
VERIFY DEVICE CONFIGURATION

Detail Data
SENSE DATA
0000 0000 0000 00B1 0000 0045 0200 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001 0100 0000 0000
0001 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 422F 0000 1012 0002 0000 0100 0000 0000 0001 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0100 2004 00A0
B848 4C21 2004 00A0 B848 4C20 0200 0000 0000 0000 0000 0000 0000 0000 0000 0000
0FFF A000
---------------------------------------------------------------------------
LABEL: FSCSI_ERR4
IDENTIFIER: 3074FEB7

Date/Time: Thu Mar 10 18:48:39 THAIST 2011
Sequence Number: 106776
Machine Id: 000786F9D600
Node Id: p550su
Class: H
Type: TEMP
Resource Name: fscsi1
Resource Class: driver
Resource Type: efscsi
Location: U787B.001.DNWGK9Y-P1-C1-T1

Description
ADAPTER ERROR

Probable Causes
ADAPTER HARDWARE OR CABLE
ADAPTER MICROCODE
FIBRE CHANNEL SWITCH OR FC-AL HUB

Failure Causes
ADAPTER
CABLES AND CONNECTIONS
DEVICE

Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
CHECK CABLES AND THEIR CONNECTIONS
VERIFY DEVICE CONFIGURATION

Detail Data
SENSE DATA
0000 0000 0000 00B1 0000 0045 0200 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001 0100 0000 0000
0001 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 422F 0000 0F12 0002 0000 0100 0000 0000 0001 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0100 2004 00A0
B848 4C21 2004 00A0 B848 4C20 0200 0000 0000 0000 0000 0000 0000 0000 0000 0000
0FFF A000
---------------------------------------------------------------------------
LABEL: FSCSI_ERR4
IDENTIFIER: 3074FEB7

因为lgwr的crash是发生在database open后，所以实际上我们是可以在这段时间内操作数据库的，这个case最终通过新建undo tablespace代替老的问题回滚表空间解决了。

We have been repair the disk controller, but ora-600[2667] still occur when we try to open database.
After recreating the undo tablespace we were able to open the database.

Filed Under: Oracle Tagged With: _CORRUPTED_ROLLBACK_SEGMENTS, ora-00600, recover, undo

Oracle内部错误:ORA-00600:[4097]一例

2011/01/04 by admin Leave a Comment

一套Linux上的10.2.0.4系统在异常恢复后(使用_allow_resetlogs_corruption隐藏参数打开后遭遇ORA-00600:[40xx]相关的内部错误，创建并切换到了新的撤销表空间上)出现ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []内部错误，当该非内部错误(non-fatal)出现100次以上时会在告警日志alert.log中出现记录。
并有可能导致实例crash,具体日志如下:

如果自己搞不定可以找诗檀软件专业ORACLE数据库修复团队成员帮您恢复!

诗檀软件专业数据库修复团队

服务热线： 13764045638 QQ号:47079569 邮箱：service@parnassusdata.com

Errors in file /s01/10gdb/admin/clinica/bdump/clinica_smon_21463.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Tue Jan  4 23:13:19 2011
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 1 out of maximum 100 non-fatal internal errors.

clinica_smon_21463.trc:
Dump of buffer cache at level 4 for tsn=1, rdba=8388633
BH (0x91fdf428) file#: 2 rdba: 0x00800019 (2/25) class: 19 ba: 0x91c62000
  set: 3 blksize: 8192 bsi: 0 set-flg: 0 pwbcnt: 0
  dbwrid: 0 obj: -1 objn: 0 tsn: 1 afn: 2
  hash: [fcf7dd68,fcf7dd68] lru: [91fdf5b8,91fdf398]
  ckptq: [NULL] fileq: [NULL] objq: [f5b53d60,f5b53d60]
  use: [fa694970,fa694970] wait: [NULL]
  st: XCURRENT md: SHR tch: 0
  flags: gotten_in_current_mode
  LRBA: [0x0.0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
  buffer tsn: 1 rdba: 0x00800019 (2/25)
  scn: 0x0000.0352d07c seq: 0x01 flg: 0x00 tail: 0xd07c2601
  frmt: 0x02 chkval: 0x0000 type: 0x26=KTU SMU HEADER BLOCK

/* 这里dump了一个tsn=1,file#=2的数据块，
    可以看到它的类型是KTU SMU HEADER BLOCK即某个回滚段头
*/

Hex dump of block: st=0, typ_found=1
........................
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Current SQL statement for this session:
insert into smon_scn_time (thread, time_mp, time_dp, scn, scn_wrp, scn_bas,  num_mappings, tim_scn_map) 
values (0, :1, :2, :3, :4, :5, :6, :7)
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+31          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FFFF53BC160 ? 7FFFF53BC1C0 ?
                                                   7FFFF53BC100 ? 000000000 ?
ksedmp()+610         call     ksedst()             000000000 ? 000000001 ?
                                                   7FFFF53BC160 ? 7FFFF53BC1C0 ?
                                                   7FFFF53BC100 ? 000000000 ?
ksfdmp()+21          call     ksedmp()             000000003 ? 000000001 ?
                                                   7FFFF53BC160 ? 7FFFF53BC1C0 ?
                                                   7FFFF53BC100 ? 000000000 ?
kgeriv()+176         call     ksfdmp()             000000003 ? 000000001 ?
                                                   7FFFF53BC160 ? 7FFFF53BC1C0 ?
                                                   7FFFF53BC100 ? 000000000 ?
kgesiv()+119         call     kgeriv()             0068C97C0 ? 2ABDF1D42BF0 ?
                                                   000000000 ? 0F4A33EA0 ?
                                                   7FFFF53BC100 ? 000000000 ?
ksesic0()+209        call     kgesiv()             0068C97C0 ? 2ABDF1D42BF0 ?
                                                   000001001 ? 000000000 ?
                                                   7FFFF53BCEE0 ? 000000000 ?
ktugti()+3200        call     ksesic0()            000001001 ? 0068C9940 ?
                                                   000000000 ? 00000009A ?
                                                   000000010 ? 101010101010101 ?
ktsftcmove()+4149    call     ktugti()             0B73F111C ? 7FFFF53BD278 ?
                                                   7FFFF53BD280 ? 000000000 ?
                                                   7FFFF53BD27C ? 7FFFF53BD270 ?
ktsf_gsp()+1937      call     ktsftcmove()         00000000A ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   7FFFF53BD27C ? 7FFFF53BD270 ?
kdtgsp()+512         call     ktsf_gsp()           000000000 ? 7FFFF53BF460 ?
                                                   000000024 ? 000000002 ?
                                                   7FFFF53BF460 ? 000000000 ?
kdccak()+111         call     kdtgsp()             2ABDF1D6A2D8 ? 7FFF00000000 ?
                                                   2ABDF1D68530 ? 000000002 ?
                                                   7FFFF53BF460 ? 000000000 ?
kdcgcs()+5419        call     kdccak()             2ABDF1D6A2D8 ? 000000001 ?
                                                   0F4A3BBA8 ? 000000000 ?
                                                   2ABDF1D6A370 ? 000000000 ?
kdcgsp()+1372        call     kdcgcs()             2ABDF1D6A2D8 ? 000000001 ?
                                                   0F4A3BBA8 ? 000000000 ?
                                                   2ABDF1D6A370 ? 000000000 ?
kdtInsRow()+1808     call     kdcgsp()             2ABDF1D6A2D8 ? 000000001 ?
                                                   0F4A3BBA8 ? 000000000 ?
                                                   2ABDF1D6A370 ? 000000000 ?
insrow()+342         call     kdtInsRow()          2ABDF1D6A2D8 ? 000000001 ?
                                                   0F4A3BBA8 ? 000000000 ?
                                                   2ABDF1D6A370 ? 000000000 ?
insdrv()+594         call     insrow()             2ABDF1D6A2D8 ? 7FFFF53BFCC8 ?
                                                   000000000 ? 0F4A33DE0 ?
                                                   2ABDF1D6A370 ? 000000000 ?
inscovexe()+404      call     insdrv()             2ABDF1D6A2D8 ? 7FFFF53BFCC8 ?
                                                   000000000 ? 2ABDF1D6D908 ?
                                                   2ABDF1D6A370 ? 000000000 ?
insExecStmtExecIniE  call     inscovexe()          0F4A33DE0 ? 0F4A3C230 ?
ngine()+85                                         7FFFF53C0EF0 ? 2ABDF1D69F20 ?
                                                   2ABDF1D6A370 ? 000000000 ?
insexe()+386         call     insExecStmtExecIniE  0F4A33DE0 ? 0F4A3C230 ?
                              ngine()              2ABDF1D69F20 ? 2ABDF1D69F20 ?
                                                   2ABDF1D6A370 ? 000000000 ?
opiexe()+9182        call     insexe()             0F4A333A8 ? 7FFFF53C0EF0 ?
                                                   0F4A33DE0 ? 2ABDF1D69F20 ?
                                                   2ABDF1D6A370 ? 2ABDF1D69F20 ?
opiall0()+1842       call     opiexe()             000000049 ? 000000003 ?
                                                   7FFFF53C12F8 ? 000000001 ?
..............

针对该ORA-00600:[4097]内部错误，metalink上Note [ID 1030620.6]介绍了一种workaround的方法:

An ORA-600 [4097] can be encountered through various activities that use 
rollback segments.

Solution Description: 
===================== 

The most likely cause of this is BUG 427389.  This BUG is fixed in
version 7.3.3.3.  The BUG is caused when Rollback Segments are dropped and 
recreated after a shutdown abort.  It is encountered through a very specific 
set of circumstances: 

When an instance has a rollback segment offline and the instance crashes, or 
the user does a shutdown abort, the rollback segment wrap number does not get 
updated.  If that segment is then dropped and recreated immediately after the 
instance is restarted, the wrap number could be lower than existing wrap 
numbers.  This will cause the ORA-600[4097] to occur in subsequent 
transactions using Rollback. 

To avoid encountering this bug, rollback segments should only be dropped and 
recreated after the instance has been shutdown normal and restarted.  If you 
have already encountered the bug, use the following workaround:  

   Select segment_name, segment_id from dba_rollback_segs; 

   Drop all Rollback Segments except for SYSTEM.  

   Recreate dummy (small) rollback segments with the same names in their place. 

   Then, recreate additional rollback segments you want to keep with their 
   permanent storage parameters.   

   Now drop the dummy ones. This should ensure that the segment_ids are not 
   reused. 

If you ever want to add a rollback segment you have to use the workaround steps
again.  If you do not fill the dummy slots you may see the problem re-appear.

我们可以尝试drop异常恢复前已有的可能存在问题的rollback segment来规避这个问题，虽然在10g下使用AMU(automatic managed undo)但仍可以做到这一点:

SQL> alter system set "_smu_debug_mode"=4;
System altered.

/* 设置SMU debug模式为4以便能够手动管理回滚段 */

SQL> set heading off 

SQL> select 'drop rollback segment "'||segment_name||'";' from dba_rollback_segs where segment_name!='SYSTEM';

drop rollback segment "_SYSSMU1$";
drop rollback segment "_SYSSMU2$";
drop rollback segment "_SYSSMU3$";
drop rollback segment "_SYSSMU4$";
drop rollback segment "_SYSSMU5$";
drop rollback segment "_SYSSMU6$";
drop rollback segment "_SYSSMU7$";
drop rollback segment "_SYSSMU8$";
drop rollback segment "_SYSSMU9$";
drop rollback segment "_SYSSMU10$";
drop rollback segment "_SYSSMU11$";
drop rollback segment "_SYSSMU12$";
drop rollback segment "_SYSSMU13$";
drop rollback segment "_SYSSMU14$";
drop rollback segment "_SYSSMU15$";
drop rollback segment "_SYSSMU16$";
drop rollback segment "_SYSSMU17$";
drop rollback segment "_SYSSMU18$";
drop rollback segment "_SYSSMU19$";
drop rollback segment "_SYSSMU20$";
drop rollback segment "_SYSSMU21$";
drop rollback segment "_SYSSMU22$";
drop rollback segment "_SYSSMU23$";
drop rollback segment "_SYSSMU24$";
drop rollback segment "_SYSSMU25$";
drop rollback segment "_SYSSMU26$";
drop rollback segment "_SYSSMU27$";
drop rollback segment "_SYSSMU28$";
drop rollback segment "_SYSSMU29$";
drop rollback segment "_SYSSMU30$";

30 rows selected.

/* 依次执行以上的drop rollback segment回滚段的命令
    注意当前撤销表空间上的回滚段仅能offline而无法drop掉，
    实际上我们需要做的也仅仅是把之前undo表空间上有问题的回滚段drop掉
*/

SQL> alter rollback segment "_SYSSMU30$" offline;
Rollback segment altered.

SQL> drop rollback segment "_SYSSMU30$";
drop rollback segment "_SYSSMU30$"
*
ERROR at line 1:
ORA-30025: DROP segment '_SYSSMU30$' (in undo tablespace) not allowed

SQL> alter rollback segment "_SYSSMU30$" online;
Rollback segment altered.

经过以上drop问题回滚段rollback segment后，系统不再出现ORA-00600:[4097]内部错误，实例恢复正常。在系统正常后，我们有必要重置之前所设的”_smu_debug_mode”UNDO管理debug模式的隐藏参数。

Filed Under: Oracle, Oracle Bug Tagged With: ora-00600, recovery, undo

Script:收集UNDO诊断信息

2010/06/03 by admin 2 Comments

以下脚本可以用于收集Automatic Undo Management的必要诊断信息，以sysdba身份运行:

spool Undo_Diag.out  

ttitle off
set pages 999
set lines 150
set verify off 

set termout off
set trimout on
set trimspool on

REM   
REM ------------------------------------------------------------------------  
  
REM   
REM  -----------------------------------------------------------------  
REM  
  
set space 2  

REM  REPORTING TABLESPACE INFORMATION: 
REM   
REM  This looks at Tablespace Sizing - Total bytes and free bytes  
REM   
 
column tablespace_name  format a30            heading 'TS Name'  
column sbytes           format 9,999,999,999  heading 'Total MBytes'  
column fbytes           format 9,999,999,999  heading 'Free MBytes'  
column file_name        format a30            heading 'File Name'
column kount            format 999            heading 'Ext'  
 
compute sum of fbytes on tablespace_name  
compute sum of sbytes on tablespace_name  
compute sum of sbytes on report  
compute sum of fbytes on report  
 
break on tablespace_name skip 2  
 
select a.tablespace_name,  a.file_name,  round(a.bytes/1024/1024,0) sbytes,  
       round(sum(b.bytes/1024/1024),0) fbytes,  count(*) kount, autoextensible  
from   dba_data_files a,  dba_free_space b  
where  a.file_id  =  b.file_id  
and a.tablespace_name in (select z.tablespace_name from dba_tablespaces z where retention like '%GUARANTEE')
group  by a.tablespace_name, a.file_name, a.bytes, autoextensible
order  by a.tablespace_name  
/  
 
set linesize 160  
 
 
REM   
REM  If you can significantly reduce physical reads by adding incremental  
REM  data buffers...do it.  To determine whether adding data buffers will  
REM  help, set db_block_lru_statistics = TRUE and  
REM  db_block_lru_extended_statistics = TRUE in the init.ora parameters.  
REM  You can determine how many extra hits you would get from memory as  
REM  opposed to physical I/O from disk.  **NOTE:  Turning these on will  
REM  impact performance.  One shift of statistics gathering should be enough  
REM  to get the required information.  
REM   
  

REM   
REM  -----------------------------------------------------------------  
REM

set lines 160

col tablespace_name format a30 heading "Tablespace"
col tb format a15 heading "TB Status"
col df format a10 heading "DF Status"
col extent_management format a15 heading "Extent|Management"
col allocation_type format a8 heading "Type"
col segment_space_management format a7 heading "Auto|Segment"
col retention format a11 heading "Retention|Level"
col autoextensible format a5 heading "Auto?"
col mx format 999,999,999 heading "Max Allowed"

select t.tablespace_name, t.status tb, d.status df,
extent_management, allocation_type, segment_space_management, retention,
autoextensible, (maxbytes/1024/1024) mx
from dba_tablespaces t, dba_data_files d
where t.tablespace_name = d.tablespace_name
and retention like '%GUARANTEE'
/


col status format a20 head "Status"
col cnt format 999,999,999 head "How Many?"

select status, count(*) cnt
from dba_rollback_segs
group by status
/


  

set termout on
set trimout off
set trimspool off
set lines 120
set pages 999

set termout off
set trimout on
set trimspool on

alter session set nls_date_format='dd-Mon-yyyy hh24:mi';


prompt
prompt  ############## RUNTIME ############## 
prompt

col rdate head "Run Time"

select sysdate rdate from dual;

prompt 
prompt  ############## DATAFILES ############## 
prompt 

col retention head "Retention"
col tablespace_name format a30 head "TBSP Name"
col file_id format 999 head "File #"
col a format 999,999,999,999,999 head "Bytes Alloc (MB)"
col b format 999,999,999,999,999 head "Max Bytes Used (MB)"
col autoextensible head "Auto|Ext"
col extent_management head "Ext Mngmnt"
col allocation_type head "Type"
col segment_space_management head "SSM"

select tablespace_name, file_id, sum(bytes)/1024/1024 a, 
       sum(maxbytes)/1024/1024 b, 
       autoextensible
from dba_data_files
where tablespace_name in (select tablespace_name from dba_tablespaces
   where retention like '%GUARANTEE' )
group by file_id, tablespace_name, autoextensible
order by tablespace_name
/

set termout on
set trimout off
set trimspool off

ttitle off
set pages 999
set lines 150
set verify off 

set termout off
set trimout on
set trimspool on

REM   
REM ------------------------------------------------------------------------  
  
REM   
REM  -----------------------------------------------------------------  
REM  
  
REM
REM  REPORTING UNDO EXTENTS INFORMATION:  
REM   
REM  -----------------------------------------------------------------  
REM 
REM  Undo Extents breakdown information
REM

ttitle center "Rollback Segments Breakdown" skip 2

col status format a20
col cnt format 999,999,999 head "How Many?"

select status, count(*) cnt from dba_rollback_segs
group by status
/

ttitle center "Undo Extents" skip 2

col segment_name format a30 heading "Name"
col "ACT BYTES" format 999,999,999,999 head "Active|Extents"
col "UNEXP BYTES" format 999,999,999,999 head "Unxpired|Extents"
col "EXP BYTES" format 999,999,999,999 head "Expired|Extents"

select segment_name,
 nvl(sum(act),0) "ACT BYTES",
 nvl(sum(unexp),0) "UNEXP BYTES",
 nvl(sum(exp),0) "EXP BYTES"
 from (
  select segment_name,
         nvl(sum(bytes),0) act,00 unexp, 00 exp
    from DBA_UNDO_EXTENTS
   where status='ACTIVE' group by segment_name
  union
  select segment_name,
         00 act, nvl(sum(bytes),0) unexp, 00 exp
    from DBA_UNDO_EXTENTS
   where status='UNEXPIRED' group by segment_name
  union
  select segment_name,
         00 act, 00 unexp, nvl(sum(bytes),0) exp
    from DBA_UNDO_EXTENTS
   where status='EXPIRED' group by segment_name
) group by segment_name;

ttitle center "Undo Extents Statistics" skip 2

col size format 999,999,999,999 heading "Size"
col "HOW MANY" format 999,999,999 heading "How Many?"
col st heading a12 heading "Status"

select distinct status st, count(*) "HOW MANY", sum(bytes) "SIZE"
from dba_undo_extents
group by status
/

col segment_name format a30 heading "Name"
col TABLESPACE_NAME for a20
col BYTES for 999,999,999,999
col BLOCKS for 999,999,999
col status for a15 heading "Status"
col segment_name heading "Segment"
col extent_id heading "ID"


select SEGMENT_NAME, TABLESPACE_NAME, EXTENT_ID, 
      FILE_ID, BLOCK_ID, BYTES, BLOCKS, STATUS
from dba_undo_extents
order by 1,3,4,5
/


REM
REM  -----------------------------------------------------------------  
REM 
REM  Undo Extents Contention breakdown
REM  Take out column TUNED_UNDORETENTION if customer 
REM   prior to 10.2.x
REM
REM   The time frame can be adjusted with this query
REM   By default using around 4 hour window of time
REM
REM   Ex.
REM   Using sysdate-.04 looking at the last hour
REM   Using sysdate-.16 looking at the last 4 hours
REM   Using sysdate-.32 looking at the last 8 hours
REM   Using sysdate-1 looking at the last 24 hours
REM

set linesize 140

ttitle center "Undo Extents Error Conditions (Default - Last 4 Hours)" skip 2


col UNXPSTEALCNT format 999,999,999  heading "# Unexpired|Stolen"
col EXPSTEALCNT format 999,999,999   heading "# Expired|Reused"
col SSOLDERRCNT format 999,999,999   heading "ORA-1555|Error"
col NOSPACEERRCNT format 999,999,999 heading "Out-Of-space|Error"
col MAXQUERYLEN format 999,999,999   heading "Max Query|Length"
col TUNED_UNDORETENTION format 999,999,999  heading "Auto-Ajusted|Undo Retention"
col hours format 999,999 heading "Tuned|(HRs)"

select inst_id, to_char(begin_time,'MM/DD/YYYY HH24:MI') begin_time, 
     UNXPSTEALCNT, EXPSTEALCNT , SSOLDERRCNT, NOSPACEERRCNT, MAXQUERYLEN,
     TUNED_UNDORETENTION, TUNED_UNDORETENTION/60/60 hours
from gv$undostat
where begin_time between (sysdate-.16) 
                     and sysdate
order by inst_id, begin_time
/

  
set termout on
set trimout off
set trimspool off


ttitle off
set pages 999
set lines 150
set verify off 
set termout off
set trimout on
set trimspool on

REM   
REM ------------------------------------------------------------------------  
  
col name format a30  
col gets format 9,999,999  
col waits format 9,999,999  
 
PROMPT  ROLLBACK HIT STATISTICS:  
REM   
  
REM  GETS - # of gets on the rollback segment header 
REM  WAITS - # of waits for the rollback segment header  
  
set head on;  
 
select name, waits, gets  
from   v$rollstat, v$rollname  
where  v$rollstat.usn = v$rollname.usn  
/  
 
col pct head "< 2% ideal"
 
select 'The average of waits/gets is '||  
   round((sum(waits) / sum(gets)) * 100,2)||'%' PCT 
From    v$rollstat  
/  
  

  
PROMPT  REDO CONTENTION STATISTICS:

REM   
REM  If the ratio of waits to gets is more than 1% or 2%, consider  
REM  creating more rollback segments  
REM   
REM  Another way to gauge rollback contention is:  
REM   
  
column xn1 format 9999999  
column xv1 new_value xxv1 noprint  
 

 
select class, count  
from   v$waitstat  
where  class in ('system undo header', 'system undo block', 
                 'undo header',        'undo block'          )  
/  

set head off

select 'Total requests = '||sum(count) xn1, sum(count) xv1  
from    v$waitstat  
/  
 
select 'Contention for system undo header = '||  
       (round(count/(&xxv1+0.00000000001),4)) * 100||'%'  
from  v$waitstat  
where   class = 'system undo header'  
/  
 
select 'Contention for system undo block = '||  
       (round(count/(&xxv1+0.00000000001),4)) * 100||'%'  
from    v$waitstat  
where   class = 'system undo block'  
/  
 
select 'Contention for undo header = '||  
       (round(count/(&xxv1+0.00000000001),4)) * 100||'%'  
from    v$waitstat  
where   class = 'undo header'  
/  
 
select 'Contention for undo block = '||  
       (round(count/(&xxv1+0.00000000001),4)) * 100||'%'  
from    v$waitstat  
where   class = 'undo block'  
/  

REM   
REM  NOTE: Not as useful with AUM configured 
REM 
REM  If the percentage for an area is more than 1% or 2%, consider  
REM  creating more rollback segments.  Note:  This value is usually very  
REM  small 
REM  and has been rounded to 4 places.  
REM   
REM ------------------------------------------------------------------------  
  
REM   
REM  The following shows how often user processes had to wait for space in  
REM  the redo log buffer:  
  
select name||' = '||value  
from   v$sysstat  
where  name = 'redo log space requests'  
/  
 
REM   
REM  This value should be near 0.  If this value increments consistently,  
REM  processes have had to wait for space in the redo buffer.  If this  
REM  condition exists over time, increase the size of LOG_BUFFER in the  
REM  init.ora file in increments of 5% until the value nears 0.  
REM  ** NOTE: increasing the LOG_BUFFER value will increase total SGA size.  
REM   
REM  -----------------------------------------------------------------------  
  
  
col name format a15  
col gets format 9999999  
col misses format 9999999  
col immediate_gets heading 'IMMED GETS' format 9999999  
col immediate_misses heading 'IMMED MISS' format 9999999  
col sleeps format 999999  
 
PROMPT  LATCH CONTENTION:  
REM   
REM  GETS - # of successful willing-to-wait requests for a latch  
REM  MISSES - # of times an initial willing-to-wait request was unsuccessful  
REM  IMMEDIATE_GETS - # of successful immediate requests for each latch  
REM  IMMEDIATE_MISSES = # of unsuccessful immediate requests for each latch  
REM  SLEEPS - # of times a process waited and requests a latch after an  
REM           initial willing-to-wait request  
REM   
REM  If the latch requested with a willing-to-wait request is not  
REM  available, the requesting process waits a short time and requests  
REM  again.  
REM  If the latch requested with an immediate request is not available,  
REM  the requesting process does not wait, but continues processing  
REM   

set head on  
select name,          gets,              misses,  
       immediate_gets,  immediate_misses,  sleeps  
from   v$latch  
where  name in ('redo allocation',  'redo copy')  
/  

set head off 

select 'Ratio of MISSES to GETS: '||  
        round((sum(misses)/(sum(gets)+0.00000000001) * 100),2)||'%'  
from    v$latch  
where   name in ('redo allocation',  'redo copy')  
/  
 

select 'Ratio of IMMEDIATE_MISSES to IMMEDIATE_GETS: '||  
        round((sum(immediate_misses)/  
       (sum(immediate_misses+immediate_gets)+0.00000000001) * 100),2)||'%' 
from    v$latch  
where   name in ('redo allocation',  'redo copy')  
/  
 
set head on
REM   
REM  If either ratio exceeds 1%, performance will be affected.  
REM   
REM  Decreasing the size of LOG_SMALL_ENTRY_MAX_SIZE reduces the number of  
REM  processes copying information on the redo allocation latch.  
REM   
REM  Increasing the size of LOG_SIMULTANEOUS_COPIES will reduce contention  
REM  for redo copy latches.  
  
REM   
REM  -----------------------------------------------------------------  
REM  This looks at overall i/o activity against individual  
REM  files within a tablespace  
REM   
REM  Look for a mismatch across disk drives in terms of I/O  
REM   
REM  Also, examine the Blocks per Read Ratio for heavily accessed  
REM  TSs - if this value is significantly above 1 then you may have  
REM  full tablescans occurring (with multi-block I/O)  
REM   
REM  If activity on the files is unbalanced, move files around to balance  
REM  the load.  Should see an approximately even set of numbers across files  
REM   
  
set space 1  

PROMPT  REPORTING I/O STATISTICS:
 
column pbr       format 99999999  heading 'Physical|Blk Read'  
column pbw       format 999999    heading 'Physical|Blks Wrtn'  
column pyr       format 999999    heading 'Physical|Reads'  
column readtim   format 99999999  heading 'Read|Time'  
column name      format a55       heading 'DataFile Name'  
column writetim  format 99999999  heading 'Write|Time'  
 
compute sum of f.phyblkrd, f.phyblkwrt on report  
 
select fs.name name,  f.phyblkrd pbr,  f.phyblkwrt pbw, 
       f.readtim,     f.writetim  
from   v$filestat f, v$datafile fs  
where  f.file#  =  fs.file#  
order  by fs.name  
/  
 
REM   
REM  -----------------------------------------------------------------  
  
PROMPT  GENERATING WAIT STATISTICS:  
REM   
REM  This will show wait stats for certain kernel instances.  This  
REM  may show the need for additional rbs, wait lists, db_buffers  
REM   
 
column class  heading 'Class Type'  
column count  heading 'Times Waited'  format 99,999,999 
column time   heading 'Total Times'   format 99,999,999  
 
select class,  count,  time  
from   v$waitstat  
where  count > 0  
order  by class  
/  
 
REM   
REM  Look at the wait statistics generated above (if any). They will  
REM  tell you where there is contention in the system.  There will  
REM  usually be some contention in any system - but if the ratio of  
REM  waits for a particular operation starts to rise, you may need to  
REM  add additional resource, such as more database buffers, log buffers,  
REM  or rollback segments  
REM   
REM  -----------------------------------------------------------------  
  
PROMPT  ROLLBACK EXTENT STATISTICS:  
REM   


 
column usn        format 999          heading 'Undo #'
column extents    format 999          heading 'Extents'  
column rssize     format 999,999,999  heading 'Size in|Bytes'  
column optsize    format 999,999,999  heading 'Optimal|Size'  
column hwmsize    format 99,999,999   heading 'High Water|Mark'  
column shrinks    format 9,999        heading 'Num of|Shrinks'  
column wraps      format 9,999        heading 'Num of|Wraps'  
column extends    format 999,999      heading 'Num of|Extends'  
column aveactive  format 999,999,999  heading 'Average size|Active Extents'  
column rownum noprint  
 
select usn, extents, rssize,    optsize,  hwmsize,  
       shrinks,   wraps,    extends,  aveactive  
from   v$rollstat  
order  by rownum  
/  



set termout on
set trimout off
set trimspool off

set lines 120
set pages 999

set termout off
set trimout on
set trimspool on



prompt
prompt  ############## RUNTIME ############## 
prompt

col rdate head "Run Time"

select sysdate rdate from dual;

prompt 
prompt  ############## HISTORICAL DATA ############## 
prompt 

col x format 999,999 head "Max Concurrent|Last 7 Days"
col y format 999,999 head "Max Concurrent|Since Startup"

select max(maxconcurrency) x from v$undostat
/
select max(maxconcurrency) y from sys.wrh$_undostat
/

col i format 999,999 head "1555 Errors"
col j format 999,999 head "Undo Space Errors"

select sum(ssolderrcnt) i from v$undostat
where end_time > sysdate-2
/

select sum(nospaceerrcnt) j from v$undostat
where end_time > sysdate-2
/

prompt 
prompt  ############## CURRENT STATUS OF SEGMENTS  ############## 
prompt  ##############   SNAPSHOT IN TIME INFO     ##############
prompt  ##############(SHOWS CURRENT UNDO ACTIVITY)##############
prompt 

col segment_name format a30 head "Segment Name"
col "ACT BYTES" format 999,999,999,999 head "Active Bytes"
col "UNEXP BYTES" format 999,999,999,999 head "Unexpired Bytes"
col "EXP BYTES" format 999,999,999,999 head "Expired Bytes"

select segment_name, nvl(sum(act),0) "ACT BYTES", 
  nvl(sum(unexp),0) "UNEXP BYTES",
  nvl(sum(exp),0) "EXP BYTES"
from (select segment_name, nvl(sum(bytes),0) act,00 unexp, 00 exp
   from dba_undo_extents where status='ACTIVE' group by segment_name
union 
select segment_name, 00 act, nvl(sum(bytes),0) unexp, 00 exp
from dba_undo_extents where status='UNEXPIRED' group by segment_name
union
select segment_name, 00 act, 00 unexp, nvl(sum(bytes),0) exp
from dba_undo_extents where status='EXPIRED' group by segment_name)
group by segment_name
order by 1
/

prompt 
prompt  ############## UNDO SPACE USAGE ############## 
prompt 

col usn format 999,999 head "Segment#"
col shrinks format 999,999,999 head "Shrinks"
col aveshrink format 999,999,999 head "Avg Shrink Size"

select usn, shrinks, aveshrink from v$rollstat
/
set termout on
set trimout off
set trimspool off
set pages 999

set termout off
set trimout on
set trimspool on


prompt
prompt  ############## RUNTIME ############## 
prompt

col rdate head "Run Time"

select sysdate rdate from dual;

col inst_id format 999 head "Instance #"
col Parameter format a35 wrap
col "Session Value" format a25 wrapped
col "Instance Value" format a25 wrapped

prompt
prompt  ############## PARAMETERS ############## 
prompt

select  a.inst_id, a.ksppinm  "Parameter",
             b.ksppstvl "Session Value",
             c.ksppstvl "Instance Value"
      from x$ksppi a, x$ksppcv b, x$ksppsv c
     where a.indx = b.indx and a.indx = c.indx
       and a.inst_id=b.inst_id and b.inst_id=c.inst_id
       and a.ksppinm in ('_undo_autotune', '_smu_debug_mode',
                         '_highthreshold_undoretention',
                'undo_tablespace','undo_retention','undo_management')
order by 2;

set termout on
set trimout off
set trimspool off
set pages 999

set termout off
set trimout on
set trimspool on

prompt
prompt  ############## RUNTIME ############## 
prompt

col rdate head "Run Time"

select sysdate rdate from dual;

prompt 
prompt  ############## WAITS FOR UNDO (Since Startup) ############## 
prompt 

col inst_id head "Instance#"
col eq_type format a3 head "Enq"
col total_req# format 999,999,999,999,999,999 head "Total Requests"
col total_wait# format 999,999 head "Total Waits"
col succ_req# format 999,999,999,999,999,999 head "Successes"
col failed_req# format 999,999,999999 head "Failures"
col cum_wait_time format 999,999,999 head "Cummalitve|Time"

select * from v$enqueue_stat where eq_type='US'
union
select * from v$enqueue_stat where eq_type='HW'
/

prompt 
prompt  ############## LOCKS FOR UNDO ############## 
prompt 

col addr head "ADDR"
col KADDR head "KADDR"
col sid head "Session"
col osuser format a10 head "OS User"
col machine format a15 head "Machine"
col program format a17 head "Program"
col process format a7 head "Process"
col lmode head "Lmode"
col request head "Request"
col ctime format 9,999 head "Time|(Mins)"
col block head "Blocking?"

select /*+ RULE */  a.SID, b.process,
b.OSUSER,  b.MACHINE,  b.PROGRAM, 
addr, kaddr, lmode, request, round(ctime/60/60,0) ctime, block 
from 
v$lock a, 
v$session b 
where 
a.sid=b.sid
and a.type='US'
/

prompt 
prompt  ############## TUNED RETENTION HISTORY (Last 2 Days) ############## 
prompt  ##############        LOWEST AND HIGHEST DATA        ############## 
prompt 

col low format 999,999,999,999 head "Undo Retention|Lowest Tuned Value"
col high format 999,999,999,999 head "Undo Retention|Highest Tuned Value"

select end_time, tuned_undoretention from v$undostat where tuned_undoretention = (
select min(tuned_undoretention) low
from v$undostat
where end_time > sysdate-2)
/

select end_time, tuned_undoretention from v$undostat where tuned_undoretention = (
select max(tuned_undoretention) high
from v$undostat
where end_time > sysdate-2)
/

prompt 
prompt  ############## CURRENT TRANSACTIONS ############## 
prompt 

col sql_text format a40 word_wrapped head "SQL Code"

select a.start_date, a.start_scn, a.status, c.sql_text
from v$transaction a, v$session b, v$sqlarea c
where b.saddr=a.ses_addr and c.address=b.sql_address
and b.sql_hash_value=c.hash_value
/

select current_scn from v$database
/

col a format 999,999 head "UnexStolen"
col b format 999,999 head "ExStolen"
col c format 999,999 head "UnexReuse"
col d format 999,999 head "ExReuse"

prompt 
prompt  ############## WHO'S STEALING WHAT? (Last 2 Days) ############## 
prompt 

select unxpstealcnt a, expstealcnt b,
  unxpblkreucnt c, expblkreucnt d
from v$undostat
where (unxpstealcnt > 0 or expstealcnt > 0)
and end_time > sysdate-2
/

set termout on
set trimout off
set trimspool off
set pages 999

set termout off
set trimout on
set trimspool on

prompt
prompt  ############## RUNTIME ############## 
prompt

col rdate head "Run Time"

select sysdate rdate from dual;

col current_scn head "SCN Now"
col start_date head "Trans Started"
col start_scn head "SCN for Trans"
col ses_addr head "ADDR"

prompt 
prompt  ############## Historical V$UNDOSTAT (Last 2 Days) ############## 
prompt 


col end_time format a18 Head "Date/Time"
col maxq format 999,999 head "Query|Maximum|Minutes"
col maxquerysqlid head "SqlID"
col undotsn format 999,999 head "TBS"
col undoblks format 999,999,999 head "Undo|Blocks"
col txncount format 999,999,999 head "# of|Trans"
col unexpiredblks format 999,999,999 head "# of Unexpired"
col expiredblks format 999,999,999 head "# of Expired"
col tuned format 999,999 head "Tuned Retention|(Minutes)"

select end_time, round(maxquerylen/60,0) maxq, maxquerysqlid,
undotsn, undoblks, txncount, unexpiredblks, expiredblks, 
round(tuned_undoretention/60,0) Tuned
from dba_hist_undostat
where end_time > sysdate-2
order by 1
/

prompt 
prompt  ############## RECENT MISSES FOR UNDO (Last 2 Days) ############## 
prompt 

set lines 500
select * from v$undostat where maxquerylen > tuned_undoretention
and end_time > sysdate-2
order by 2
/

select * from sys.wrh$_undostat where maxquerylen > tuned_undoretention
and end_time > sysdate-2
order by 2
/

prompt 
prompt  ############## AUTO-TUNING TUNE-DOWN DATA    ############## 
prompt  ############## ROLLBACK DATA (Since Startup) ############## 
prompt 

col name format a60 head "Name"
col value format 999,999,999 head "Counters"

select name, value from v$sysstat
where name like '%down retention%' or name like 'une down%'
or name like '%undo segment%' or name like '%rollback%'
or name like '%undo record%'
/

prompt 
prompt  ############## Long Running Query History ############## 
prompt 

col end_time head "Date"
col maxquerysqlid head "SQL ID"
col runawayquerysqlid format a15 head "Runaway SQL ID"
col results format a35 word_wrapped head "Space Issues"
col status head "Status"
col newret head "Tuned Down|Retention"

select end_time, maxquerysqlid, runawayquerysqlid, status,
        decode(status,1,'Slot Active',4,'Reached Best Retention',5,'Reached Best Retention',
                    8, 'Runaway Query',9,'Runaway Query-Active',10,'Space Pressure',
                   11,'Space Pressure Currently',
                   16, 'Tuned Down (to undo_retention) due to Space Pressure', 
                   17,'Tuned Down (to undo_retention) due to Space Pressure-Active',
                   18, 'Tuning Down due to Runaway', 19, 'Tuning Down due to Runaway-Active',
                   28, 'Runaway tuned down to last tune down value',
                   29, 'Runaway tuned down to last tune down value',
                   32, 'Max Tuned Down - Not Auto-Tuning',
                   33, 'Max Tuned Down - Not Auto-Tuning (Active)',
                   37, 'Max Tuned Down - Not Auto-Tuning (Active)', 
                   38, 'Max Tuned Down - Not Auto-Tuning', 
                   39, 'Max Tuned Down - Not Auto-Tuning (Active)', 
                   40, 'Max Tuned Down - Not Auto-Tuning', 
                   41, 'Max Tuned Down - Not Auto-Tuning (Active)', 
                   42, 'Max Tuned Down - Not Auto-Tuning', 
                   44, 'Max Tuned Down - Not Auto-Tuning', 
                   45, 'Max Tuned Down - Not Auto-Tuning (Active)', 
                   'Other ('||status||')') Results, spcprs_retention NewRet
from sys.wrh$_undostat
where status > 1
/



prompt 
prompt  ############## Details on Long Run Queries ############## 
prompt 

col sql_fulltext head "SQL Text"
Col sql_id heading "SQL ID"

select sql_id, sql_fulltext, last_load_time "Last Load", 
round(elapsed_time/60/60/24,0) "Elapsed Days" 
from v$sql where sql_id in 
(select maxquerysqlid from sys.wrh$_undostat 
where status > 1)
/

set termout on
set trimout off
set trimspool off
set pages 999

set termout off
set trimout on
set trimspool on

prompt
prompt  ############## RUNTIME ############## 
prompt

col rdate head "Run Time"

select sysdate rdate from dual;

prompt 
prompt  ############## IN USE Undo Data ############## 
prompt 

select 
((select (nvl(sum(bytes),0)) 
from dba_undo_extents 
where tablespace_name in (select tablespace_name from dba_tablespaces
   where retention like '%GUARANTEE' )
and status in ('ACTIVE','UNEXPIRED')) *100) / 
(select sum(bytes) 
from dba_data_files 
where tablespace_name in (select tablespace_name from dba_tablespaces
   where retention like '%GUARANTEE' )) "PCT_INUSE" 
from dual; 


select tablespace_name, extent_management, allocation_type,
segment_space_management, retention
from dba_tablespaces where retention like '%GUARANTEE'
/

col c format 999,999,999,999 head "Sum of Free"

select (nvl(sum(bytes),0)) c from dba_free_space
where tablespace_name in
(select tablespace_name from dba_tablespaces where retention like '%GUARANTEE')
/

col d format 999,999,999,999 head "Total Bytes"

select sum(bytes) d from dba_data_files
where tablespace_name in
(select tablespace_name from dba_tablespaces where retention like '%GUARANTEE')
/


PROMPT
PROMPT  ############## UNDO SEGMENTS ############## 
PROMPT

col status head "Status"
col z format 999,999 head "Total Extents"
break on report
compute sum on report of z

select status, count(*) z from dba_undo_extents
group by status
/

col z format 999,999 head "Undo Segments"

select status, count(*) z from dba_rollback_segs
group by status
/


prompt 
prompt  ############## CURRENT STATUS OF SEGMENTS  ############## 
prompt  ##############   SNAPSHOT IN TIME INFO     ##############
prompt  ##############(SHOWS CURRENT UNDO ACTIVITY)##############
prompt 


col segment_name format a30 head "Segment Name"
col "ACT BYTES" format 999,999,999,999 head "Active Bytes"
col "UNEXP BYTES" format 999,999,999,999 head "Unexpired Bytes"
col "EXP BYTES" format 999,999,999,999 head "Expired Bytes"

select segment_name, nvl(sum(act),0) "ACT BYTES", 
  nvl(sum(unexp),0) "UNEXP BYTES",
  nvl(sum(exp),0) "EXP BYTES"
from (select segment_name, nvl(sum(bytes),0) act,00 unexp, 00 exp
   from dba_undo_extents where status='ACTIVE' group by segment_name
union 
select segment_name, 00 act, nvl(sum(bytes),0) unexp, 00 exp
from dba_undo_extents where status='UNEXPIRED' group by segment_name
union
select segment_name, 00 act, 00 unexp, nvl(sum(bytes),0) exp
from dba_undo_extents where status='EXPIRED' group by segment_name)
group by segment_name
order by 1
/

prompt 
prompt  ############## UNDO SPACE USAGE ############## 
prompt 

col usn format 999,999 head "Segment#"
col shrinks format 999,999,999 head "Shrinks"
col aveshrink format 999,999,999 head "Avg Shrink Size"

select usn, shrinks, aveshrink from v$rollstat
/
set termout on
set trimout off
set trimspool off
spool off

Filed Under: Oracle, Oracle脚本script Tagged With: diagnostic, Oracle脚本script, undo

诗檀软件专业数据库修复团队

服务热线 ： 13764045638 QQ号:47079569 邮箱：service@parnassusdata.com

诗檀软件专业数据库修复团队

服务热线 ： 13764045638 QQ号:47079569 邮箱：service@parnassusdata.com

诗檀软件专业数据库修复团队

服务热线 ： 13764045638 QQ号:47079569 邮箱：service@parnassusdata.com

诗檀软件专业数据库修复团队

服务热线 ： 13764045638 QQ号:47079569 邮箱：service@parnassusdata.com

服务热线： 13764045638 QQ号:47079569 邮箱：service@parnassusdata.com

服务热线： 13764045638 QQ号:47079569 邮箱：service@parnassusdata.com

服务热线： 13764045638 QQ号:47079569 邮箱：service@parnassusdata.com

服务热线： 13764045638 QQ号:47079569 邮箱：service@parnassusdata.com