【Oracle数据恢复】ORA-600[4194]错误一例

ORA-600[4194]内部错误一般由重做记录与回滚记录不匹配引发。Oracle在验证Undo record number时，会对比redo change 和回滚段中的undo record number，若发现2者存在差异则报该4194错误。其错误argument[a][b]，a代表回滚块中的最大undo record number，b代表重做日志中记录的undo record number。这个错误可能由回滚段或者redo log日志文件讹误引起。

ORA-00600[4194]错误的根本原因是 redo记录与回滚段(rollback/undo)记录之间的不一致。当ORACLE在验证undo记录时相对应的变化需要应用到undo数据块的最大undo记录上，此时若检验出错则会报ORA-00600[4194]

此错误不像ORA-600[2662]或ORA-600[4000]错误那样必然导致数据库无法打开，因为它很少出现在前滚阶段；当数据库被打开，smon开始执行事务恢复或一些回滚段的管理工作时则很有可能触发该错误。

ORA-600[4194]的2个的含义：

Arg [a] Maximum Undo record number in Undo block
Arg [b] Undo record number from Redo block

这个ORA-600[4194] 报错属于ORACLE内核从cache层到事务undo处理，可能的影响是进程失败或者可能的回滚段坏块。

可能的bug 包括：

8240762 10.2.0.5,
11.1.0.7.10,
11.2.0.1
Undo corruptions with ORA-600 [4193]/ORA-600 [4194] or ORA-600 [4137] /
SMON may spin to recover transaction

3210520 9.2.0.5, 10.1.0.2 OERI[kjccqmg:esm] / OERI[4194] / corruption possible in RAC

792610 8.0.6.0, 8.1.6.0 Rollback segment corruption

对于非自举对象non-bootstrap对象对应的undo记录可以通过如下方法搞定，如果涉及到的对象是bootstrap系统对象则可能需要手动 bbed来修复，如果自己搞不定可以找ASKMACLEAN专业数据库修复团队成员帮您恢复：

来具体看一下错误记录:

Thu Aug 26 18:58:50 2010
Errors in file /s01/10gdb/admin/YOUYUS/bdump/youyus_smon_6587.trc:
ORA-01595: error freeing extent (3) of rollback segment (4))
ORA-00600: internal error code, arguments: [4194], [53], [41], [], [], [], [], []
Thu Aug 26 18:58:50 2010
..............
Errors in file /s01/10gdb/admin/YOUYUS/bdump/youyus_j000_6630.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 2 change 1617922 time 08/26/2010 18:35:39
ORA-00334: archived log: '/s01/10gdb/flash_recovery_area/YOUYUS/onlinelog/o1_mf_3_65psr4on_.log'
Thu Aug 26 19:00:31 2010
Errors in file /s01/10gdb/admin/YOUYUS/bdump/youyus_j000_6630.trc:
ORA-00600: internal error code, arguments: [4194], [53], [41], [], [], [], [], []
Thu Aug 26 19:00:34 2010
Errors in file /s01/10gdb/admin/YOUYUS/bdump/youyus_j000_6630.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 2 change 1617922 time 08/26/2010 18:35:39
ORA-00334: archived log: '/s01/10gdb/flash_recovery_area/YOUYUS/onlinelog/o1_mf_3_65psr4on_.log'
ORA-00600: internal error code, arguments: [4194], [53], [41], [], [], [], [], []
Thu Aug 26 19:00:35 2010
Errors in file /s01/10gdb/admin/YOUYUS/bdump/youyus_j000_6630.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 2 change 1617922 time 08/26/2010 18:35:39
ORA-00334: archived log: '/s01/10gdb/flash_recovery_area/YOUYUS/onlinelog/o1_mf_3_65psr4on_.log'

ORA-00600: internal error code, arguments: [4194], [53], [41], [], [], [], [], []

如果你因为ORA-600[4194]错误导致数据库无法打开，那么可以尝试设置以下事件:

SQL> alter system set event='10513 trace name context forever,level 2 : 10512 trace name context forever,level 1: 10511 trace name context forever,level 2: 10510 trace name context forever,level 1' scope=spfile;
System altered.

/* 10513事件用以阻止SMON在启动数据库后执行事务恢复(transaction recovery) */
/* 10512事件用以阻止SMON shrink rollback segment */
/* 10511事件用以阻止SMON check to cleanup undo dictionary */
/* 10500事件用以阻止SMON check to offline pending offline rollback segment */

SQL> alter system set undo_management=MANUAL scope=spfile;
System altered.

SQL> shutdown immediate;
ORA-03113: end-of-file on communication channel

SQL> startup mount;
ORACLE instance started.

Total System Global Area 2634022912 bytes
Fixed Size 2086288 bytes
Variable Size 2382367344 bytes
Database Buffers 234881024 bytes
Redo Buffers 14688256 bytes
Database mounted.
SQL> alter database open;

Database altered.

SQL> create undo tablespace undoc datafile size 300M;

SQL> alter system set undo_management=AUTO scope=spfile;
System altered.

SQL> alter system set undo_tablespace=undoc scope=spfile;
System altered.

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.

Total System Global Area 2634022912 bytes
Fixed Size 2086288 bytes
Variable Size 2382367344 bytes
Database Buffers 234881024 bytes
Redo Buffers 14688256 bytes
Database mounted.

SQL> alter database open;
Database altered.

/* 通过重建undo表空间可以避免一些4194错误，但不是全部 */

/* 这个库目前处于随时会crash的不可控状态，我们必须要导出数据并导入到新库中 * /

/* 这种情况下direct方式 可能可以规避一些意外错误 */

[maclean@rh2 dump]$ exp maclean/maclean file=full_maclean.dmp owner=maclean direct=y statistics=none
Export: Release 10.2.0.4.0 - Production on Thu Aug 26 21:18:40 2010
Copyright (c) 1982, 2007, Oracle. All rights reserved.
Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in ZHS16GBK character set and UTF8 NCHAR character set

About to export specified users ...
. exporting pre-schema procedural objects and actions
. exporting foreign function library names for user MACLEAN
. exporting PUBLIC type synonyms
. exporting private type synonyms
. exporting object type definitions for user MACLEAN
About to export MACLEAN's objects ...
. exporting database links
. exporting sequence numbers
. exporting cluster definitions
. about to export MACLEAN's tables via Direct Path ...
Table SYS_EXPORT_TABLE_01 will be exported in conventional path.
. . exporting table SYS_EXPORT_TABLE_01 256 rows exported
Table SYS_EXPORT_TABLE_02 will be exported in conventional path.
. . exporting table SYS_EXPORT_TABLE_02 257 rows exported
Table SYS_EXPORT_TABLE_03 will be exported in conventional path.
..............
exporting refresh groups and children
. exporting dimensions
. exporting post-schema procedural objects and actions
. exporting statistics
Export terminated successfully with warnings.

/* we are lucky! */

Comments

admin says

2010/08/26 at 21:25

Problem Description:
====================
Your users cannot log onto the database. You see the following errors in
the alert log: Ora-600 [4194], Ora-1578, and Ora-1110.

ORA-600 [4194] [a] [b] [ ] [ ] [ ]
Meaning: undo record number mismatch while adding an undo record to an undo
block. This is done by the application of redo.

ORA-01578: ORACLE data block corrupted (file # %s, block # %s)”
Cause: The data block indicated was corrupted, mostly due to software
errors.
Action: Try to restore the segment containing the block indicated. This
may involve dropping the segment and recreating it. If there
is a trace file, report the errors in it to your ORACLE
representative.

ORA-01110: “data file %s: ‘%s'”
Cause: Reporting file name for details of another error
Action: See associated error message

Problem Explanation:
====================

Oracle is unable to roll back an uncommitted transaction in its transaction
table. You have corruption in the rollback segment or corruption in an object
which the rollback segment is trying to apply undo record.

Problem References:
===================

Note:39283.1 OERR: 600 4194 undo record number mismatch while adding undo
record

Search Words:
=============

CORRUPTION, ROLLBACK, SEGMENT, CORRUPT, NEEDS, RECOVERY, ORA-1545,
Ora-600 [4137], Ora-600 4193]

Solution Description:
=====================

In all cases, the ideal solution for a corrupted ROLLBACK SEGMENT is to
restore your backup and do a point in time recovery to just prior to when
the problem occurred.

Solution Explanation:
=====================

Recover to a point in time before corruption occurred.

PS:
Hdr: 8240762 10.2.0.4 RDBMS 10.2.0.4 TXN MGMT LOCAL PRODID-5 PORTID-212 ORA-600
Abstract: UNDO CORRUPTION DURING SHRINK CAUSES ORA-600 [4193]

RELEASE NOTES:
]]Undo corruption during shrink causes ORA-600 [4193]

REDISCOVERY INFORMATION:
If you see an ORA-600 [4193] and the same undo block is used for two differen
t transactions (due to some corruption in the Free Block Pool during shrink ope
rations), then you may have hit this bug.
WORKAROUND:
None

回复
admin says

2010/08/26 at 21:27

PURPOSE:
This article discusses the internal error “ORA-600 [4194]”, what
it means and possible actions. The information here is only applicable
to the versions listed and is provided only for guidance.

ERROR:
ORA-600 [4194] [a] [b]

VERSIONS:
versions 6.0 to 10.1

DESCRIPTION:

A mismatch has been detected between Redo records and rollback (Undo)
records.

We are validating the Undo record number relating to the change being
applied against the maximum undo record number recorded in the undo block.

This error is reported when the validation fails.

ARGUMENTS:
Arg [a] Maximum Undo record number in Undo block
Arg [b] Undo record number from Redo block

FUNCTIONALITY:
Kernel Transaction Undo called from Cache layer

IMPACT:
PROCESS FAILURE
POSSIBLE ROLLBACK SEGMENT CORRUPTION

SUGGESTIONS:

This error may indicate a rollback segment corruption.

This may require a recovery from a database backup depending on
the situation.

If the Known Issues section below does not help in terms of identifying
a solution, please submit the trace files and alert.log to Oracle
Support Services for further analysis.

回复
admin says

2011/07/06 at 23:01

xid: 0x00d4.02a.0007d2f0 –>> usn 212

1- select segment_id, segment_name, status, file_id from dba_rollback_segs;

Find the segment_id = 212 and take the segment_name of it and do:

2- ALTER SYSTEM DUMP UNDO header ““;

Upload the trace file generated and the output of the first query (in a txt file)

回复
Ask_Maclean_liu_Oracle says

2015/02/14 at 21:58

APPLIES TO:Oracle Database – Enterprise Edition – Version 9.2.0.1 to 11.2.0.3 [Release 9.2 to 11.2]Information in this document applies to any platform.***Checked for relevance on 04-Dec-2013***SYMPTOMSThe following error is occurring in the alert.log right before the database crashes.ORA-00600: internal error code, arguments: [4194], [#], [#], [], [], [], [], []This error indicates that a mismatch has been detected between redo records and rollback (undo) records.ARGUMENTS:Arg – Maximum Undo record number in Undo blockArg – Undo record number from Redo blockSince we are adding a new undo record to our undo block, we would expect that the new record number is equal to the maximum record number in the undo block plus one. Before Oracle can add a new undo record to the undo block it validates that this is correct. If this validation fails, then an ORA-600 [4194] will be triggered.CHANGESThis issue generally occurs when there is a power outage or hardware failure that initially crashes the database. On startup, the database does the normal roll forward (redo) and then rollback (undo), this is where the error is generated on the rollback.CAUSEThis also can be cause by the following defectBug 8240762 Abstract: Undo corruptions with ORA-600 [4193]/ORA-600 [4194] or ORA-600 [4137] after SHRINKDetails: Undo corruption may be caused after a shrink and the same undo block may be used for two different transactions causing several internal errors like:ORA-600 [4193] / ORA-600 [4194] for new transactionsORA-600 [4137] for a transaction rollbackSOLUTIONBest practice to create a new undo tablespace.This method includes segment check.Create pfile from spfile to edit>create pfile from spfile;1. Shutdown the instance2. set the following parameters in the pfile undo_management = manual event = ‘10513 trace name context forever, level 2’3. >startup restrict pfile=4. >select tablespace_name, status, segment_name from dba_rollback_segs where status != ‘OFFLINE’;This is critical – we are looking for all undo segments to be offline – System will always be online.If any are ‘PARTLY AVAILABLE’ or ‘NEEDS RECOVERY’ – Please open an issue with Oracle Support or update the current SR. There are many options from this moment and Oracle Support Analyst can offer different solutions for the bad undo segments.If all offline then continue to the next step5. Create new undo tablespace – example>create undo tablespace datafile size 2000M;6. Drop old undo tablespace>drop tablespace including contents and datafiles;7. >shutdown immediate;8 >startup mount;9 modify the pfile with the new undo tablespace name>alter system set undo_tablespace = ” scope=pfile;10. >shutdown immediate;11. >startup; Startup using the normal spfile================The reason we create a new undo tablespace first is to use new undo segment numbers that are higher then the current segments being used. This way when a transaction goes to do block clean-out the reference to that undo segment does not exist and continues with the block clean-out.

回复
Ask_Maclean_liu_Oracle says

2015/02/14 at 21:58

APPLIES TO:Oracle Database – Enterprise Edition – Version 9.2.0.1 to 11.2.0.3 [Release 9.2 to 11.2]Information in this document applies to any platform.***Checked for relevance on 04-Dec-2013***SYMPTOMSThe following error is occurring in the alert.log right before the database crashes.ORA-00600: internal error code, arguments: [4194], [#], [#], [], [], [], [], []This error indicates that a mismatch has been detected between redo records and rollback (undo) records.ARGUMENTS:Arg – Maximum Undo record number in Undo blockArg – Undo record number from Redo blockSince we are adding a new undo record to our undo block, we would expect that the new record number is equal to the maximum record number in the undo block plus one. Before Oracle can add a new undo record to the undo block it validates that this is correct. If this validation fails, then an ORA-600 [4194] will be triggered.CHANGESThis issue generally occurs when there is a power outage or hardware failure that initially crashes the database. On startup, the database does the normal roll forward (redo) and then rollback (undo), this is where the error is generated on the rollback.CAUSEThis also can be cause by the following defectBug 8240762 Abstract: Undo corruptions with ORA-600 [4193]/ORA-600 [4194] or ORA-600 [4137] after SHRINKDetails: Undo corruption may be caused after a shrink and the same undo block may be used for two different transactions causing several internal errors like:ORA-600 [4193] / ORA-600 [4194] for new transactionsORA-600 [4137] for a transaction rollbackSOLUTIONBest practice to create a new undo tablespace.This method includes segment check.Create pfile from spfile to edit>create pfile from spfile;1. Shutdown the instance2. set the following parameters in the pfile undo_management = manual event = ‘10513 trace name context forever, level 2’3. >startup restrict pfile=4. >select tablespace_name, status, segment_name from dba_rollback_segs where status != ‘OFFLINE’;This is critical – we are looking for all undo segments to be offline – System will always be online.If any are ‘PARTLY AVAILABLE’ or ‘NEEDS RECOVERY’ – Please open an issue with Oracle Support or update the current SR. There are many options from this moment and Oracle Support Analyst can offer different solutions for the bad undo segments.If all offline then continue to the next step5. Create new undo tablespace – example>create undo tablespace datafile size 2000M;6. Drop old undo tablespace>drop tablespace including contents and datafiles;7. >shutdown immediate;8 >startup mount;9 modify the pfile with the new undo tablespace name>alter system set undo_tablespace = ” scope=pfile;10. >shutdown immediate;11. >startup; Startup using the normal spfile================The reason we create a new undo tablespace first is to use new undo segment numbers that are higher then the current segments being used. This way when a transaction goes to do block clean-out the reference to that undo segment does not exist and continues with the block clean-out.

回复

【Oracle数据恢复】ORA-600[4194]错误一例

Comments

Comment 取消回复