Oracleデータリカバリ

プロのOracle Databaseの復旧サービスを提供
携帯番号: +86 13764045638 メール:service@parnassusdata.com

 

 

バックアップリカバリはOracleに永遠の話。データさえあれば、リカバリする必要が現れる。けど、中国でバックアップはいつも無視された。これで、東と西の相違が現れた。海外のDBAはいつもOracle内部的な原理やオプティマイザに専念している。

 

DBAの習得する必要があるわざの一つ:バックアップなしにOracleデータベースのデータをリカバリする技術。海外ではかなり高級なわざだが、いつの間にか DBAに必要としているわざになった。

 

もちろん、バックアップなしにデータをリカバリするのはOracleデータファイル、データブロック及びデータディクショナリーに対する詳しい知識が必要としている。

 

ここではOracleデータリカバリの文である:何のバックアップもない状況でリカバリできる。例えば、DULとBBEDツールリカバリなどの技術。

 

 

【データリカバリ】ROWIDを構造することで、バックアップなしにORA-1578、ORA-8103、ORA-1410などのこわれたブロックトラブルを避けられる

【データリカバリ】ORA-600[kccpb_sanity_check_2]一例

Oracle rmanでset newnameが遅いかもしれない

どうやってOracleコントロールファイルの無駄記録を削除できるか。例えばv$archived_logのdeletedアーカイブログ記録

create or replaceに上書きされたPL/SQL对をどうやって取り戻せる

Archivelog Completed Before VS UNTIL TIME

ASMがdisk headerをなくしたから、ORA-15032、ORA-15040、ORA-15042 になって、Diskgroupがmountできなくなった

Overcome ORA-600[4xxx] open database

データリカバリ:ORA-600[kghstack_free2][kghstack_err+0068]解決例

RMAN Catalogリカバリディレクトリ

 datafile nameに文字化けを含むデータファイルをどうやって再命名できるか

11g新特性recover corruption list

ORA-01578エラ解決例

Script:リカバリ診断情報を収集する

どうやってSYSAUXテーブルスペースの対象をリカバリできるか

Oracleデータリカバリ:ORA-00600:[4000] ORA-00704: bootstrap process failureエラ解決例

Script:データベースにバックアップしているかを確認してください

データリカバリ: 二つのこわれたブロックをシミュレーションする

Script:Oracleバックアップリカバリ情報を収集する

Oracleバックアップリカバリ:Rman Backupディレイトラブル一例

rman catalogの互換性を理解しよう

Oracle内部的なエラ:ORA-00600[2608]一例

bbedでORA-01189エラを解決する

Fractured block found during backing up datafile

人工的にSCN番号を増やす方法:How to increase System Change Number by manual

DBMS_REPAIR example

 

 

Oracle損害とこわれたブロックは主に以下の通り:

 

ORA-1578
ORA-8103
ORA-1410
ORA-1499
ORA-1578
ORA-81##
ORA-14##
ORA-26040
ORA-600 Errors
Block Corruption
Index Corruption
Row Corruption
UNDO Corruption
Control File
Consistent Read
Dictionary
File/RDBA/BL

 

 

Error Description Corruption related to:
ORA-1578 ORA-1578 is reported when a block is thought to be corrupt on read. Block
OERR: ORA-1578 “ORACLE data block corrupted (file # %s, block # %s)” Master Note
OERR: ORA-1578 “ORACLE data block corrupted (file # %s, block # %s)”
Fractured Block explanation
Handling Oracle Block Corruptions in Oracle7/8/8i/9i/10g/11g
Diagnosing and Resolving 1578 reported on a Local Index of a Partitioned table
ORA-1410 This error is raised when an operation refers to a ROWID in a table for which there is no such row.
The reference to a ROWID may be implicit from a WHERE CURRENT OF clause or directly from a WHERE ROWID=… clause.
ORA 1410 indicates the ROWID is for a BLOCK that is not part of this table.
Row
Understanding The ORA-1410
Summary Of Bugs Containing ORA 1410
OERR: ORA 1410 “invalid ROWID”
ORA-8103 The object has been deleted by another user since the operation began.
If the error is reproducible, following may be the reasons:-
a.) The header block has an invalid block type.
b.) The data_object_id (seg/obj) stored in the block is different than the data_object_id stored in the segment header. See dba_objects.data_object_id and compare it to the decimal value stored in the block (field seg/obj).
Block
ORA-8103 Troubleshooting, Diagnostic and Solution
OERR: ORA-8103 “object no longer exists” / Troubleshooting, Diagnostic and Solution
ORA-8102 An ORA-08102 indicates that there is a mismatch between the key(s) stored in the index and the values stored in the table. What typically happens is the index is built and at some future time, some type of corruption occurs, either in the table or index, to cause the mismatch. Index
OERR ORA-8102 “index key not found, obj# %s, file %s, block %s (%s)
ORA-1499 An error occurred when validating an index or a table using the ANALYZE command.
One or more entries does not point to the appropriate cross-reference.
Index
ORA-1499. Table/Index row count mismatch
OERR: ORA-1499 table/Index Cross Reference Failure – see trace file
ORA-1498 Generally this is a result of an ANALYZE … VALIDATE … command.
This error generally manifests itself when there is inconsistency in the data/Index block. Some of the block check errors that may be found:-
a.) Row locked by a non-existent transaction
b.) The amount of space used is not equal to block size
c.) Transaction header lock count mismatch.
While support are processing the tracefile it may be worth the re-running the ANALYZE after restarting the database to help show if the corruption is consistent or if it ‘moves’.
Send the tracefile to support for analysis.
If the ANALYZE was against an index you should check the whole object. Eg: Find the tablename and execute:
ANALYZE TABLE xxx VALIDATE STRUCTURE CASCADE;
Block
OERR: ORA 1498 “block check failure – see trace file”
ORA-26040 Trying to access data in block that was loaded without redo generation using the NOLOGGING/UNRECOVERABLE option.
This Error raises always together with ORA-1578
Block
OERR ORA-26040 Data block was loaded using the NOLOGGING option
ORA-1578 / ORA-26040 Corrupt blocks by NOLOGGING – Error explanation and solution
ORA-1578 ORA-26040 in a LOB segment – Script to solve the errors
ORA-1578 ORA-26040 in 11g for DIRECT PATH with NOARCHIVELOG even if LOGGING is enabled
ORA-1578 ORA-26040 On Awr Table
Errors ORA-01578, ORA-26040 On Standby Database
Workflow Tables ORA-01578 ORACLE data block corrupted ORA-26040 Data block was loaded using the NOLOGGING option
ORA-1578, ORA-26040 Data block was loaded using the NOLOGGING option
ORA-600[12700] Oracle is trying to access a row using its ROWID, which has been obtained from an index.
A mismatch was found between the index rowid and the data block it is pointing to. The rowid points to a non-existent row in the data block. The corruption can be in data and/or index blocks.
ORA-600 [12700] can also be reported due to a consistent read (CR) problem.
Consistent Read
Resolving an ORA-600 [12700] error in Oracle 8 and above.
ORA-600 [12700] “Index entry Points to Missing ROWID”
ORA-600[3020] This is called a ‘STUCK RECOVERY’.
There is an inconsistency between the information stored in the redo and the information stored in a database block being recovered.
Redo
ORA-600 [3020] “Stuck Recovery”
Information Required for Root Cause Analysis of ORA-600 [3020] (stuck recovery)
ORA-600[4194] A mismatch has been detected between Redo records and rollback (Undo) records.
We are validating the Undo record number relating to the change being applied against the maximum undo record number recorded in the undo block.
This error is reported when the validation fails.
Undo
ORA-600 [4194] “Undo Record Number Mismatch While Adding Undo Record”
Basic Steps to be Followed While Solving ORA-00600 [4194]/[4193] Errors Without Using Unsupported parameter
ORA-600[4193] A mismatch has been detected between Redo records and Rollback (Undo) records.
We are validating the Undo block sequence number in the undo block against the Redo block sequence number relating to the change being applied.
This error is reported when this validation fails.
Undo
ORA-600 [4193] “seq# mismatch while adding undo record”
Basic Steps to be Followed While Solving ORA-00600 [4194]/[4193] Errors Without Using Unsupported parameter
Ora-600 [4193] When Opening Or Shutting Down A Database
ORA-600 [4193] When Trying To Open The Database
ORA-600[4137] While backing out an undo record (i.e. at the time of rollback) we found a transaction id mis-match indicating either a corruption in the rollback segment or corruption in an object which the rollback segment is trying to apply undo records on.
This would indicate a corrupted rollback segment.
Undo/Redo
ORA-600 [4137] “XID in Undo and Redo Does Not Match”
ORA-600[6101] Not enough free space was found when inserting a row into an index leaf block during the application of undo. Index
ORA-600 [6101] “insert into leaf block (undo)”
ORA-600[2103] Oracle is attempting to read or update a generic entry in the control file.
If the entry number is invalid, ORA-600 [2130] is logged.
Control File
ORA-600 [2130] “Attempt to access non-existant controlfile entry”
ORA-600[4512] Oracle is checking the status of transaction locks within a block.
If the lock number is greater than the number of lock entries, ORA-600 [4512] is reported followed by a stack trace, process state and block dump.
This error possibly indicates a block corruption.
Block
ORA-600 [4512] “Lock count mismatch”
ORA-600[2662] A data block SCN is ahead of the current SCN.
The ORA-600 [2662] occurs when an SCN is compared to the dependent SCN stored in a UGA variable.
If the SCN is less than the dependent SCN then we signal the ORA-600 [2662] internal error.
Block
ORA-600 [2662] “Block SCN is ahead of Current SCN”
ORA 600 [2662] DURING STARTUP
ORA-600[4097] We are accessing a rollback segment header to see if a transaction has been committed.
However, the xid given is in the future of the transaction table.
This could be due to a rollback segment corruption issue OR you might be hitting the following known problem.
Undo
ORA-600 [4097] “Corruption”
ORA-600[4000] It means that Oracle has tried to find an undo segment number in the dictionary cache and failed. Undo
ORA-600 [4000] “trying to get dba of undo segment header block from usn”
ORA-600[6006] Oracle is undoing an index leaf key operation. If the key is not found, ORA-00600 [6006] is logged.
ORA-600[6006] is usually caused by a media corruption problem related to either a lost write to disk or a corruption on disk.
Index
ORA-600 [6006]
ORA-600[4552] This assertion is raised because we are trying to unlock the rows in a block, but receive an incorrect block type.
The second argument is the block type received.
Block
ORA-600 [4555]
ORA-600[6856] Oracle is checking that the row slot we are about to free is not already on the free list.
This internal error is raised when this check fails.
Row
ORA-600 [6856] “Corrupt Block When Freeing a Row Slot
ORA-600[13011] During a delete operation we are deleting from a view via an instead-of trigger or an Index organized table and have exceeded a 5000 pass count when we raise this exception. Row
ORA-600 [13011] “Problem occurred when trying to delete a row”
ORA-600[13013] During the execution of an UPDATE statement, after several attempts (Arg [a] passcount) we are unable to get a stable set of rows that conform to the WHERE clause. Row
ORA-600 [13013] “Unable to get a Stable set of Records”
How to resolve ORA-00600 [13013], [5001]
ORA-600[13030]
ORA-600 [13030]
ORA-600[25012] We are trying to generate the absolute file number given a tablespace number and relative file number and cannot find a matching file number or the file number is zero. afn/rdba/tsn
ORA-600 [25012] “Relative to Absolute File Number Conversion Error”
ORA-600[25026] Looking up/checking a tablespace
invalid tablespace ID and/or rdba found
afn/rdba/tsn
ORA-600 [25026]
ORA-600[25027] Invalid tsn and/or rfn found afn/rdba/tsn
ORA-600 [25027]
ORA-600[kcbz_check_objd_typ] An object block buffer in memory is checked and is found to have the wrong object id. This is most likely due to corruption. Buffer Cache
ORA-600 [kcbz_check_objd_typ_3]
ORA-600 [kcbz_check_objd_typ]
ORA-600[kddummy_blkchkORA-600[kdblkcheckerror] ORA-600[kddummy_blkchk] is for 10.1/10.2 and ORA-600[kdblkcheckerror] for 11 onwards. Block
ORA-600 [kddummy_blkchk]
How to Resolve ORA-00600[kddummy_blkchk]
ORA-600 [kdblkcheckerror]
QREF – kddummy_blkchk / kdBlkCheckError – Check Codes Listing (Full) [This section is not visible to customers.]
QREF – kddummy_blkchk / kdBlkCheckError – Check Codes Definition && Return Values[This section is not visible to customers.]
ORA-600[ktadrprc-1] Dictionary
ORA-600 [ktadrprc-1]
ORA-600[ktsircinfo_num1] This exception occurs when there are problems obtaining the row cache information correctly from sys.seg$. In most cases there is no information in sys.seg$. Dictionary
ORA-600 [ktsircinfo_num1]
ORA-600[qertbfetchbyrowid] Row
ORA-600 [qertbfetchbyrowid]
ORA-600[ktbdchk1-bad dscn] This exception is raised when we are performing a sanity check on the dependent SCN and fail.
The dependent scn is greater than the current scn.
Dictionary
ORA-600 [ktbdchk1: bad dscn]

 

 

Oracleデータリカバリ:ORA-00600:[4000] ORA-00704: bootstrap process failureエラ一例

プロのOracle Databaseの復旧サービスを提供
携帯番号: +86 13764045638 メール:service@parnassusdata.com

 

友達のために、電源が切れたからまともに起動できなくなった10.2.0.1データベースを起動した。そのリポジトリはアーカイブモードではない上に、何のバックアップもない。

電源が切れた後に、データベースインスタンスを再起動したいが、ORA-00600:[kccpb_sanity_check_2]内部エラが起こった:

SQL> select status from v$instance;

STATUS
————
STARTED

SQL>
SQL> shutdown immediate;
ORA-01507: database not mounted

ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.

Total System Global Area 2147483648 bytes
Fixed Size 1220432 bytes
Variable Size 486539440 bytes
Database Buffers 1644167168 bytes
Redo Buffers 15556608 bytes
ORA-00600: internal error code, arguments: [kccpb_sanity_check_2], [8198],
[8175], [0x0], [], [], [], []

そのkccpb_sanity_check_2内部エラは間違えたcontrol fileのseq#記録に引き起こされた。MOS note <ORA-00600: [kccpb_sanity_check_2] During Instance Startup>にこのようなトラブルを説明した:

ORA-00600: [kccpb_sanity_check_2] During Instance Startup

Applies to:
Oracle Server – Enterprise Edition – Version: 10.2.0.1 and later [Release: 10.2 and later ]
Information in this document applies to any platform.
Symptoms
The database is getting the following errors on Startup:

ORA-00600: internal error code, arguments: [kccpb_sanity_check_2], [3621501], [3621462], [0x000000000]

Changes
In this case, the customer moved the box from one data center to another.
Cause
ORA-600 [kccpb_sanity_check_2] indicates that the seq# of the last read block is
higher than the seq# of the control file header block. This is indication of
the lost write of the header block during commit of the previous cf
transaction.

Solution

1) restore a backup of a controlfile and recover

OR

2) recreate the controlfile

OR

3) restore the database from last good backup and recover

NOTE: If you do not have any special backup of control file to restore and you are
using Multiple Control File copies in your pfile/init.ora/spfile you can attempt to mount the
database using each control file one by one.
If you are able to mount the database with any of these control file copies
you can then issue ‘alter database backup controlfile to trace’ to recreate controlfile.

 
control file corruptionが引き起こしたORA-00600: [kccpb_sanity_check_2] トラブルに対して、以下の三つの解決策を試してください:
1. restore controlfile from backup
2. 複数のcontrolfileがある場合に、各control fileでmountできたインスタンスを参考してください。成功できれば、使用可能なcontrolfileをほかの場所へコピできる
3. controlfile backup traceでコントロールファイルを再構造する、traceがなくとも人工的にcreate controlfile文で再構造できる

このようなトラブルが起こったら、二つ目の方法を試したが、失敗した。udump/bdumpディレクトリの下で、使用可能なcontrolfile backup traceを検索するしかない。幸いにそれを見つけた。そしてコントロールファイルを再構造することが成功した。:
Create controlfile reuse set database “kmcdb”
MAXINSTANCES 8
MAXLOGHISTORY 1
MAXLOGFILES 16
MAXLOGMEMBERS 3
MAXDATAFILES 100
Datafile
‘/u01/oracle/oradata/kmcdb/system01.dbf’,
‘/u01/oracle/oradata/kmcdb/undotbs01.dbf’,
‘/u01/oracle/oradata/kmcdb/sysaux01.dbf’,
‘/u01/oracle/oradata/kmcdb/users01.dbf’,
‘/u01/oracle/oradata/kmcdb/example01.dbf’
LOGFILE GROUP 1 (‘/u01/oracle/oradata/kmcdb/redo01.log’) SIZE 51200K,
GROUP 2 (‘/u01/oracle/oradata/kmcdb/redo02.log’) SIZE 51200K,
GROUP 3 (‘/u01/oracle/oradata/kmcdb/redo03.log’) SIZE 51200K RESETLOGS
Wed Sep 28 16:57:47 2011
WARNING: Default Temporary Tablespace not specified in CREATE DATABASE command
Default Temporary Tablespace will be necessary for a locally managed database in future release
Setting recovery target incarnation to 1
Wed Sep 28 16:57:48 2011
Successful mount of redo thread 1, with mount id 523482251
Wed Sep 28 16:57:48 2011
Completed: Create controlfile reuse set database “kmcdb”
MAXINSTANCES 8
MAXLOGHISTORY 1
MAXLOGFILES 16
MAXLOGMEMBERS 3
MAXDATAFILES 100
Datafile
‘/u01/oracle/oradata/kmcdb/system01.dbf’,
‘/u01/oracle/oradata/kmcdb/undotbs01.dbf’,
‘/u01/oracle/oradata/kmcdb/sysaux01.dbf’,
‘/u01/oracle/oradata/kmcdb/users01.dbf’,
‘/u01/oracle/oradata/kmcdb/example01.dbf’
LOGFILE GROUP 1 (‘/u01/oracle/oradata/kmcdb/redo01.log’) SIZE 51200K,
GROUP 2 (‘/u01/oracle/oradata/kmcdb/redo02.log’) SIZE 51200K,
GROUP 3 (‘/u01/oracle/oradata/kmcdb/redo03.log’) SIZE 51200K RESETLOGS
Wed Sep 28 16:57:57 2011
alter database mount

インスタンスを成功にmountできたら、データベースをリカバリして、再起動してください :

recover database using backup controlfile until cancel;

controlfileが再構造されたから、今のcontrolfileはどれのオンラインログがcurrentだと知らなかったから、人工的に指定してください。

ALTER DATABASE RECOVER database using backup controlfile until cancel
Wed Sep 28 17:27:09 2011
Media Recovery Start
WARNING! Recovering data file 1 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 2 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 3 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 4 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 5 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
parallel recovery started with 7 processes
ORA-279 signalled during: ALTER DATABASE RECOVER database using backup controlfile until cancel …
Wed Sep 28 17:27:20 2011
ALTER DATABASE RECOVER LOGFILE ‘/u01/oracle/oradata/kmcdb/redo02.log’
Wed Sep 28 17:27:20 2011
Media Recovery Log /u01/oracle/oradata/kmcdb/redo02.log
Errors with log /u01/oracle/oradata/kmcdb/redo02.log
ORA-339 signalled during: ALTER DATABASE RECOVER LOGFILE ‘/u01/oracle/oradata/kmcdb/redo02.log’ …
Wed Sep 28 17:27:20 2011
ALTER DATABASE RECOVER CANCEL
ORA-1547 signalled during: ALTER DATABASE RECOVER CANCEL …
Wed Sep 28 17:27:37 2011
ALTER DATABASE RECOVER database using backup controlfile until cancel
Wed Sep 28 17:27:37 2011
Media Recovery Start
WARNING! Recovering data file 1 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 2 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 3 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 4 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 5 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
parallel recovery started with 7 processes
ORA-279 signalled during: ALTER DATABASE RECOVER database using backup controlfile until cancel …
Wed Sep 28 17:27:52 2011
ALTER DATABASE RECOVER LOGFILE ‘/u01/oracle/oradata/kmcdb/redo03.log’
Wed Sep 28 17:27:52 2011
Media Recovery Log /u01/oracle/oradata/kmcdb/redo03.log
Wed Sep 28 17:27:52 2011
Incomplete recovery applied all redo ever generated.
Recovery completed through change 16228646
Wed Sep 28 17:27:52 2011
Media Recovery Complete (kmcdb)
Completed: ALTER DATABASE RECOVER LOGFILE ‘/u01/oracle/oradata/kmcdb/redo03.log’
Wed Sep 28 17:28:12 2011

何度も試した後に、redo03.logは必要とされているcurrent group memberだから、alter database open resetlogsしてください。けど、またORA-00600: internal error code, arguments: [4000], [7], [], [], [], [], [], []エラになる:

alter database open resetlogs
Wed Sep 28 17:28:12 2011
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 16228646
Resetting resetlogs activation ID 523468297 (0x1f337e09)
Online log /u01/oracle/oradata/kmcdb/redo01.log: Thread 1 Group 1 was previously cleared
Online log /u01/oracle/oradata/kmcdb/redo02.log: Thread 1 Group 2 was previously cleared
Wed Sep 28 17:28:12 2011
Setting recovery target incarnation to 5
Wed Sep 28 17:28:12 2011
Assigning activation ID 523485789 (0x1f33c25d)
Thread 1 opened at log sequence 1
Current log# 3 seq# 1 mem# 0: /u01/oracle/oradata/kmcdb/redo03.log
Successful open of redo thread 1
Wed Sep 28 17:28:12 2011
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Wed Sep 28 17:28:12 2011
SMON: enabling cache recovery
Wed Sep 28 17:28:12 2011
Errors in file /u01/oracle/admin/kmcdb/udump/kmcdb_ora_31798.trc:
ORA-00600: internal error code, arguments: [4000], [7], [], [], [], [], [], []
Wed Sep 28 17:28:14 2011
Errors in file /u01/oracle/admin/kmcdb/udump/kmcdb_ora_31798.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [7], [], [], [], [], [], []
以上のORA-00600: [4000], [7],はusn=のロールバックセグメント時にみつけられる。ブロックをロールバックする時に、ロールバックセグメントにトラブルが起こった。それに、ORA-00704: bootstrap process failureエラも同時に起こった。 これはロールバックする必要があるデータブロックはbootstrapにとって、大切なブートストラップ・オブジェクトである。
一般的にbootstrap objectはデータをrollback、cleanupあるいはapply undoするときに、_corrupted_rollback_segments,_offline_rollback_segmentsあるいは10513トランザクションでORA-00600: [4000]を阻止できない。

まだ試したいなら、_corrupted_rollback_segments,_offline_rollback_segmentsとeventバラメタを修正して、試してください:
alter system set event=’10513 trace name context forever,level 2 :
10512 trace name context forever,level 1: 10511 trace name context forever,level 2:
10510 trace name context forever,level 1′ scope=spfile;

ALTER SYSTEM SET _offline_rollback_segments='(_SYSSMU7$)’ SCOPE=SPFILE;

ALTER SYSTEM SET _corrupted_rollback_segments='(_SYSSMU7$)’ SCOPE=SPFILE;

再びopen database

alter database open resetlogs
Wed Sep 28 17:47:05 2011
RESETLOGS after complete recovery through change 16228668
Resetting resetlogs activation ID 523471415 (0x1f338a37)
Online log /u01/oracle/oradata/kmcdb/redo01.log: Thread 1 Group 1 was previously cleared
Online log /u01/oracle/oradata/kmcdb/redo02.log: Thread 1 Group 2 was previously cleared
Wed Sep 28 17:47:05 2011
Setting recovery target incarnation to 11
Wed Sep 28 17:47:05 2011
Assigning activation ID 523466708 (0x1f3377d4)
Thread 1 opened at log sequence 1
Current log# 3 seq# 1 mem# 0: /u01/oracle/oradata/kmcdb/redo03.log
Successful open of redo thread 1
Wed Sep 28 17:47:05 2011
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Wed Sep 28 17:47:05 2011
SMON: enabling cache recovery
Wed Sep 28 17:47:05 2011
Errors in file /u01/oracle/admin/kmcdb/udump/kmcdb_ora_32261.trc:
ORA-00600: internal error code, arguments: [4000], [7], [], [], [], [], [], []
Wed Sep 28 17:47:06 2011
Errors in file /u01/oracle/admin/kmcdb/udump/kmcdb_ora_32261.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [7], [], [], [], [], [], []
Wed Sep 28 17:47:06 2011
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 32261
ORA-1092 signalled during: alter database open resetlogs…
BootstrapがORA-00600:[4000]エラになることを避けられない。これはデータベースを起動することに対して、致命的である。このデータベースをリカバリするにはbootstrapのトラブルを解決する必要がある。
どうしよう?
まずはORA-00600:[4000]内部的なエラtraceログを読んでください:

*** SERVICE NAME:() 2011-09-28 17:46:09.022
*** SESSION ID:(160.3) 2011-09-28 17:46:09.022
Recovery target incarnation = 10, activation ID = 0
Influx buffer limit = 98098 (50% x 196196)
Successfully allocated 7 recovery slaves
Using 156 overflow buffers per recovery slave
Start recovery at thread 1 ckpt scn 16228666 logseq 1 block 2
*** 2011-09-28 17:46:09.236
Media Recovery add redo thread 1
*** 2011-09-28 17:46:14.752
Media Recovery drop redo thread 1
*** 2011-09-28 17:46:41.788
Recovery target incarnation = 10, activation ID = 0
Influx buffer limit = 98098 (50% x 196196)
Successfully allocated 7 recovery slaves
Using 156 overflow buffers per recovery slave
Start recovery at thread 1 ckpt scn 16228666 logseq 1 block 2
*** 2011-09-28 17:46:41.842
Media Recovery add redo thread 1
*** 2011-09-28 17:46:52.799
Media Recovery Log /u01/oracle/oradata/kmcdb/redo03.log
—– Redo read statistics for thread 1 —–
Read rate (ASYNC): 0Kb in 10.99s => 0.00 Mb/sec
Total physical reads: 4096Kb
Longest record: 0Kb, moves: 0/1 (0%)
Change moves: 0/1 (0%), moved: 0Mb
Longest LWN: 0Kb, moves: 0/1 (0%), moved: 0Mb
Last redo scn: 0x0000.00f7a13b (16228667)
———————————————-
*** 2011-09-28 17:46:52.835
Media Recovery drop redo thread 1
File 1 (stop scn 16228668) completed recovery at checkpoint scn 16228668
File 2 (stop scn 16228668) completed recovery at checkpoint scn 16228668
File 3 (stop scn 16228668) completed recovery at checkpoint scn 16228668
File 4 (stop scn 16228668) completed recovery at checkpoint scn 16228668
File 5 (stop scn 16228668) completed recovery at checkpoint scn 16228668
ARCH: Connecting to console port…
*** 2011-09-28 17:47:05.351
Prior to RESETLOGS processing…
ALTER SYSTEM ARCHIVE LOG ALL USING BACKUP CONTROLFILE start
Database is not in archivelog mode
ALTER SYSTEM ARCHIVE LOG ALL USING BACKUP CONTROLFILE complete
*** 2011-09-28 17:47:05.353
*** 2011-09-28 17:47:05.707
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4000], [7], [], [], [], [], [], []
Current SQL statement for this session:
select ctime, mtime, stime from obj$ where obj# = :1
—– Call Stack Trace —–
calling call entry argument values in hex
location type point (? means dubious value)
——————– ——– ——————– —————————-
ksedst()+27 call ksedst1() 0 ? 1 ?
ksedmp()+557 call ksedst() 0 ? 9EF2E6D0 ? 0 ? 2A ?
9CCB06F4 ? 70000 ?
ksfdmp()+19 call ksedmp() 3 ? BFAA18D0 ? AC152A0 ?
CBD2D40 ? 3 ? CB843B8 ?
kgeriv()+188 call 00000000 CBD2D40 ? 3 ?
kgeasi()+113 call kgeriv() CBD2D40 ? B7EB0020 ? FA0 ?
1 ? BFAA190C ?
ktudba()+264 call kgeasi() CBD2D40 ? B7EB0020 ? FA0 ?
2 ? 1 ? 0 ? 7 ? 0 ?
ktrgcm()+6207 call ktudba() 7 ? BFAA1DEC ? 0 ? 0 ?
ktrgtc()+941 call ktrgcm() B7EFF4F4 ? 0 ? B7EFE054 ?
96F1A0B4 ? 96F10CE8 ? 198 ?
kdsgrp()+107 call ktrgtc() B7EFF4F4 ? B7EFF49C ?
9C22142 ? BFAA1F08 ? 240 ?
9C24DC4 ? 9C21D7C ?
kdsfbrcb()+513 call kdsgrp() B7EFF4F0 ? 0 ? B7EFF4F0 ?
qertbFetchByRowID() call kdsfbrcb() B7EFF4F0 ? B7EFDFEC ? 0 ? 1 ?
+2052 0 ? 0 ?
opifch2()+5157 call 00000000 96F10A8C ? A11CDE4 ?
BFAA2534 ? 1 ?

引发ORA-00600:[4000]错误的数据块头信息:

Block header dump: 0x0040007a
Object id on Block? Y
seg/obj: 0x12 csc: 0x00.f84ef0 itc: 1 flg: – typ: 1 – DATA
fsl: 0 fnx: 0x0 ver: 0x01

Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x0007.01f.000011cf 0x00800057.0faf.07 –U- 1 fsc 0x0000.00f84ef1

data_block_dump,data header at 0x82bba044
以上的traceファイルの情報量は極多すぎて、一つ一つで見極めて文件信息量极大,我们来抽丝剥茧的罗列这些信息量:
1.
File 1 (stop scn 16228668) completed recovery at checkpoint scn 16228668,、このログは1号データファイルが最後のcheckpoint scnにリカバリできると意味している。データベースを起動してみるときに、今のscnが16228668からはじまる。
2.
ORA-00600: internal error code, arguments: [4000], [7], [], [], [], [], [], []
Current SQL statement for this session:
select ctime, mtime, stime from obj$ where obj# = :1
ORA-00600:[4000], [7]エラを引き起こした文は”select ctime, mtime, stime from obj$ where obj# = :1″で、これはよく使われるアーカイブSQL文。
3. ORA-00600:[4000] raisedを引き起こしたのはktudba関数である。そのstack callはkdsgrp-> ktrgtc -> ktrgcm -> ktudba -> kgeasi(エラ処理関数)
4. ORA-00600:[4000], [7]エラを引き起こしたデータブロックは1号データファイルの122ブロックで、seg/objは0x12ブロックタイプはDataである。それに、一つのITL entryが存在している:
Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x0007.01f.000011cf 0x00800057.0faf.07 –U- 1 fsc 0x0000.00f84ef1
そのITLのScnは00f84ef1= 16273137 > current_scn=16228668 且lck=1
ここでは一つのトラブルが起こった:
なぜこの”select ctime, mtime, stime from obj$ where obj# = :1″クエリ文がそのOBJ$のデータブロックのに対して、ITLに関するロールバックセグメントをアクセスするか
ここでOracleにクエリ文でCR一致にブロックを読み取るブロックとブロックディレイをクリンアップするブロックを構造することに関する知識を紹介する:
1. 既存するbuffer blockがクエリのSCN要求に満たせないときに、クエリプロセスは既存するbufferとundo情報でCRブロックを構造できる。例えばcacheで既存するブロックのSCNは10だが、クエリ文が実にScn=1の時に始まったから。このクエリ文は1である。ではこのクエリ文のSnap_scnは1で、SCNの要求を満たすために、関するrollback segmentをアクセスして、その buffer blockにロールバックする必要がある。
2. 一方、クエリ文がブロックをアクセスするときに、ブロックをcleanupしてください。つまり、ディレイブロックを削除するということ(deferred block cleanout)。このようなブロックをクリンアウトするときに、クエリプロセスもそれについてのITLに関するrollback segmentをアクセスする必要がある。では、どこのブロックにアクセスするときにクリンアップする必要があるか?次の内容に紹介する。
また、ORA-00600:[4000]の話に戻ろう。そのエラに関わる対象は大事なBootstrapテーブルOBJ$で伝統的な方法を使えない:_corrupted_rollback_segments,_offline_rollback_segmentsあるいは10513トランザクションでORA-00600: [4000]を阻止できる。ブロック修正ツールBBEDでトラブルがあったデータブロックのITLトランザクションのFLAGからUからCへ(Commit)。
TRANSACTION_COMMITED = 0x08;
TRANSACTION_UPBOUND = 0x02;
TRANSACTION_ACTIVE = 0x01;
Flag= –U- つまりTRANSACTION_UPBOUNDの時にflagが占めているバイトが0x02でそのバイトをTRANSACTION_COMMITED = 0x08に修正してください;
あいにく、QQでモートアシスタンスでBBEDでデータブロックを起動したが、Terminalツールにトラブルがあるから。ログは不完全で、とくにBBEDの部分:
[oracle@DBserver1 lib]$ cd /u01/oracle/oradata/kmcdb/
[oracle@DBserver1 kmcdb]$ ls
control01.ctl control02.ctl.bak ctl_backup redo02.log system01.dbf users01.dbf
control01.ctl.bak control03.ctl example01.dbf redo03.log temp01.dbf
control02.ctl control03.ctl.bak redo01.log sysaux01.dbf undotbs01.dbf
[oracle@DBserver1 kmcdb]$ cp system01.dbf ctl_backup/
[oracle@DBserver1 kmcdb]$ cp system01.dbf system01.dbf.bak

正式にBBEDする前に、すべてのデータファイルをバックアップしてください。

oracle@DBserver1 kmcdb]$ bbed filename=system01.dbf password=blockedit mode=edit

BBED: Release 2.0.0.0.0 – Limited Production on Wed Sep 28 21:22:41 2011

Copyright (c) 1982, 2005, Oracle. All rights reserved.

BBED> set blocksize 8192
BLOCKSIZE 8192

BBED> set block 123
BLOCK# 123

BBED> map /v
File: system01.dbf (0)
Block: 123 Dba:0x00000000
————————————————————
KTB Data Block (Table/Cluster)

struct kcbh, 20 bytes @0
ub1 type_kcbh @0
ub1 frmt_kcbh @1
ub1 spare1_kcbh @2
ub1 spare2_kcbh @3
ub4 rdba_kcbh @4
ub4 bas_kcbh @8
ub2 wrp_kcbh @12
ub1 seq_kcbh @14
ub1 flg_kcbh @15
ub2 chkval_kcbh @16
ub2 spare3_kcbh @18

struct ktbbh, 48 bytes @20
ub1 ktbbhtyp @20
union ktbbhsid, 4 bytes @24
struct ktbbhcsc, 8 bytes @28
b2 ktbbhict @36
ub1 ktbbhflg @38
ub1 ktbbhfsl @39
ub4 ktbbhfnx @40
struct ktbbhitl[1], 24 bytes @44

struct kdbh, 14 bytes @68
ub1 kdbhflag @68
b1 kdbhntab @69
b2 kdbhnrow @70
sb2 kdbhfrre @72
sb2 kdbhfsbo @74
sb2 kdbhfseo @76
b2 kdbhavsp @78
b2 kdbhtosp @80

struct kdbt[1], 4 bytes @82
b2 kdbtoffs @82
b2 kdbtnrow @84

sb2 kdbr[108] @86

ub1 freespace[873] @302

ub1 rowdata[7013] @1175

ub4 tailchk @8188

ITL记录从 struct ktbbhitl[1], 24 bytes @44 开始

BBED> set offset 61
OFFSET 61

BBED> d
File: system01.dbf (0)
Block: 123 Offsets: 61 to 572 Dba:0x00000000
————————————————————————
200000f1 4ef80000 016c00ff ffea0053 04690369 0300006c 007b1f3a 1ffe1ec1
1e801e40 1ef81db8 1d771d29 1dec1cb0 1c681c27 1ce71ba7 1b681b27 1beb1aaf
1a701a2f 1af319b4 19731937 19f118ae 18671827 18e717a7 1766172b 17e316a2
1661161f 16e315a3 15661529 15e914a8 146c142c 14ec13b0 13731332 13f112b0
126f122f 12ee11a0 11601121 11de109b 105f1023 10e60fa3 0f620f1d 0fd80e9c
0e5a0e14 0ed30d94 0d540d13 0dd30c93 0c570c17 0cd90b98 0b580b15 0bd00a8b
0a460a01 0ac10981 09410901 09c0087d 083e0801 08bc077e 074207fc 06bb0674
063206f0 05aa0564 051e05d8 04950453 04000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

BBED> modify /x 0x80
#modify /x 20 filename ‘system01.dbf’ block 123. offset 61.
sum apply
verify

そのITLトランザクションのlck=1で、lb: 0x1のrow piece記録をlb: 0x0に修正してください。

Traceでその記録は
tab 0, row 26, @0x18f1
tl: 70 fb: –H-FL– lb: 0x1 cc: 17
col 0: [ 2] c1 02
col 1: [ 4] c3 08 61 2c
col 2: [ 1] 80
col 3: [12] 5f 4e 45 58 54 5f 4f 42 4a 45 43 54
col 4: [ 2] c1 02
col 5: *NULL*

set offset 6454

modify /x 0x00
#modify /x 01 filename ‘system01.dbf’ block 123. offset 6454.
sum apply
verify

以上はOBJ$に対するデータブロックの修正が完成した。

データベースを再起動したが、いろんなブロックを見つけた:

alter database open resetlogs
Wed Sep 28 21:42:53 2011
RESETLOGS after complete recovery through change 16228672
Resetting resetlogs activation ID 523466708 (0x1f3377d4)
Online log /u01/oracle/oradata/kmcdb/redo01.log: Thread 1 Group 1 was previously cleared
Online log /u01/oracle/oradata/kmcdb/redo02.log: Thread 1 Group 2 was previously cleared
Wed Sep 28 21:42:54 2011
Setting recovery target incarnation to 12
Wed Sep 28 21:42:54 2011
Assigning activation ID 523426694 (0x1f32db86)
Thread 1 opened at log sequence 1
Current log# 3 seq# 1 mem# 0: /u01/oracle/oradata/kmcdb/redo03.log
Successful open of redo thread 1
Wed Sep 28 21:42:54 2011
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Wed Sep 28 21:42:54 2011
SMON: enabling cache recovery
Wed Sep 28 21:42:54 2011
Dictionary check beginning
Tablespace ‘TEMP’ #3 found in data dictionary,
but not in the controlfile. Adding to controlfile.
Corrected file 5 plugged in read-only status in control file
Dictionary check complete
Wed Sep 28 21:42:54 2011
SMON: enabling tx recovery
Wed Sep 28 21:42:54 2011
*********************************************************************
WARNING: The following temporary tablespaces contain no files.
This condition can occur when a backup controlfile has
been restored. It may be necessary to add files to these
tablespaces. That can be done using the SQL statement:

ALTER TABLESPACE ADD TEMPFILE

Alternatively, if these temporary tablespaces are no longer
needed, then they can be dropped.
Empty temporary tablespace: TEMP
*********************************************************************
Updating character set in controlfile to ZHS16GBK
Wed Sep 28 21:42:54 2011
Errors in file /u01/oracle/admin/kmcdb/bdump/kmcdb_smon_32531.trc:
ORA-00600: internal error code, arguments: [2662], [0], [16273152], [0], [16273158], [4220957], [], []
**********************************************************
WARNING: Files may exists in db_recovery_file_dest
that are not known to the database. Use the RMAN command
CATALOG RECOVERY AREA to re-catalog any such files.
One of the following events caused this:
1. A backup controlfile was restored.
2. A standby controlfile was restored.
3. The controlfile was re-created.
4. db_recovery_file_dest had previously been enabled and
then disabled.
**********************************************************
replication_dependency_tracking turned off (no async multimaster replication found)
WARNING: AQ_TM_PROCESSES is set to 0. System operation might be adversely affected.
Wed Sep 28 21:42:55 2011
LOGSTDBY: Validating controlfile with logical metadata
Wed Sep 28 21:42:55 2011
LOGSTDBY: Validation complete
ORA-01555 caused by SQL statement below (SQL ID: 96g93hntrzjtr, SCN: 0x0000.00f84f0a):
Wed Sep 28 21:42:55 2011
select /*+ rule */ bucket_cnt, row_cnt, cache_cnt, null_cnt, timestamp#, sample_size, minimum,
maximum, distcnt, lowval, hival, density, col#, spare1, spare2, avgcln from hist_head$ where obj#=:1 and intcol#=:2
Global Name changed to KMCDB
Wed Sep 28 21:42:55 2011
Non-fatal internal error happenned while SMON was doing non-existent object cleanup.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Wed Sep 28 21:42:56 2011
Completed: alter database open resetlogs
Wed Sep 28 21:42:57 2011
Hex dump of (file 1, block 61479) in trace file /u01/oracle/admin/kmcdb/bdump/kmcdb_smon_32531.trc
Corrupt block relative dba: 0x0040f027 (file 1, block 61479)
Fractured block found during buffer read
Data in bad block:
type: 6 format: 2 rdba: 0x0040f027
last change scn: 0x0000.00f80906 seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x36a70601
check value in block header: 0xe5d2
computed block checksum: 0x3fa1
Reread of rdba: 0x0040f027 (file 1, block 61479) found same corrupted data
Wed Sep 28 21:42:57 2011
Errors in file /u01/oracle/admin/kmcdb/bdump/kmcdb_smon_32531.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 1, block # 61479)
ORA-01110: data file 1: ‘/u01/oracle/oradata/kmcdb/system01.dbf’
Wed Sep 28 21:42:58 2011
Starting background process CJQ0
CJQ0 started with pid=24, OS id=724
Wed Sep 28 21:43:01 2011
Errors in file /u01/oracle/admin/kmcdb/bdump/kmcdb_m000_720.trc:
ORA-00600: internal error code, arguments: [2662], [0], [16273222], [0], [16273348], [12624044], [], []
Wed Sep 28 21:43:02 2011
Errors in file /u01/oracle/admin/kmcdb/bdump/kmcdb_m000_720.trc:
ORA-00600: internal error code, arguments: [2662], [0], [16273288], [0], [16273348], [12624044], [], []
ORA-00600: internal error code, arguments: [2662], [0], [16273222], [0], [16273348], [12624044], [], []
Wed Sep 28 21:44:35 2011
create temporary tablespace temp02 tempfile ‘/u01/oracle/oradata/temp02.dbf’ size 400M
Wed Sep 28 21:44:35 2011
Completed: create temporary tablespace temp02 tempfile ‘/u01/oracle/oradata/temp02.dbf’ size 400M
Wed Sep 28 21:45:52 2011
create undo tablespace undo03 datafile ‘/u01/oracle/oradata/undo03.dbf’ size 500M
Hex dump of (file 3, block 3839) in trace file /u01/oracle/admin/kmcdb/udump/kmcdb_ora_749.trc
Corrupt block relative dba: 0x00c00eff (file 3, block 3839)
Fractured block found during buffer read
Data in bad block:
type: 6 format: 2 rdba: 0x00c00eff
last change scn: 0x0000.00f7be10 seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x49b20601
check value in block header: 0xe0a3
computed block checksum: 0xf7a2
Reread of rdba: 0x00c00eff (file 3, block 3839) found same corrupted data
Completed: create undo tablespace undo03 datafile ‘/u01/oracle/oradata/undo03.dbf’ size 500M
Wed Sep 28 21:45:58 2011
Errors in file /u01/oracle/admin/kmcdb/bdump/kmcdb_smon_32531.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Wed Sep 28 21:45:58 2011
Non-fatal internal error happenned while SMON was doing Corrupt Block Seg Info.
SMON encountered 2 out of maximum 100 non-fatal internal errors.
Wed Sep 28 21:47:15 2011
ALTER SYSTEM SET undo_management=’AUTO’ SCOPE=SPFILE;
Wed Sep 28 21:47:19 2011
Starting background process EMN0
EMN0 started with pid=22, OS id=757
Wed Sep 28 21:47:19 2011
Shutting down instance: further logons disabled
Wed Sep 28 21:47:19 2011
Stopping background process CJQ0
Wed Sep 28 21:47:19 2011

新しいUndoテーブルスペースを作成して、切り替えて、持続的現れるORA-600トラブルを避けられる。
以上のalert.logで2662と4097内部的なエラはデータベースをcrashさせないが、非常手段で起動したデータベースもう安定していないから、早速にデータをバックアップして、データベースを再構造してください:

run{
set maxcorrupt for datafile 1,2,3,4,5 to 100;
backup as compressed backupset incremental level 0 database;
}

exp system/oracle full=y file=full.dmp

友達はデータベースのデータがとっても大切だと何度も言ったが、その行為がそのデータが実にどうでもいいと示している。
データがすごく大事だと思っているなら、データを大事にしてください。何よりも大事なことはデータバックアップである。バックアップなしにデータが永遠に保てると思わないでください!

ASMがdisk headerをなくした。ORA-15032、ORA-15040、ORA-15042 でDiskgrouがmountできなくなった。

プロのOracle Databaseの復旧サービスを提供
携帯番号: +86 13764045638 メール:service@parnassusdata.com

ASMがdisk headerをなくして、ORA-15032、ORA-15040、ORA-15042で Diskgroupがmountできないインスタンスも少なくない。この文で、どうやって解決できるかを紹介できる。

 

SQL> select * from v$version;

BANNER
——————————————————————————–
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – 64bit Production
PL/SQL Release 11.2.0.3.0 – Production
CORE 11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 – Production
NLSRTL Version 11.2.0.3.0 – Production
SQL> alter diskgroup datadg mount;
alter diskgroup datadg mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk “5” is missing from group number “1”
ERROR: alter diskgroup datadg mount
Wed Mar 13 07:42:03 2013
SQL> alter diskgroup datadg mount
NOTE: cache registered group DATADG number=1 incarn=0xccb845cd
NOTE: cache began mount (first) of group DATADG number=1 incarn=0xccb845cd
NOTE: Assigning number (1,2) to disk (/dev/asm-diskg)
NOTE: Assigning number (1,1) to disk (/dev/asm-diskf)
NOTE: Assigning number (1,0) to disk (/dev/asm-diske)
Wed Mar 13 16:42:09 2013
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 20 for pid 27, osid 5439
NOTE: Assigning number (1,5) to disk ()
GMON querying group 1 at 21 for pid 27, osid 5439
NOTE: cache dismounting (clean) group 1/0xCCB845CD (DATADG)
NOTE: messaging CKPT to quiesce pins Unix process pid: 5439, image: oracle@vmac1 (TNS V1-V3)
NOTE: dbwr not being msg’d to dismount
NOTE: lgwr not being msg’d to dismount
NOTE: cache dismounted group 1/0xCCB845CD (DATADG)
NOTE: cache ending mount (fail) of group DATADG number=1 incarn=0xccb845cd
NOTE: cache deleting context for group DATADG 1/0xccb845cd
GMON dismounting group 1 at 22 for pid 27, osid 5439
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
ERROR: diskgroup DATADG was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk “5” is missing from group number “1”
ERROR: alter diskgroup datadg mount
Wed Mar 13 16:42:10 2013
ASM Health Checker found 1 new failures

 

[grid@vmac1 ~]$ kfed read /dev/asm-diskh
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7FA1DA233400 00000000 00000000 00000000 00000000 […………….]
Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
col path for a20
set linesize 200 pagesize 1400
select path,header_status,state from v$asm_disk;
PATH HEADER_STATUS STATE
——————– ———————————— ————————
/dev/asm-diskh CANDIDATE NORMAL
/dev/asm-diskg MEMBER NORMAL
/dev/asm-diskf MEMBER NORMAL
/dev/asm-diske MEMBER NORMAL
/dev/asm-diskc MEMBER NORMAL
/dev/asm-diskd MEMBER NORMAL
/dev/asm-diskb MEMBER NORMAL

7 rows selected.

[grid@vmac1 ~]$ kfed repair /dev/asm-diskh
KFED-00320: Invalid block num1 = [0], num2 = [1], error = [endian_kfbh]

 

[grid@vmac1 ~]$ kfed repair /dev/asm-diskh ausz=1048576
KFED-00320: Invalid block num1 = [0], num2 = [1], error = [endian_kfbh]
ASMインスタンスを閉める
トラブルが起こったASMのheaderを優先的にバックアップしてください

dd if=<bad disk> of=<file> bs=4096 count=1

  1. 今のディスクにfile 1 block 1を持っているものを探し出す

[grid@vmac1 ~]$ kfed read /dev/asm-diske |grep f1b1
kfdhdb.f1b1locn: 2 ; 0x0d4: 0x00000002

[grid@vmac1 ~]$
[grid@vmac1 ~]$ kfed read /dev/asm-diskf |grep f1b1
kfdhdb.f1b1locn: 0 ; 0x0d4: 0x00000000

[grid@vmac1 ~]$ kfed read /dev/asm-diskg |grep f1b1
kfdhdb.f1b1locn: 0 ; 0x0d4: 0x00000000

 

ここのasm-diskeにf1b1に0を超えた数値が現れた。file 1 block 1が持っていれば、二つ目のauタイプがKFBTYP_LISTHEADかを確認できる。

[grid@vmac1 ~]$ kfed read /dev/asm-diske aun=2|grep kfbh.type
kfbh.type: 5 ; 0x002: KFBTYP_LISTHEAD

 

 

なくしたディスクにfile 1 block 1 F1B1を含んでいれば、KFBTYP_LISTHEADを見つけ出すまで、そのディスクにあるすべてのAUをスキャンする。LISTHEADを探せない場合に、仕方なく diskgroupを再構造しかない。
バーション11.1.0.7から(10gは10.2.0.5から、なるべく10.2.0.5前のASMを使われないでください)各I/OがASM disk header(AU 0 blocknum 0)へ提出するたびに、 AU 1にコピする。最後の二つのブロックは違ったAU sizeによって、そのブロックの位置も異なる。

Allocation Unit Size Block Number on AU 1
1048576 254
4194304 1022
8388608 2046
16777216 4094

 

まずはkfedでその位置に正確なdisk headerがあるかを確認する。さもなければ、人工的にふさわしいheaderを見つけ出してください。 テーブル1は以下の通り:

[grid@vmac1 ~]$ kfed read /dev/asm-diske ausz=1048576 aun=1 blkn=254|less
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 254 ; 0x004: blk=254
kfbh.block.obj: 2147483648 ; 0x008: disk=0
kfbh.check: 2086475720 ; 0x00c: 0x7c5d17c8
kfbh.fcn.base: 31322 ; 0x010: 0x00007a5a
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
後でkfed repairコマンドでdisk headerをリカバリする

[grid@vmac1 ~]$ kfed repair /dev/asm-diskh ausz=1048576

 
もしdisk headerの自動的なバックアップもなくした場合に、以下のエラになる。

KFED-00320: Invalid block num1 = [0], num2 = [1], error = [endian_kfbh]

 

もし以上のkfed repairが無効であれば、人工的にdisk headerをリカバリしてください:

 

10.2.0.5バーションに対して、kred repairが使用可能な情况は以下の通り:

 

トラブルdisk が同じなdiskgroupにあるが、f1b1ファイルを含むdisk headerを探し出してください。例えばここのasm-diskf
[grid@vmac1 ~]$ kfed read /dev/asm-diskf |grep f1b1
kfdhdb.f1b1locn: 0 ; 0x0d4: 0x00000000

 

kfed read <device name> > fix.txt コマンドでヘッダを格納してください
[grid@vmac1 ~]$ kfed read /dev/asm-diskf > fix.txt
fix.txtを編集し、kfdhdb.dsknum 、kfdhdb.dskname 、 kfdhdb.fgname 三つの情報を修正してください:

同時にalert.logの情報を参考してください:

[grid@vmac1 trace]$ grep “cache opening” alert_+ASM1.log
NOTE: cache opening disk 0 of grp 1: SYSTEDG_0000 path:/dev/asm-diskb
NOTE: cache opening disk 1 of grp 1: SYSTEDG_0001 path:/dev/asm-diskc
NOTE: cache opening disk 2 of grp 1: SYSTEDG_0002 path:/dev/asm-diskd
NOTE: cache opening disk 0 of grp 1: SYSTEDG_0000 path:/dev/asm-diskb
NOTE: cache opening disk 1 of grp 1: SYSTEDG_0001 path:/dev/asm-diskc
NOTE: cache opening disk 2 of grp 1: SYSTEDG_0002 path:/dev/asm-diskd
NOTE: cache opening disk 0 of grp 1: SYSTEDG_0000 path:/dev/asm-diskb
NOTE: cache opening disk 1 of grp 1: SYSTEDG_0001 path:/dev/asm-diskc
NOTE: cache opening disk 2 of grp 1: SYSTEDG_0002 path:/dev/asm-diskd
NOTE: cache opening disk 0 of grp 1: SYSTEDG_0000 path:/dev/asm-diskb
NOTE: cache opening disk 1 of grp 1: SYSTEDG_0001 path:/dev/asm-diskc
NOTE: cache opening disk 2 of grp 1: SYSTEDG_0002 path:/dev/asm-diskd
NOTE: cache opening disk 0 of grp 1: SYSTEDG_0000 path:/dev/asm-diskb
NOTE: cache opening disk 1 of grp 1: SYSTEDG_0001 path:/dev/asm-diskc
NOTE: cache opening disk 2 of grp 1: SYSTEDG_0002 path:/dev/asm-diskd
NOTE: cache opening disk 0 of grp 1: SYSTEDG_0000 path:/dev/asm-diskb
NOTE: cache opening disk 1 of grp 1: SYSTEDG_0001 path:/dev/asm-diskc
NOTE: cache opening disk 2 of grp 1: SYSTEDG_0002 path:/dev/asm-diskd
NOTE: cache opening disk 0 of grp 2: DATADG_0000 path:/dev/asm-diske
NOTE: cache opening disk 1 of grp 2: DATADG_0001 path:/dev/asm-diskf
NOTE: cache opening disk 2 of grp 2: DATADG_0002 path:/dev/asm-diskg
NOTE: cache opening disk 0 of grp 2: DATADG_0000 path:/dev/asm-diske
NOTE: cache opening disk 1 of grp 2: DATADG_0001 path:/dev/asm-diskf
NOTE: cache opening disk 2 of grp 2: DATADG_0002 path:/dev/asm-diskg
NOTE: cache opening disk 0 of grp 1: DATADG_0000 path:/dev/asm-diske
NOTE: cache opening disk 1 of grp 1: DATADG_0001 path:/dev/asm-diskf
NOTE: cache opening disk 2 of grp 1: DATADG_0002 path:/dev/asm-diskg
NOTE: cache opening disk 0 of grp 2: SYSTEDG_0000 path:/dev/asm-diskb
NOTE: cache opening disk 1 of grp 2: SYSTEDG_0001 path:/dev/asm-diskc
NOTE: cache opening disk 2 of grp 2: SYSTEDG_0002 path:/dev/asm-diskd
NOTE: cache opening disk 5 of grp 1: DATADG_0005 path:/dev/asm-diskh

 
原fix.txt中的内容:

[grid@vmac1 ~]$ egrep “dsknum|grptyp|hdrsts|dskname|grpname|fgname” fix.txt
kfdhdb.dsknum: 1 ; 0x024: 0x0001
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: DATADG_0001 ; 0x028: length=11
kfdhdb.grpname: DATADG ; 0x048: length=6
kfdhdb.fgname: DATADG_0001 ; 0x068: length=11

 

修正したあと:

[grid@vmac1 ~]$ egrep “dsknum|grptyp|hdrsts|dskname|grpname|fgname” fix.txt
kfdhdb.dsknum: 5 ; 0x024: 0x0005
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: DATADG_0005 ; 0x028: length=11
kfdhdb.grpname: DATADG ; 0x048: length=6
kfdhdb.fgname: DATADG_0005 ; 0x068: length=11

 

そしてkfbh.block.objを修正してください
[grid@vmac1 ~]$ grep kfbh.block.obj fix.txt
kfbh.block.obj: 2147483649 ; 0x008: disk=1

2147483649==》0x80000001
0x80000001 最後の一位はASM DISK NUMBERでkfdhdb.dsknum に等しい。ここでは 0x80000005 ==》 2147483653
[grid@vmac1 ~]$ grep kfbh.block.obj fix.txt
kfbh.block.obj: 2147483653 ; 0x008: disk=5
windowsプラットフォームのASMLIBの場合に、kfdhdb.driver.reserved[0]を使う必要があるから、WindowsのASMLIBを使う人があまりいない。

 

 

 

そしてdisk directory を探し出せるために、aunum=2 blknum=2を確認してください。kfed read は f1b1のdiskの aunum=2 blknum=位置を持っているから:

kfed read <device name> aunum=2 blknum=2 | more
[grid@vmac1 ~]$ kfed read /dev/asm-diske|grep f1b1
kfdhdb.f1b1locn: 2 ; 0x0d4: 0x00000002

 

[grid@vmac1 ~]$ kfed read /dev/asm-diske aunum=2 blknum=2|more
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 4 ; 0x002: KFBTYP_FILEDIR
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 2 ; 0x004: blk=2
kfbh.block.obj: 1 ; 0x008: file=1
kfbh.check: 322527999 ; 0x00c: 0x133962ff
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfffdb.node.incarn: 1 ; 0x000: A=1 NUMM=0x0
kfffdb.node.frlist.number: 4294967295 ; 0x004: 0xffffffff

…………………….
kfffde[0].xptr.au: 3 ; 0x4a0: 0x00000003
kfffde[0].xptr.disk: 0 ; 0x4a4: 0x0000
kfffde[0].xptr.flags: 0 ; 0x4a6: L=0 E=0 D=0 S=0
kfffde[0].xptr.chk: 41 ; 0x4a7: 0x29
kfffde[1].xptr.au: 4294967295 ; 0x4a8: 0xffffffff
kfffde[1].xptr.disk: 65535 ; 0x4ac: 0xffff
kfffde[1].xptr.flags: 0 ; 0x4ae: L=0 E=0 D=0 S=0
kfffde[1].xptr.chk: 42 ; 0x4af: 0x2a
==》disk directory 位于 disk=0 的 aunum=3

NOTE: cache opening disk 0 of grp 2: DATADG_0000 path:/dev/asm-diske ==> あるいはasm-diske
[grid@vmac1 ~]$ kfed read /dev/asm-diske aunum=3 blknum=0|more
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 6 ; 0x002: KFBTYP_DISKDIR
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2 ; 0x008: file=2
kfbh.check: 389127513 ; 0x00c: 0x17319d59
kfbh.fcn.base: 31299 ; 0x010: 0x00007a43
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kffdnd.bnode.incarn: 1 ; 0x000: A=1 NUMM=0x0
kffdnd.bnode.frlist.number: 4294967295 ; 0x004: 0xffffffff
kffdnd.bnode.frlist.incarn: 0 ; 0x008: A=0 NUMM=0x0
kffdnd.overfl.number: 4294967295 ; 0x00c: 0xffffffff
kffdnd.overfl.incarn: 0 ; 0x010: A=0 NUMM=0x0
kffdnd.parent.number: 0 ; 0x014: 0x00000000
kffdnd.parent.incarn: 1 ; 0x018: A=1 NUMM=0x0
kffdnd.fstblk.number: 0 ; 0x01c: 0x00000000
kffdnd.fstblk.incarn: 1 ; 0x020: A=1 NUMM=0x0
kfddde[0].entry.incarn: 1 ; 0x024: A=1 NUMM=0x0
kfddde[0].entry.hash: 0 ; 0x028: 0x00000000
kfddde[0].entry.refer.number:4294967295 ; 0x02c: 0xffffffff
kfddde[0].entry.refer.incarn: 0 ; 0x030: A=0 NUMM=0x0
………………………………………………….

kfddde 構造は disk directory構造で、kfddde[0].entry.incarn A=1だけを持っている記録は割り当てられた記録。A=0とはその記録が削除されたと意味している。

[grid@vmac1 ~]$ grep “kfddde\[5\]” disk.txt
kfddde[5].entry.incarn: 1 ; 0x8e4: A=1 NUMM=0x0
kfddde[5].entry.hash: 5 ; 0x8e8: 0x00000005
kfddde[5].entry.refer.number:4294967295 ; 0x8ec: 0xffffffff
kfddde[5].entry.refer.incarn: 0 ; 0x8f0: A=0 NUMM=0x0
kfddde[5].dsknum: 5 ; 0x8f4: 0x0005
kfddde[5].state: 2 ; 0x8f6: KFDSTA_NORMAL
kfddde[5].ddchgfl: 132 ; 0x8f7: 0x84
kfddde[5].dskname: DATADG_0005 ; 0x8f8: length=11
kfddde[5].fgname: DATADG_0005 ; 0x918: length=11
kfddde[5].crestmp.hi: 32984459 ; 0x938: HOUR=0xb DAYS=0xc MNTH=0x3 YEAR=0x7dd
kfddde[5].crestmp.lo: 2470649856 ; 0x93c: USEC=0x0 MSEC=0xc8 SECS=0x34 MINS=0x24
kfddde[5].failstmp.hi: 0 ; 0x940: HOUR=0x0 DAYS=0x0 MNTH=0x0 YEAR=0x0
kfddde[5].failstmp.lo: 0 ; 0x944: USEC=0x0 MSEC=0x0 SECS=0x0 MINS=0x0
kfddde[5].timer: 0 ; 0x948: 0x00000000
kfddde[5].size: 5120 ; 0x94c: 0x00001400
kfddde[5].srRloc.super.hiStart: 0 ; 0x950: 0x00000000
kfddde[5].srRloc.super.loStart: 0 ; 0x954: 0x00000000
kfddde[5].srRloc.super.length: 0 ; 0x958: 0x00000000
kfddde[5].srRloc.incarn: 0 ; 0x95c: 0x00000000
kfddde[5].dskrprtm: 0 ; 0x960: 0x00000000
kfddde[5].zones[0].start: 0 ; 0x964: 0x00000000
kfddde[5].zones[0].size: 5120 ; 0x968: 0x00001400
kfddde[5].zones[0].used: 2 ; 0x96c: 0x00000002

 

fix.txtを編集することに戻って、 crestmp.hi とcrestmp.lo が以上に示された情報とマッチしてください。

元々

[grid@vmac1 ~]$ egrep “hi|lo” fix.txt
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483653 ; 0x008: disk=5
kfdhdb.crestmp.hi: 32983191 ; 0x0a8: HOUR=0x17 DAYS=0x4 MNTH=0x2 YEAR=0x7dd
kfdhdb.crestmp.lo: 2328519680 ; 0x0ac: USEC=0x0 MSEC=0x299 SECS=0x2c MINS=0x22
kfdhdb.mntstmp.hi: 32984468 ; 0x0b0: HOUR=0x14 DAYS=0xc MNTH=0x3 YEAR=0x7dd
kfdhdb.mntstmp.lo: 1231840256 ; 0x0b4: USEC=0x0 MSEC=0x319 SECS=0x16 MINS=0x12
kfdhdb.fstlocn: 1 ; 0x0cc: 0x00000001
kfdhdb.altlocn: 2 ; 0x0d0: 0x00000002
kfdhdb.f1b1locn: 0 ; 0x0d4: 0x00000000
kfdhdb.grpstmp.hi: 32983191 ; 0x0e4: HOUR=0x17 DAYS=0x4 MNTH=0x2 YEAR=0x7dd
kfdhdb.grpstmp.lo: 2328331264 ; 0x0e8: USEC=0x0 MSEC=0x1e1 SECS=0x2c MINS=0x22
リカバリしたあと

kfdhdb.crestmp.hi: 32984459 ; 0x938: HOUR=0xb DAYS=0xc MNTH=0x3 YEAR=0x7dd
kfdhdb.crestmp.lo: 2470649856 ; 0x93c: USEC=0x0 MSEC=0xc8 SECS=0x34 MINS=0x24
kfdhdb.mntstmp.hi: 32984468 ; 0x0b0: HOUR=0x14 DAYS=0xc MNTH=0x3 YEAR=0x7dd
kfdhdb.mntstmp.lo: 1231840256 ; 0x0b4: USEC=0x0 MSEC=0x319 SECS=0x16 MINS=0x12
そしてkfed mergeコマンドでdisk headerを合併する

kfed merge <device name> text=fix.txt

[grid@vmac1 ~]$ kfed merge /dev/asm-diskh text=fix.txt

 

ASMLIbを使っている場合に、以下のコマンドでheaderのasmli情報をリカバリしてください。

 

/etc/init.d/oracleasm force-renamedisk /dev/sdbg1 <ASMLIB Disk Name>
/etc/init.d/oracleasm scandisks
/etc/init.d/oracleasm listdisks

 

 

そしてASMインスタンスをnomountに起動する

SQL> startup nomount;
SQL> col path for a20
SQL> set linesize 200 pagesize 1400
SQL> select path,header_status,state from v$asm_disk;

PATH HEADER_STATUS STATE
——————– ———————————— ————————
/dev/asm-diskh MEMBER NORMAL
/dev/asm-diskg MEMBER NORMAL
/dev/asm-diskf MEMBER NORMAL
/dev/asm-diske MEMBER NORMAL
/dev/asm-diskc MEMBER NORMAL
/dev/asm-diskd MEMBER NORMAL
/dev/asm-diskb MEMBER NORMAL

7 rows selected.
ヘッダ情報はmemberか否かを確認する

 

 

 

[grid@vmac1 ~]$ kfed read /dev/asm-diskh
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483653 ; 0x008: disk=5
kfbh.check: 3412972861 ; 0x00c: 0xcb6dd53d
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]: 0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]: 0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000

 

そして alter diskgroup トラブルDG mount;

以上のステップにトラブルが起こらない場合にmount dgが無事に進行できる;
SQL>
SQL> alter diskgroup datadg mount;

Diskgroup altered.
NOTE: cache registered group DATADG number=1 incarn=0x01c845f0
NOTE: cache began mount (first) of group DATADG number=1 incarn=0x01c845f0
NOTE: Assigning number (1,5) to disk (/dev/asm-diskh)
NOTE: Assigning number (1,2) to disk (/dev/asm-diskg)
NOTE: Assigning number (1,1) to disk (/dev/asm-diskf)
NOTE: Assigning number (1,0) to disk (/dev/asm-diske)
Wed Mar 13 19:39:49 2013
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 56 for pid 27, osid 8690
NOTE: cache opening disk 0 of grp 1: DATADG_0000 path:/dev/asm-diske
NOTE: F1X0 found on disk 0 au 2 fcn 0.31322
NOTE: cache opening disk 1 of grp 1: DATADG_0001 path:/dev/asm-diskf
NOTE: cache opening disk 2 of grp 1: DATADG_0002 path:/dev/asm-diskg
NOTE: cache opening disk 5 of grp 1: DATADG_0005 path:/dev/asm-diskh
NOTE: cache mounting (first) external redundancy group 1/0x01C845F0 (DATADG)
Wed Mar 13 19:39:49 2013
* allocate domain 1, invalid = TRUE
kjbdomatt send to inst 2
Wed Mar 13 19:39:49 2013
NOTE: attached to recovery domain 1
NOTE: starting recovery of thread=1 ckpt=11.2351 group=1 (DATADG)
NOTE: advancing ckpt for group 1 (DATADG) thread=1 ckpt=11.2351
NOTE: cache recovered group 1 to fcn 0.33763
NOTE: redo buffer size is 256 blocks (1053184 bytes)
Wed Mar 13 19:39:49 2013
NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (DATADG)
NOTE: LGWR found thread 1 closed at ABA 11.2350
NOTE: LGWR mounted thread 1 for diskgroup 1 (DATADG)
NOTE: LGWR opening thread 1 at fcn 0.33763 ABA 12.2351
NOTE: cache mounting group 1/0x01C845F0 (DATADG) succeeded
NOTE: cache ending mount (success) of group DATADG number=1 incarn=0x01c845f0
GMON querying group 1 at 57 for pid 18, osid 2911
Wed Mar 13 19:39:49 2013
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 1
SUCCESS: diskgroup DATADG was mounted
SUCCESS: alter diskgroup datadg mount
Wed Mar 13 19:39:49 2013
NOTE: diskgroup resource ora.DATADG.dg is online
NOTE: diskgroup resource ora.DATADG.dg is updated
Wed Mar 13 19:39:59 2013
NOTE: client PROD1:PROD registered, osid 10169, mbr 0x1
Wed Mar 13 19:40:11 2013
NOTE: ASM client PROD1:PROD disconnected unexpectedly.
NOTE: check client alert log.
NOTE: Trace records dumped in trace file /g01/orabase/diag/asm/+asm/+ASM1/trace/+ASM1_ora_10169.trc

 

 

 
注意 以上のリカバリは約20MBのasm headerがディスクヘッダデータをなくした場合だけに効ける。それより上回った場合にmount diskgroupできなくなるかもしれない。
もしfile numbe=4 Continuing Operations Directory (COD) – ASM file number 4もなくなったであれば、mountできなくなる。以下のインスタンスで、22MBをなくしたから。たとえkfed mergeであっても取り戻せない。
SELECT x.xnum_kffxp “Extent”,
x.au_kffxp “AU”,
x.disk_kffxp “Disk #”,
d.name “Disk name”
FROM x$kffxp x, v$asm_disk_stat d
WHERE x.group_kffxp=d.group_number
and x.disk_kffxp=d.disk_number
and x.group_kffxp=1
and x.number_kffxp=4
ORDER BY 1, 2;

Extent AU Disk # Disk name
———- ———- ———- ——————————————————————————————
0 21 5 DATADG_0005
1 16 1 DATADG_0001
2 33 2 DATADG_0002
3 34 0 DATADG_0000
4 22 5 DATADG_0005
5 34 2 DATADG_0002
6 35 0 DATADG_0000
7 33 1 DATADG_0001

SQL> alter diskgroup datadg dismount;

Diskgroup altered.

[grid@vmac1 ~]$ dd if=/dev/zero of=/dev/asm-diskh bs=1024k count=20
20+0 records in
20+0 records out
20971520 bytes (21 MB) copied, 0.0165823 s, 1.3 GB/s
[grid@vmac1 ~]$ kfed merge /dev/asm-diskh text=fix.txt

SQL> alter diskgroup datadg mount;

Diskgroup altered.

SQL> alter diskgroup datadg mount;

Diskgroup altered.
SQL> alter diskgroup datadg dismount;

Diskgroup altered.

[grid@vmac1 ~]$ dd if=/dev/zero of=/dev/asm-diskh bs=1024k count=21
21+0 records in
21+0 records out
22020096 bytes (22 MB) copied, 0.0182842 s, 1.2 GB/s
[grid@vmac1 ~]$ kfed merge /dev/asm-diskh text=fix.txt
SQL> alter diskgroup datadg mount;

Diskgroup altered.

 

[grid@vmac1 ~]$ dd if=/dev/zero of=/dev/asm-diskh bs=1024k count=22
22+0 records in
22+0 records out
23068672 bytes (23 MB) copied, 0.0312157 s, 739 MB/s
[grid@vmac1 ~]$ kfed merge /dev/asm-diskh text=fix.txt
SQL> alter diskgroup datadg mount;
alter diskgroup datadg mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15130: diskgroup “DATADG” is being dismounted
ORA-15066: offlining disk “DATADG_0005” in group “DATADG” may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26077] [endian_kfbh] [4] [0] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26077] [endian_kfbh] [4] [0] [0 != 1]

 

Errors in file /g01/orabase/diag/asm/+asm/+ASM1/trace/+ASM1_ora_8690.trc:
ORA-15196: invalid ASM block header [kfc.c:26077] [endian_kfbh] [4] [0] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26077] [endian_kfbh] [4] [0] [0 != 1]
ERROR: cache failed to read group=1(DATADG) fn=4 blk=0 from disk(s): 5(DATADG_0005)
ORA-15196: invalid ASM block header [kfc.c:26077] [endian_kfbh] [4] [0] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26077] [endian_kfbh] [4] [0] [0 != 1]
NOTE: cache initiating offline of disk 5 group DATADG
NOTE: process _user8690_+asm1 (8690) initiating offline of disk 5.3915953639 (DATADG_0005) with mask 0x7e in group 1
WARNING: Disk 5 (DATADG_0005) in group 1 in mode 0x7f is now being taken offline on ASM inst 1
NOTE: initiating PST update: grp = 1, dsk = 5/0xe968b5e7, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 108 for pid 27, osid 8690
ERROR: Disk 5 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 1)
WARNING: Offline of disk 5 (DATADG_0005) in group 1 and mode 0x7f failed on ASM inst 1
Wed Mar 13 20:00:56 2013
NOTE: halting all I/Os to diskgroup 1 (DATADG)
System State dumped to trace file /g01/orabase/diag/asm/+asm/+ASM1/trace/+ASM1_ora_8690.trc
NOTE: AMDU dump of disk group DATADG created at /g01/orabase/diag/asm/+asm/+ASM1/trace
ERROR: ORA-15130 signalled during mount of diskgroup DATADG
NOTE: cache dismounting (clean) group 1/0xEB784617 (DATADG)
NOTE: messaging CKPT to quiesce pins Unix process pid: 8690, image: oracle@vmac1 (TNS V1-V3)
NOTE: LGWR doing non-clean dismount of group 1 (DATADG)
NOTE: LGWR sync ABA=18.2360 last written ABA 18.2360
kjbdomdet send to inst 2
detach from dom 1, sending detach message to inst 2
Wed Mar 13 20:00:57 2013
List of instances:
1 2
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 12)
Global Resource Directory partially frozen for dirty detach
* dirty detach – domain 1 invalid = TRUE
0 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
freeing rdom 1
WARNING: dirty detached from domain 1
NOTE: cache dismounted group 1/0xEB784617 (DATADG)
NOTE: cache ending mount (fail) of group DATADG number=1 incarn=0xeb784617
NOTE: cache deleting context for group DATADG 1/0xeb784617
GMON dismounting group 1 at 109 for pid 27, osid 8690
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
ERROR: diskgroup DATADG was not mounted
ORA-15032: not all alterations performed
ORA-15130: diskgroup “DATADG” is being dismounted
ORA-15066: offlining disk “DATADG_0005” in group “DATADG” may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26077] [endian_kfbh] [4] [0] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26077] [endian_kfbh] [4] [0] [0 != 1]
ERROR: alter diskgroup datadg mount
SQL> alter diskgroup datadg mount;
alter diskgroup datadg mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15130: diskgroup “DATADG” is being dismounted
ORA-15066: offlining disk “DATADG_0005” in group “DATADG” may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26077] [endian_kfbh] [4] [0] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26077] [endian_kfbh] [4] [0] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26077] [endian_kfbh] [4] [0] [0 !=1]
ORA-15196: invalid ASM block header [kfc.c:26077] [endian_kfbh] [4] [0] [0 !=1]

[grid@vmac1 trace]$ dd if=/dev/asm-diske of=/dev/asm-diskh bs=4096 skip=3 seek=3 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.000617397 s, 6.6 MB/s

 

[grid@vmac1 trace]$ dd if=/dev/asm-diske of=/dev/asm-diskh bs=4096 skip=4 seek=4 count=1
kfffde[0].xptr.au: 21 ; 0x4a0: 0x00000015
kfffde[0].xptr.disk: 5 ; 0x4a4: 0x0005
kfffde[0].xptr.flags: 0 ; 0x4a6: L=0 E=0 D=0 S=0
kfffde[0].xptr.chk: 58 ; 0x4a7: 0x3a
kfffde[1].xptr.au: 16 ; 0x4a8: 0x00000010
[grid@vmac1 trace]$ dd if=/dev/asm-diske of=/dev/asm-diskh bs=1048576 skip=21 seek=21 count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00296742 s, 353 MB/s

 

ORA-15066: offlining disk “DATADG_0005” in group “DATADG” may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26077] [obj_kfbl] [4] [0] [3 != 4]
ORA-15196: invalid ASM block header [kfc.c:26077] [obj_kfbl] [4] [0] [3 != 4]
ERROR: alter diskgroup datadg mount force

 

 

【Oracle ASMデータリカバリ】ORA-15032、ORA-15075 ASM DISK PATH Problem

プロのOracle Databaseの復旧サービスを提供
携帯番号: +86 13764045638 メール:service@parnassusdata.com

 

SQL> alter diskgroup datadg add disk ‘/dev/asm-diskh’ rebalance power 2;
alter diskgroup datadg add disk ‘/dev/asm-diskh’ rebalance power 2
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15075: disk(s) are not visible cluster-wide

 

SQL> alter diskgroup datadg add disk ‘/dev/asm-diskh’ ;
alter diskgroup datadg add disk ‘/dev/asm-diskh’
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15075: disk(s) are not visible cluster-wide
15032, 00000, “not all alterations performed”
// *Cause: At least one ALTER DISKGROUP action failed.
// *Action: Check the other messages issued along with this summary error.
//

15075, 00000, “disk(s) are not visible cluster-wide”
// *Cause: An ALTER DISKGROUP ADD DISK command specified a disk that could
// not be discovered by one or more nodes in a RAC cluster
// configuration.
// *Action: Determine which disks are causing the problem from the
// GV$ASM_DISK fixed view. Check operating system permissions
// for the device and the storage sub-system configuration on
// each node in a RAC cluster that cannot identify the disk.
//
SQL> alter diskgroup datadg add disk ‘/dev/asm-diskh’ ;

Diskgroup altered.
1 node

1* select GROUP_NUMBER,DISK_NUMBER,path,HEADER_STATUS from v$ASM_DISK order by 1,2
SQL> /

GROUP_NUMBER DISK_NUMBER PATH HEADER_STATUS
———— ———– ——————– ————————————
1 0 /dev/asm-diske MEMBER
1 1 /dev/asm-diskf MEMBER
1 2 /dev/asm-diskg MEMBER
1 5 /dev/asm-diskh MEMBER
2 0 /dev/asm-diskb MEMBER
2 1 /dev/asm-diskc MEMBER
2 2 /dev/asm-diskd MEMBER

7 rows selected.

 

SQL> select GROUP_NUMBER,DISK_NUMBER,path,HEADER_STATUS from v$ASM_DISK order by 1,2;

GROUP_NUMBER DISK_NUMBER PATH HEADER_STATUS
———— ———– ——————– ————————————
1 0 /dev/asm-diske MEMBER
1 1 /dev/asm-diskf MEMBER
1 2 /dev/asm-diskg MEMBER
1 5 /dev/asm-diskz MEMBER
2 0 /dev/asm-diskb MEMBER
2 1 /dev/asm-diskc MEMBER
2 2 /dev/asm-diskd MEMBER

7 rows selected.

 

NOTE: disk validation pending for group 1/0x5448421c (DATADG)
SUCCESS: validated disks for 1/0x5448421c (DATADG)
NOTE: disk validation pending for group 1/0x5448421c (DATADG)
NOTE: Assigning number (1,5) to disk (/dev/asm-diskz)
SUCCESS: validated disks for 1/0x5448421c (DATADG)
NOTE: membership refresh pending for group 1/0x5448421c (DATADG)
Tue Mar 12 20:36:59 2013
GMON querying group 1 at 7 for pid 18, osid 2763
NOTE: cache opening disk 5 of grp 1: DATADG_0005 path:/dev/asm-diskz
GMON querying group 1 at 8 for pid 18, osid 2763
SUCCESS: refreshed membership for 1/0x5448421c (DATADG)
Tue Mar 12 20:37:02 2013
NOTE: Attempting voting file refresh on diskgroup DATADG
Tue Mar 12 20:37:06 2013
ALTER SYSTEM SET local_listener=’ (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.184)(PORT=1521))))’ SCOPE=MEMORY SID=’+ASM2′;
Tue Mar 12 20:37:54 2013
NOTE: membership refresh pending for group 1/0x5448421c (DATADG)
Tue Mar 12 20:37:56 2013
GMON querying group 1 at 9 for pid 18, osid 2763
SUCCESS: refreshed membership for 1/0x5448421c (DATADG)
NOTE: Attempting voting file refresh on diskgroup DATADG

 

 

 

 

[grid@vmac2 ~]$ kfed read /dev/asm-diskz
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483653 ; 0x008: disk=5
kfbh.check: 1582693660 ; 0x00c: 0x5e55fd1c
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]: 0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]: 0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum: 5 ; 0x024: 0x0005
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: DATADG_0005 ; 0x028: length=11
kfdhdb.grpname: DATADG ; 0x048: length=6
kfdhdb.fgname: DATADG_0005 ; 0x068: length=11
kfdhdb.capname: ; 0x088: length=0
kfdhdb.crestmp.hi: 32984459 ; 0x0a8: HOUR=0xb DAYS=0xc MNTH=0x3 YEAR=0x7dd
kfdhdb.crestmp.lo: 2470649856 ; 0x0ac: USEC=0x0 MSEC=0xc8 SECS=0x34 MINS=0x24
kfdhdb.mntstmp.hi: 32984459 ; 0x0b0: HOUR=0xb DAYS=0xc MNTH=0x3 YEAR=0x7dd
kfdhdb.mntstmp.lo: 2470650880 ; 0x0b4: USEC=0x0 MSEC=0xc9 SECS=0x34 MINS=0x24
kfdhdb.secsize: 512 ; 0x0b8: 0x0200
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.mfact: 113792 ; 0x0c0: 0x0001bc80
kfdhdb.dsksize: 5120 ; 0x0c4: 0x00001400
kfdhdb.pmcnt: 2 ; 0x0c8: 0x00000002
kfdhdb.fstlocn: 1 ; 0x0cc: 0x00000001
kfdhdb.altlocn: 2 ; 0x0d0: 0x00000002
kfdhdb.f1b1locn: 0 ; 0x0d4: 0x00000000
kfdhdb.redomirrors[0]: 0 ; 0x0d8: 0x0000
kfdhdb.redomirrors[1]: 0 ; 0x0da: 0x0000
kfdhdb.redomirrors[2]: 0 ; 0x0dc: 0x0000
kfdhdb.redomirrors[3]: 0 ; 0x0de: 0x0000
kfdhdb.dbcompat: 168820736 ; 0x0e0: 0x0a100000
kfdhdb.grpstmp.hi: 32983191 ; 0x0e4: HOUR=0x17 DAYS=0x4 MNTH=0x2 YEAR=0x7dd
kfdhdb.grpstmp.lo: 2328331264 ; 0x0e8: USEC=0x0 MSEC=0x1e1 SECS=0x2c MINS=0x22
kfdhdb.vfstart: 0 ; 0x0ec: 0x00000000
kfdhdb.vfend: 0 ; 0x0f0: 0x00000000
kfdhdb.spfile: 0 ; 0x0f4: 0x00000000
kfdhdb.spfflg: 0 ; 0x0f8: 0x00000000
kfdhdb.ub4spare[0]: 0 ; 0x0fc: 0x00000000
kfdhdb.ub4spare[1]: 0 ; 0x100: 0x00000000
kfdhdb.ub4spare[2]: 0 ; 0x104: 0x00000000
kfdhdb.ub4spare[3]: 0 ; 0x108: 0x00000000
kfdhdb.ub4spare[4]: 0 ; 0x10c: 0x00000000
kfdhdb.ub4spare[5]: 0 ; 0x110: 0x00000000
kfdhdb.ub4spare[6]: 0 ; 0x114: 0x00000000
kfdhdb.ub4spare[7]: 0 ; 0x118: 0x00000000
kfdhdb.ub4spare[8]: 0 ; 0x11c: 0x00000000
kfdhdb.ub4spare[9]: 0 ; 0x120: 0x00000000
kfdhdb.ub4spare[10]: 0 ; 0x124: 0x00000000
kfdhdb.ub4spare[11]: 0 ; 0x128: 0x00000000
kfdhdb.ub4spare[12]: 0 ; 0x12c: 0x00000000
kfdhdb.ub4spare[13]: 0 ; 0x130: 0x00000000
kfdhdb.ub4spare[14]: 0 ; 0x134: 0x00000000
kfdhdb.ub4spare[15]: 0 ; 0x138: 0x00000000
kfdhdb.ub4spare[16]: 0 ; 0x13c: 0x00000000
kfdhdb.ub4spare[17]: 0 ; 0x140: 0x00000000
kfdhdb.ub4spare[18]: 0 ; 0x144: 0x00000000
kfdhdb.ub4spare[19]: 0 ; 0x148: 0x00000000
kfdhdb.ub4spare[20]: 0 ; 0x14c: 0x00000000
kfdhdb.ub4spare[21]: 0 ; 0x150: 0x00000000
kfdhdb.ub4spare[22]: 0 ; 0x154: 0x00000000
kfdhdb.ub4spare[23]: 0 ; 0x158: 0x00000000
kfdhdb.ub4spare[24]: 0 ; 0x15c: 0x00000000
kfdhdb.ub4spare[25]: 0 ; 0x160: 0x00000000
kfdhdb.ub4spare[26]: 0 ; 0x164: 0x00000000
kfdhdb.ub4spare[27]: 0 ; 0x168: 0x00000000
kfdhdb.ub4spare[28]: 0 ; 0x16c: 0x00000000
kfdhdb.ub4spare[29]: 0 ; 0x170: 0x00000000
kfdhdb.ub4spare[30]: 0 ; 0x174: 0x00000000
kfdhdb.ub4spare[31]: 0 ; 0x178: 0x00000000
kfdhdb.ub4spare[32]: 0 ; 0x17c: 0x00000000
kfdhdb.ub4spare[33]: 0 ; 0x180: 0x00000000
kfdhdb.ub4spare[34]: 0 ; 0x184: 0x00000000
kfdhdb.ub4spare[35]: 0 ; 0x188: 0x00000000
kfdhdb.ub4spare[36]: 0 ; 0x18c: 0x00000000
kfdhdb.ub4spare[37]: 0 ; 0x190: 0x00000000
kfdhdb.ub4spare[38]: 0 ; 0x194: 0x00000000
kfdhdb.ub4spare[39]: 0 ; 0x198: 0x00000000
kfdhdb.ub4spare[40]: 0 ; 0x19c: 0x00000000
kfdhdb.ub4spare[41]: 0 ; 0x1a0: 0x00000000
kfdhdb.ub4spare[42]: 0 ; 0x1a4: 0x00000000
kfdhdb.ub4spare[43]: 0 ; 0x1a8: 0x00000000
kfdhdb.ub4spare[44]: 0 ; 0x1ac: 0x00000000
kfdhdb.ub4spare[45]: 0 ; 0x1b0: 0x00000000
kfdhdb.ub4spare[46]: 0 ; 0x1b4: 0x00000000
kfdhdb.ub4spare[47]: 0 ; 0x1b8: 0x00000000
kfdhdb.ub4spare[48]: 0 ; 0x1bc: 0x00000000
kfdhdb.ub4spare[49]: 0 ; 0x1c0: 0x00000000
kfdhdb.ub4spare[50]: 0 ; 0x1c4: 0x00000000
kfdhdb.ub4spare[51]: 0 ; 0x1c8: 0x00000000
kfdhdb.ub4spare[52]: 0 ; 0x1cc: 0x00000000
kfdhdb.ub4spare[53]: 0 ; 0x1d0: 0x00000000
kfdhdb.acdb.aba.seq: 0 ; 0x1d4: 0x00000000
kfdhdb.acdb.aba.blk: 0 ; 0x1d8: 0x00000000
kfdhdb.acdb.ents: 0 ; 0x1dc: 0x0000
kfdhdb.acdb.ub2spare: 0 ; 0x1de: 0x0000

 

 

 

 

 

以上のORA-15032、ORA-15075 ASM DISK PATH ProblemはLUN配置が間違えたから。LUNがRACの複数のnodeで見えることによるものである。

 

このトラブルに対して、まずは合理的なファイル権限を設定したことを確認してください。

そして以下のコマンドを実行してください:

 

 

echo “======================” > fakedata.txt

date >> fakedata.txt

echo “======================” >> fakedata.txt

 

dd if=/home/oracle/test/fakedata.txt of=テストしたい設備の名前 bs=4096 count=1 conv=notrunc

 

後で使うodコマンドは複数のnodeで命令在多个节点上查看该设备上的内容即可确认在集群环境中该设备是否指向同一个LUN

 

od -c テストしたい設備の名前 | more

 

【ASMデータリカバリ】どうやってASM Disk header_status=FORMERのディスクをDiskgroupに加えて、 ORA-15017 ORA-15063 ORA-15032になった

プロのOracle Databaseの復旧サービスを提供
携帯番号: +86 13764045638 メール:service@parnassusdata.com

 

以下の状況が現れるかもしれない。誤操作でDISKGROUPをDROPしたあるいは。或Bug 13331814: ASM DISKS TURNED INTO FORMER WHILE DISKGROUP IS MOUNTEDでASM DISKのheader_status=FORMERがまともなMEMBER状態ではない。

 

例えば

 

 

[oracle@mlab2 ~]$ sqlplus  / as sysasm

 

SQL*Plus: Release 11.2.0.3.0 Production on Tue Nov 19 21:55:09 2013

 

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

 

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – 64bit Production

With the Automatic Storage Management option

 

SQL>  create diskgroup maclean external redundancy disk ‘/dev/asm-disk9′;

 

Diskgroup created.

 

SQL> select group_number,name,state from v$asm_diskgroup;

 

GROUP_NUMBER NAME                           STATE

———— —————————— ———–

1 DATA                           MOUNTED

2 MACLEAN                        MOUNTED

 

SQL> col path for a40

SQL> select name,path,header_status from v$asm_disk where group_number=2;

 

NAME                           PATH

—————————— —————————————-

HEADER_STATU

————

MACLEAN_0000                   /dev/asm-disk9

MEMBER

 

 

 

 

SQL> drop diskgroup maclean;

 

Diskgroup dropped.

 

SQL> alter diskgroup maclean mount;

alter diskgroup maclean mount

*

ERROR at line 1:

ORA-15032: not all alterations performed

ORA-15017: diskgroup “MACLEAN” cannot be mounted

ORA-15063: ASM discovered an insufficient number of disks for diskgroup

“MACLEAN”

 

SQL> select name,path,header_status from v$asm_disk where path=’/dev/asm-disk9’;

 

NAME                           PATH

—————————— —————————————-

HEADER_STATU

————

/dev/asm-disk9

FORMER

 

 

[oracle@mlab2 ~]$ kfed read /dev/asm-disk9 |head -25

kfbh.endian:                          1 ; 0x000: 0x01

kfbh.hard:                          130 ; 0x001: 0x82

kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD

kfbh.datfmt:                          1 ; 0x003: 0x01

kfbh.block.blk:                       0 ; 0x004: blk=0

kfbh.block.obj:              2147483648 ; 0x008: disk=0

kfbh.check:                   554377417 ; 0x00c: 0x210b20c9

kfbh.fcn.base:                        0 ; 0x010: 0x00000000

kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000

kfbh.spare1:                          0 ; 0x018: 0x00000000

kfbh.spare2:                          0 ; 0x01c: 0x00000000

kfdhdb.driver.provstr:      ORCLDISK  ; 0x000: length=11

kfdhdb.driver.reserved[0]:        65796 ; 0x008: 0x00010104

kfdhdb.driver.reserved[1]:            1 ; 0x00c: 0x00000001

kfdhdb.driver.reserved[2]:      4206569 ; 0x010: 0x00402fe9

kfdhdb.driver.reserved[3]:      3367865 ; 0x014: 0x003363b9

kfdhdb.driver.reserved[4]:    196018176 ; 0x018: 0x0baf0000

kfdhdb.driver.reserved[5]:    390595073 ; 0x01c: 0x17480201

kfdhdb.compat:                168820736 ; 0x020: 0x0a100000

kfdhdb.dsknum:                        0 ; 0x024: 0x0000

kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL

kfdhdb.hdrsts:                        4 ; 0x027: KFDHDR_FORMER

kfdhdb.dskname:            MACLEAN_0000 ; 0x028: length=12

kfdhdb.grpname:                 MACLEAN ; 0x048: length=7

kfdhdb.fgname:             MACLEAN_0000 ; 0x068: length=12

 

 

 

ここのkfdhdb.hdrsts: 4 ; 0x027: KFDHDR_FORMER はそのディスクの状態はFORMERと意味している。

 

 

まずは該当する ASM DISKのheaderをバックアップする

 

[oracle@mlab2 ~]$ mkdir /tmp/asm

[oracle@mlab2 ~]$ dd if=/dev/asm-disk9 of=/tmp/asm/asm-disk9-header bs=1024k count=20

 

[oracle@mlab2 ~]$ kfed read /dev/asm-disk9 > /tmp/asm/asm-disk9-meta

 

[oracle@mlab2 ~]$ ls -l /tmp/asm/asm-disk9-meta

-rw-r–r– 1 oracle oinstall 6597 Nov 20 01:26 /tmp/asm/asm-disk9-meta

 

修改asm-disk9-meta这个文本中的内容

 

kfdhdb.hdrsts: 4 ; 0x027: KFDHDR_FORMER

 

に改正

 

kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER

 

以下の    2图の変更のように:

hdrsts

 

 

 

 

 

 

 

 

 

 

hdrsts2

後で修正できたあとメタデータ情報テキストを使って、ASM DISKをpatchすることができるようになった、具体的なコマンドは以下の通り:

 

 

[oracle@mlab2 ~]$ kfed merge /dev/asm-disk9 text=/tmp/asm/asm-disk9-meta

 

再び確認する

 

[oracle@mlab2 ~]$ kfed read /dev/asm-disk9 |grep hdrsts

kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER

 

 

最後でそのディスクに該当するDISKGROUPをMOUNTする:

 

SQL> alter diskgroup maclean mount;

 

Diskgroup altered.

 

SQL> select name,state from v$asm_diskgroup;

 

NAME                           STATE

—————————— ———–

DATA                           MOUNTED

MACLEAN                        MOUNTED

 

【Oracleデータベースリカバリ】ORA-00600[25026】エラ解析

 

プロのOracle Databaseの復旧サービスを提供
携帯番号: +86 13764045638 メール:service@parnassusdata.com

 

 

一般的にORA-00600[25026]エラはORACLEがあるtablespaceテーブルスペースをテストするときに、使えないtablespace IDあるいはRDBAで引き起こされたものである。その二つの変数の意味は:

Arg [a] — tablespace ID
Arg [b] — rdba

 

ORA-00600[25026]エラはKernel File management Tablespace componentモジュールに属して、可能な影響はインディクスを失敗させるあるいはブロックを物理的に壊すなど。

関連するBUGリストは以下の通り:

 

NB Bug Fixed Description
13503554 11.2.0.4, 12.2.0.0 Various ORA-600 errors crashing the apply process in a downstreams environment
17604137 12.1.0.2, 12.2.0.0 ORA-600 [25026] when running query on table being dropped
12912297 11.2.0.3.BP07, 11.2.0.4, 12.1.0.1 CREATE GLOBAL TEMPORARY TABLE incorrectly allows a tablespace group causing ORA-600 [25026] on insert
+ 9399991 11.1.0.7.5, 11.2.0.1.3, 11.2.0.1.BP04, 11.2.0.2, 12.1.0.1 Assorted Internal Errors and Dumps (mostly under kkpa*/kcb*) from SQL against partitioned tables
8630146 10.2.0.5, 11.2.0.1 OERI [25026] / OERI [25012] by SMON while recovering a transaction against a dropped tablespace
6249879 10.2.0.5, 11.2.0.1 OERI [kcbgcur_1] / [25026] / ORA-1502 from DML on table with CONSTRAINTS
4545196 10.2.0.4, 11.1.0.6 Corrupt index leaf by RMAN CONVERT from cross platform transport of compress-key indexes

 

【Oracleデータリカバリ】BBEDでORA-600[4193]とORA-600[4194]をリカバリした例

プロのOracle Databaseの復旧サービスを提供
携帯番号: +86 13764045638 メール:service@parnassusdata.com

 

ORA-600[4193]の原因はORA-600[4194]の場合に似ている。Oracleがredo記録とrollback/undoデータが一致していないから。プロセスundo blocのundo block sequence number及び該当するredo block sequence numberをテストするときに一致していないと気づき、エラになる。 ORA-600 [4193] “seq# mismatch while adding undo record”と呼ばれている。

ORA-600[4913]エラの二つのargumentの。そのORA-600[4913]はカーネルundoトランザクションモジュールに属している

[a] Undo record seq number

[b] Redo record seq number

 

ORA-600[4913]に関するbugは以下の通り:

14034244 11.2.0.3.BP09,  12.1.0.0 Lost write type corruption using ASM in 11.2.0.3

8240762 10.2.0.5,11.1.0.7.10,11.2.0.1 Undo corruptions with ORA-600 [4193]/ORA-600 [4194] or ORA-600 [4137] /SMON may spin to recover transaction

もし4193/4194 に関するundo/rollback blockがシステムセグメントをロールバックするときに、隠しバラメタを使ってもこのトラブルを避けられない。_CORRUPTED_ROLLBACK_SEGMENTS隠しバラメタ。

人工的にBBEDでリカバリする。以下は具体的な例:

 

 

以下はsystem ロールバックセグメントヘッダで、その位置はfile 1 block# 9一部のダンプ情報

TRN CTL:: seq: 0x003a chd: 0x0017 ctl: 0x0052 inc: 0x00000000 nfb: 0x0001

mgc: 0x8002 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)

uba: 0x00400197.003a.02 scn: 0x0000.004fbbf0

Version: 0x01

FREE BLOCK POOL::

uba: 0x00400197.003a.02 ext: 0x4  spc: 0x1dd2

uba: 0x00000000.0037.05 ext: 0x1  spc: 0x1d6c

uba: 0x00000000.0035.37 ext: 0x5  spc: 0x538

uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0

 

  1. まずはBBEDで使う:

 

cd $ORACLE_HOME/rdbms/lib

make -f ins_rdbms.mk `pwd`/bbed

mv bbed $ORACLE_HOME/bin

 

  1. システムテーブルスペースを含むデータファイルの情報をfile.lisに格納する:

 

例のfile.lis

 

1 /oradata/s102/system01.dbf 524288000

 

  1. bbedのバラメタファイルbbed.parを作成する

 

bbed.par 内容は:

 

MODE=EDIT

LISTFILE=

BLOCKSIZE=<db_block_size>

 

この例のbbed.parの内容

 

MODE=EDIT

LISTFILE=file.lis

BLOCKSIZE=8192

 

  1. bbedを実行 する。そのパスワードはblockedit

 

$ bbed parfile=bbed.par

Password:

 

BBED: Release 2.0.0.0.0 – Limited Production on Thu Sep 27 10:06:25 2007

 

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

 

************* !!! For Oracle Internal Use only !!! ***************

 

BBED>

 

  1. 今のブロックをsystem rollback segmentのheader block と設定してください。

 

  1. Run map to see the C structures for the block and the DBA:

 

BBED> map

File: /oradata/s102/system01.dbf (1)

Block: 9                                     Dba:0x00400009

————————————————————

Unlimited Undo Segment Header

 

struct kcbh, 20 bytes                      @0

 

struct ktech, 72 bytes                     @20

 

struct ktemh, 16 bytes                     @92

 

struct ktetb[6], 48 bytes                  @108

 

struct ktuxc, 104 bytes                    @4148

 

struct ktuxe[255], 10200 bytes             @4252

 

ub4 tailchk                                @8188

 

Note that dba=0x00400009 is file 1 block 9, so we are positioned in the correct block.

 

  1. Print the structure ktuxc:

 

BBED> print ktuxc

struct ktuxc, 104 bytes                     @4148

struct ktuxcscn, 8 bytes                 @4148

ub4 kscnbas                           @4148     0x004fbbf1

ub2 kscnwrp                           @4152     0x0000

struct ktuxcuba, 8 bytes                 @4156

ub4 kubadba                           @4156     0x00400197

ub2 kubaseq                           @4160     0x003a

ub1 kubarec                           @4162     0x03

sb2 ktuxcflg                             @4164     1 (KTUXCFSK)

ub2 ktuxcseq                             @4166     0x003a

sb2 ktuxcnfb                             @4168     1

ub4 ktuxcinc                             @4172     0x00000000

sb2 ktuxcchd                             @4176     6

sb2 ktuxcctl                             @4178     23

ub2 ktuxcmgc                             @4180     0x8002

ub4 ktuxcopt                             @4188     0x7ffffffe

struct ktuxcfbp[0], 12 bytes             @4192

struct ktufbuba, 8 bytes              @4192

ub4 kubadba                        @4192     0x00400197

ub2 kubaseq                        @4196     0x003a

ub1 kubarec                        @4198     0x0c

sb2 ktufbext                          @4200     4

sb2 ktufbspc                          @4202     5630

 

  1. Modify ktuxc.ktuxcnfb to 0x0000

 

BBED> set offset ktuxc.ktuxcnfb

OFFSET          4168

 

BBED> print

ktuxc.ktuxcnfb

————–

sb2 ktuxcnfb                                @4168     1

 

BBED> modify 0x0000

File: /oradata/s102/system01.dbf (1)

Block: 9                Offsets: 4168 to 4679           Dba:0x00400009

————————————————————————

00000000 00000000 06001700 02800100 68000000 feffff7f 97014000 3a000c00

0400fe15 00000000 37000500 01006c1d 00000000 35003700 05003805 00000000

00000000 00000000 00000000 00000000 00000000 30000000 93014000 191f5300

00000000 09005f00 00000000 00000000 00000000 01000000 00000000 31000000

96014000 a03e5b00 00000000 09005c00 00000000 00000000 00000000 01000000

00000000 31000000 96014000 9e3e5b00 00000000 09000e00 00000000 00000000

00000000 01000000 00000000 30000000 93014000 f4bb4f00 00000000 09001600

00000000 00000000 00000000 01000000 00000000 31000000 96014000 c13a5b00

00000000 09004800 00000000 00000000 00000000 01000000 00000000 31000000

96014000 983e5b00 00000000 09006000 00000000 00000000 00000000 01000000

00000000 30000000 93014000 f2bb4f00 00000000 09001400 00000000 00000000

00000000 01000000 00000000 31000000 96014000 933e5b00 00000000 09006100

00000000 00000000 00000000 01000000 00000000 31000000 96014000 8d3e5b00

00000000 09004700 00000000 00000000 00000000 01000000 00000000 30000000

94014000 87d15900 00000000 09002100 00000000 00000000 00000000 01000000

00000000 30000000 94014000 211f5300 00000000 09001d00 00000000 00000000

 

  1. Modify ktuxc.ktuxcfbp[0].ktufbuba to 0x00000000

 

BBED> set offset ktuxc.ktuxcfbp[0].ktufbuba

OFFSET          4192

 

BBED> print

ktuxc.ktuxcfbp[0].ktufbuba.kubadba

———————————-

ub4 kubadba                                 @4192     0x00400197

 

BBED> modify 0x00000000

File: /oradata/s102/system01.dbf (1)

Block: 9                Offsets: 4192 to 4703           Dba:0x00400009

————————————————————————

00000000 3a000c00 0400fe15 00000000 37000500 01006c1d 00000000 35003700

05003805 00000000 00000000 00000000 00000000 00000000 00000000 30000000

93014000 191f5300 00000000 09005f00 00000000 00000000 00000000 01000000

00000000 31000000 96014000 a03e5b00 00000000 09005c00 00000000 00000000

00000000 01000000 00000000 31000000 96014000 9e3e5b00 00000000 09000e00

00000000 00000000 00000000 01000000 00000000 30000000 93014000 f4bb4f00

00000000 09001600 00000000 00000000 00000000 01000000 00000000 31000000

96014000 c13a5b00 00000000 09004800 00000000 00000000 00000000 01000000

00000000 31000000 96014000 983e5b00 00000000 09006000 00000000 00000000

00000000 01000000 00000000 30000000 93014000 f2bb4f00 00000000 09001400

00000000 00000000 00000000 01000000 00000000 31000000 96014000 933e5b00

00000000 09006100 00000000 00000000 00000000 01000000 00000000 31000000

96014000 8d3e5b00 00000000 09004700 00000000 00000000 00000000 01000000

00000000 30000000 94014000 87d15900 00000000 09002100 00000000 00000000

00000000 01000000 00000000 30000000 94014000 211f5300 00000000 09001d00

00000000 00000000 00000000 01000000 00000000 30000000 93014000 0d1f5300

 

BBED>

 

  1. Disable the block Checksum by changing the kcbh.flg_kcbh-4 and kcbh.chkval_kcbh to 0x0000:

 

BBED> map

File: /oradata/s102/system01.dbf (1)

Block: 9                                     Dba:0x00400009

————————————————————

Unlimited Undo Segment Header

 

struct kcbh, 20 bytes                      @0

 

struct ktech, 72 bytes                     @20

 

struct ktemh, 16 bytes                     @92

 

struct ktetb[6], 48 bytes                  @108

 

struct ktuxc, 104 bytes                    @4148

 

struct ktuxe[255], 10200 bytes             @4252

 

ub4 tailchk                                @8188

 

BBED> print kcbh

struct kcbh, 20 bytes                       @0

ub1 type_kcbh                            @0        0x0e

ub1 frmt_kcbh                            @1        0xa2

ub1 spare1_kcbh                          @2        0x00

ub1 spare2_kcbh                          @3        0x00

ub4 rdba_kcbh                            @4        0x00400009

ub4 bas_kcbh                             @8        0x005b3f76

ub2 wrp_kcbh                             @12       0x0000

ub1 seq_kcbh                             @14       0x01

ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)

ub2 chkval_kcbh                          @16       0xe264

ub2 spare3_kcbh                          @18       0x0000

 

BBED> set offset kcbh.flg_kcbh

OFFSET          15

 

BBED> print

kcbh.flg_kcbh

————-

ub1 flg_kcbh                                @15       0x04 (KCBHFCKV)

 

BBED> modify 0x00

File: /oradata/s102/system01.dbf (1)

Block: 9                Offsets:   15 to  526           Dba:0x00400009

————————————————————————

0064e200 00000000 00000000 00000000 00000000 00060000 002f0000 00201000

00040000 00060000 00080000 00970140 00000000 00040000 00000000 00000000

00000000 00000000 00000000 00060000 00000000 00000000 00000000 400a0040

00070000 00110040 00080000 00810140 00080000 00890140 00080000 00910140

00080000 00990140 00080000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 

BBED> set offset kcbh.chkval_kcbh

OFFSET          16

 

BBED> print

kcbh.chkval_kcbh

—————-

ub2 chkval_kcbh                             @16       0xe264

 

BBED> modify 0x0000

File: /oradata/s102/system01.dbf (1)

Block: 9                Offsets:   16 to  527           Dba:0x00400009

————————————————————————

00000000 00000000 00000000 00000000 00000000 06000000 2f000000 20100000

04000000 06000000 08000000 97014000 00000000 04000000 00000000 00000000

00000000 00000000 00000000 06000000 00000000 00000000 00000040 0a004000

07000000 11004000 08000000 81014000 08000000 89014000 08000000 91014000

08000000 99014000 08000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 

  1. Verify the the block has no corruptions:

 

BBED> verify

DBVERIFY – Verification starting

FILE = /oradata/s102/system01.dbf

BLOCK = 9

 

DBVERIFY – Verification complete

 

Total Blocks Examined         : 1

Total Blocks Processed (Data) : 0

Total Blocks Failing   (Data) : 0

Total Blocks Processed (Index): 0

Total Blocks Failing   (Index): 0

Total Blocks Empty            : 0

Total Blocks Marked Corrupt   : 0

Total Blocks Influx           : 0

 

  1. exit, open the database and shrink the system rollback segment:

 

BBED> exit

 

[oracle@arem example]$ sqlplus / as sysdba

 

SQL*Plus: Release 10.2.0.3.0 – Production on Thu Sep 27 10:28:00 2007

 

Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.

 

Connected to an idle instance.

 

SQL> startup

ORACLE instance started.

 

Total System Global Area  167772160 bytes

Fixed Size                  1260696 bytes

Variable Size              62915432 bytes

Database Buffers          100663296 bytes

Redo Buffers                2932736 bytes

Database mounted.

Database opened.

 

SQL> alter rollback segment system shrink;

 

Rollback segment altered.

 

SQL>

 

【Oracleデータリカバリ】ORA-00600:[2846] ORA-01498

プロのOracle Databaseの復旧サービスを提供
携帯番号: +86 13764045638 メール:service@parnassusdata.com

 

 

updatable snapshotを使っていて、C_MLOG# clusterもこわれた時に現れる

ORA-00600: internal error code, arguments: [2846], [1], [65535], [1], [8192], [8192]
kdcchk: error when looking at index with key:
col 0; len 16; (16): 43 4c 49 45 4e 54 5f 47 52 50 5f 45 4e 54 52 59
col 1; len 6; (6):44 41 44 4d 49 4e
kdcchk: index points to block 0x00401667 slot 0x0 chain length is 1
kdcchk: chain pointer 0x00401667.0 points to row which does not match!

Running ANALYZE CLUSTER C_MLOG# VALIDATE STRUCTURE causes:
ORA-01498: block check failure – see trace file
kdbchk: the amount of space used is not equal to block size
used=413 fsc=838 avsp=6823 dtl=8096
Block header dump: rdba: 0x00401667
Object id on Block? Y
seg/obj: 0x84 csc: 0x00.5ac50c6 itc: 2 flg: O typ: 1 – DATA
fsl: 0 fnx: 0x4016de ver: 0x01

これはC_MLOG#クラスタが壊されたと意味している;具体的な解決策は:
1、クラスタテーブルからデータを引き出す
2、元のテーブル、クラスタインディクス及びクラスタをDROPする
3、これらのclusterを再構造する
4、データを元のテーブルに戻らせる。

【Oracleデータリカバリ】ORA-00600[2662]エラ解析

プロのOracle Databaseの復旧サービスを提供
携帯番号: +86 13764045638 メール:service@parnassusdata.com

 

 

 

ORA-600[2662]エラの原因はORACLEがブロックを含んでいるSCN番号はシステムに既存したSCNより大きいである。一般的に、ロールフォワードプロセスプロセスで利用されたredoは不完全で、これはシステムSCNがディスクに書き込まれたSCNより小さいくなる。SCN adjust scn 《人工的にSCN番号を増やす方法:How to increase System Change Number by manual》

 

ORA-00600[2662]関連する bugは以下の通り:

Bug 4453449
Abstract: Flashback to guaranteed restore point in orphan inc may result in ORA-600[3020]
Versions affected: 10.2.0.1
Fixed in versions: 10.2.0.2 & 11.0
Backportable: Yes
Symptoms:
The symptom of this bug include ORA-600[3020], ORA-600[2662] after flashback
database and ORA-600[flashback_validation] during flashback database.
There may also be other symptoms.
Details:
ORA-600[3020] / ORA-600 [2662] / ORA-600 [flashback_validation] can occur
after/during multiple flashback/recovery through multiple database resetlogs
without opening the database. There may also be other symptoms which appear as
recovery related corruption errors.
Workaround:
1. If you flashback a crashed primary database, follow flashback database with open
resetlogs. Alternatively, if you’d like to completely undo flashback database,
follow flashback database with recover database without shutting down the
instance first.
2. Restore backup and recover.
Patch details:
Currently there is no one-off patch available for any platform and versions.
Bug 2899477 (Unpublished)
Abstract:ORA-600[2662] CAUSES INSTANCE CRASH
Versions affected: 9.2.0.4
Fixed in versions: 9.2.0.4 & 10.1
Backportable: Yes
Symptoms:
When you have a corrupted SCN and if the corruption is found in selexe,
getting uninitialized selenv from opiexe, then this may be the bug.
Details:
It is possible for an uninitialized variable to be passed
on for a select statement which could result in a false
ORA-600 [2662] error.
Workaround:
None
Patch details:
One-off patch available for few platforms on top of 9.2.0.4
Check the Metalink for Patch 2899477 availability.
Bug 2764106
Abstract: ORA-600 [2662] BRINGS THE DATABASE DOWN
Versions affected: 8.1.7.4 & 9.2.0.4
Fixed in versions: 9.2.0.5 & 10.1
Backportable: Yes
Symptoms:
OERI(2662) even The dependent scn present in the disk blocks are fine.
Details:
A false ORA-600 [2662] error can occur on SELECT operations
which can result in an instance crash even though there is no
underlying problem with the on disk SCN.
Workaround:
None
Patch details:
One-off patch available for few platforms on top of 8.1.7.4 & 9.2.0.4
Check the Metalink for Patch 2764106 availability.
Bug 2216823 (Unpublished)
Abstract:OERI(2662) REPORTED WHEN REUSING TEMPFILE WITH RESTORED DB
Versions affected: 9.2.0
Fixed in versions: 10.1.0
Backportable: No
Symptoms:
eg:
1. Create a TEMP tablespace.
2. Shutdown a database.
3. Copy control file, data files, and log files to another directory
(but not tempfile).
4. Restart a database.
5. Create a temporary table and insert into it, thereby causing tempfile
to be updated.
6. Shutdown a database.
7. Restore a database.
8. Restart a database.
9. Create a temporary table and insert into it.
10. Commit
^- ORA-600 [2662]
Details:
ORA-600 [2662] can occur when reusing a TEMPFILE with
a restored database.
Workaround:
The workaround is not to use the pre-existing tempfile.
Instead either backup the tempfile with rest of the database
or remove the tempfile then recreate a new tempfile once the
database is open.
Patch details:
Currently there is no one-off patch available for any platforms and versions

Bug 2054025 (Unpublished)

Abstract:ORA-600 [2662] RELATED TO KDIT.C
Versions affected: 9.0.1.2
Fixed in versions: 9.0.1.3 9.2.0.1
Backportable: No
Symptoms:
OERI:2662 possible on new TEMPORARY index block
Details:
ORA-600 [2662] possible on new TEMPORARY index block
Workaround:
None
Patch details:
Currently there is no one-off patch available for any platforms and versions
Bug 851959
Abstract : ORA-600 [2662] OCCURRED DURING CREATE SNAPSHOT AT MASTER SITE
Details :
It is possible to get ORA-600 [2662] caused by mis-adjustment of the Oracle7 SCN (in PARALLEL SERVER mode) when an Oracle8 instance selects from
it over a DBLINK
Version affected : 7.3.4.X
Fixed in version: 7.3.4.5
Workaround :
None
Patch details :
Currently there is no one-off patch available for any versions/platforms.
Bug 647927 (Unpublished)
Abstract : LOCK PROCESS DIES WITH ORA-600 [2662], [0], [40057943], [0], [40063994]
Version affected 8.0.4.X
Fixed in version : 8.0.4.2 8.0.5.0
Symptoms :
Digital Unix ONLY: OERI:2662 could occur under heavy load
Workaround :
None
Patch details :
Currently there is no one-off patch available for any versions/platforms.
Bug 5612217 (Unpublished)
Abstract : ORA-7445 [KDKBIN] LEADING TO ORA-600 [2662] DUE TO BUFFER CORRUPTION
Version affected : 9.2.0.X
Workaround :
None
Patch details :
One-off patch available for few platforms on top of 9.2.0.7
Check the Metalink for Patch 5612217 availability.
Bug 4599505 (Unpublished)
Abstract : ORA-600 [2662] error
Version affected : 10.2.0.X
Fixed in version : 11.0
Symptoms :
ORA-600[2662] after flashback database.
Workaround :
This problem may disappear by itself after the database has been opened for a while and its SCN has passed the SCN of the problematic block. This is
however not a guaranteed workaround
Patch details :
Currently there is no one-off patch available for any versions/platforms.
Bug 2998110
Abstract :ORA-600 [2662] LARGE QUERIES ON STANDBY WITH LOCALLY MANAGED TMP TBLSP
Version affected : 9.2.0.X 10.1.0.X
Fixed in version : 10.2
Symptoms :
The scn of the tempfiles is advanced but not on any other files
when the database is opened in read only mode.
Workaround :
1) Increase the sort_area_size to avoid sort on disk thus avoiding the use of the tempfiles
–OR–
2) After opening the database read only and BEFORE executing any queries
against the standby database, drop and recreate the tempfiles.
–OR–
3) If you are on 10.1 release you can set the following parameter:
_init_tempfile_on_open=TRUE
in your init.ora/spfile and bounce the database.
Setting this parameter will clear all tempfile bitmaps when the database is opened
so the database open may be take a little longer.
Patch details :
Currently there is no one-off patch available for any versions/platforms.
This bug is fixed in 10.2 and is not backportable to previous releases.
Note 356583.1 has been linked to this scenario.
Bug 3517013 (Unpublished)
Abstract :OPEN DB RESETLOG AFTER FLASHBACK DB FAILS ORA-600 [KCLCHKBLK_4], [1904]
Symptoms :
1) When restored the database from backup and did an incomplete recovery.
2) Opened the database with resetlogs.
3) After opening the database, you start getting following errors:
ORA-00600 [kclchkblk_4]
ORA-00600 [2662]
4) Stack trace is:- kclchkblk kcbzib kcbgcur ktfbhget ktftfcload
Cause :
1)
Error, ORA-600[KCLCHKBLK_4], is signaled because the SCN in a tempfile block
is too high. The same reason caused the ORA-600[2662]s in the alert logs.
2)
This issue is because the tempfiles may not get reinitialized during open
resetlogs.
Patch details :
Currently there is no one-off patch available for any versions/platforms.
Note 275902.1 has been linked to this scenario and solution
given under this note.

Many other bugs were filed with development for this issue.
Those bugs are not progressed due to
— Lack of response from the customers
— one-time occurances
— Vendor OS Problem

【Oracle ASM】ORA-15196: invalid ASM block header [kfc.c:9194] [check_kfbh]エラ解決

 

プロのOracle Databaseの復旧サービスを提供
携帯番号: +86 13764045638 メール:service@parnassusdata.com

このトラブルの情報は以下の通り:

  1. 2より古いRAC環境で起こる
  2. あるmountされたdiskgroupからディスクを増やすあるいはドロップする時にlogでORA-15196エラが起こった。
  3. 一般的にlogでblk=2と示す。つまり、block number=2のmetadata blockがchecksumエラになった

 

例えば:

 

WARNING: cache read a corrupted block gn=27 dsk=3 blk=2 from disk 3NOTE: a corrupted block was dumped to /oracle/product/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_551.trcERROR: cache failed to read gn=27 dsk=3 blk=2 from disk(s): 3ORA-15196: invalid ASM block header [kfc.c:9194] [check_kfbh] [2147483651] [2] [2158748224 != 4194727149]System State dumped to trace file /oracle/product/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_551.trcNOTE: cache initiating offline of disk 3 group 27

【Oracle ASM】ORA-15196 invalid ASM block header [kfc.c9194] [check_kfbh]エラ解析

  1. ここでは一般的にはBLK=2が割り当てテーブルを表し
  2. もしblk=#2ではなければ、この文で紹介した現場と相違が現れる。

 

16進数でダンプすれば第三のブロックにetoVのような文字が現れる。例えば:

 

dd if=/tmp/etoV.dd bs=4096 skip=2 count=1 | hexdump -C | grep etoV 1+0 records in1+0 records out4096 bytes (4.1 kB) copied, 1.6e-05 seconds, 256 MB/s00000600 65 74 6f 56 03 00 00 00 01 03 0b 01 00 00 00 00 |etoV…………| あるいは $ dd if=/tmp/etoV.dd bs=4096 skip=2 count=1 | od -t xz | grep etoV1+0 records in1+0 records out4096 bytes (4.1 kB) copied, 1.7e-05 seconds, 241 MB/s0003000 566f7465 00000003 010b0301 00000000 >etoV…………<

 

 

トラブルが原因はOSレベルのディスクパス配置エラかもしれない。システムを再起動した後によく現れる。その影響はCRSが起動するときにASMディスクをvote disk deviceと見なして、vote disk情報を間違えた設備に書き込む。

 

このトラブルに対して、high/normal redundancyの場合は解決しやすいが、 EXTERNAL redundancyの場合なら、人工的にpatch asm diskでリカバリする必要がある。

 

 

 

沪ICP备14014813号-2

沪公网安备 31010802001379号