Problem Description:
ORA-16038: log 2 sequence# 13831 cannot be archived
ORA-00354: corrupt redo log block header
ORA-00312: online log 2 thread 1: '/oradata/3/TOOLS/stdby_redo/srl1.log'
LOG FILE
---------------
Filename = alert_TOOLS5_from_1021.log
See ...
...
Wed Oct 28 11:41:59 2009
Primary database is in MAXIMUM AVAILABILITY mode
Standby controlfile consistent with primary
RFS[1]: Successfully opened standby log 1: '/oradata/3/TOOLS/stdby_redo/srl0.log'
Wed Oct 28 11:42:00 2009
ARC0: Log corruption near block 604525 change 10551037679542 time ?
Wed Oct 28 11:42:00 2009
Errors in file /tools/oracle/admin/TOOLS/bdump/tools_arc0_2143.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 604525 change 10551037679542 time 10/28/2009 11:29:50
ORA-00312: online log 2 thread 1: '/oradata/3/TOOLS/stdby_redo/srl1.log'
ARC0: All Archive destinations made inactive due to error 354
Wed Oct 28 11:42:00 2009
ARC0: Closing local archive destination LOG_ARCHIVE_DEST_2: '/oradata/3/TOOLS/archive/dgarc/1_13831_635534096.arc' (error 354)
(TOOLS)
Committing creation of archivelog '/oradata/3/TOOLS/archive/dgarc/1_13831_635534096.arc' (error 354)
ARCH: Archival stopped, error occurred. Will continue retrying
Wed Oct 28 11:42:05 2009
ORACLE Instance TOOLS - Archival Error
Wed Oct 28 11:42:05 2009
ORA-16038: log 2 sequence# 13831 cannot be archived
ORA-00354: corrupt redo log block header
ORA-00312: online log 2 thread 1: '/oradata/3/TOOLS/stdby_redo/srl1.log'
Wed Oct 28 11:42:05 2009
Errors in file /tools/oracle/admin/TOOLS/bdump/tools_arc0_2143.trc:
ORA-16038: log 2 sequence# 13831 cannot be archived
ORA-00354: corrupt redo log block header
ORA-00312: online log 2 thread 1: '/oradata/3/TOOLS/stdby_redo/srl1.log'
Wed Oct 28 11:43:04 2009
ARCH: Archival stopped, error occurred. Will continue retrying
Wed Oct 28 11:43:04 2009
ORACLE Instance TOOLS - Archival Error
Wed Oct 28 11:43:04 2009
Primary database is in MAXIMUM AVAILABILITY mode
Changing standby controlfile to RESYNCHRONIZATION level
Wed Oct 28 11:43:04 2009
ORA-16014: log 1 sequence# 13832 not archived, no available destinations
ORA-00312: online log 1 thread 1: '/oradata/3/TOOLS/stdby_redo/srl0.log'
Wed Oct 28 11:43:04 2009
Errors in file /tools/oracle/admin/TOOLS/bdump/tools_arc1_2145.trc:
ORA-16014: log 1 sequence# 13832 not archived, no available destinations
ORA-00312: online log 1 thread 1: '/oradata/3/TOOLS/stdby_redo/srl0.log'
RFS[1]: Successfully opened standby log 2: '/oradata/3/TOOLS/stdby_redo/srl1.log'
Wed Oct 28 11:43:13 2009
RFS[3]: Archived Log: '/oradata/3/TOOLS/archive/dgarc/1_13831_635534096.arc'
Wed Oct 28 11:43:14 2009
RFS LogMiner: Registered logfile [/oradata/3/TOOLS/archive/dgarc/1_13831_635534096.arc] to LogMiner session id [4]
Wed Oct 28 11:43:15 2009
LOGMINER: Begin mining logfile for session 4 thread 1 sequence 13831, /oradata/3/TOOLS/archive/dgarc/1_13831_635534096.arc
Wed Oct 28 11:44:03 2009
RFS[3]: Archived Log: '/oradata/3/TOOLS/archive/dgarc/1_13832_635534096.arc'
...

LOG FILE
---------------
Filename = alert_TOOLS6_from_1021.log
See ...
...
Wed Oct 28 11:16:01 2009
Thread 1 advanced to log sequence 13830 (LGWR switch)
Current log# 8 seq# 13830 mem# 0: /oradata/1/redo/TOOLS/redo1a.log
Current log# 8 seq# 13830 mem# 1: /oradata/2/redo/TOOLS/redo1b.log
Current log# 8 seq# 13830 mem# 2: /oradata/3/redo/TOOLS/redo1c.log
Wed Oct 28 11:29:50 2009
LGWR: Standby redo logfile selected to archive thread 1 sequence 13831
LGWR: Standby redo logfile selected for thread 1 sequence 13831 for destination LOG_ARCHIVE_DEST_2
Wed Oct 28 11:29:50 2009
Thread 1 advanced to log sequence 13831 (LGWR switch)
Current log# 9 seq# 13831 mem# 0: /oradata/1/redo/TOOLS/redo2a.log
Current log# 9 seq# 13831 mem# 1: /oradata/2/redo/TOOLS/redo2b.log
Current log# 9 seq# 13831 mem# 2: /oradata/3/redo/TOOLS/redo2c.log
Wed Oct 28 11:41:59 2009
LGWR: Standby redo logfile selected to archive thread 1 sequence 13832
LGWR: Standby redo logfile selected for thread 1 sequence 13832 for destination LOG_ARCHIVE_DEST_2
Wed Oct 28 11:41:59 2009
Thread 1 advanced to log sequence 13832 (LGWR switch)
Current log# 10 seq# 13832 mem# 0: /oradata/1/redo/TOOLS/redo3a.log
Current log# 10 seq# 13832 mem# 1: /oradata/2/redo/TOOLS/redo3b.log
Current log# 10 seq# 13832 mem# 2: /oradata/3/redo/TOOLS/redo3c.log
Wed Oct 28 11:43:04 2009
Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED
LGWR: Standby redo logfile selected to archive thread 1 sequence 13833
LGWR: Standby redo logfile selected for thread 1 sequence 13833 for destination LOG_ARCHIVE_DEST_2
Wed Oct 28 11:43:04 2009
Thread 1 advanced to log sequence 13833 (LGWR switch)
Current log# 11 seq# 13833 mem# 0: /oradata/1/redo/TOOLS/redo4a.log
Current log# 11 seq# 13833 mem# 1: /oradata/2/redo/TOOLS/redo4b.log
Current log# 11 seq# 13833 mem# 2: /oradata/3/redo/TOOLS/redo4c.log
Wed Oct 28 11:45:04 2009
Destination LOG_ARCHIVE_DEST_2 is SYNCHRONIZED
LGWR: Standby redo logfile selected to archive thread 1 sequence 13834
LGWR: Standby redo logfile selected for thread 1 sequence 13834 for destination LOG_ARCHIVE_DEST_2
Wed Oct 28 11:45:05 2009
Thread 1 advanced to log sequence 13834 (LGWR switch)
Current log# 8 seq# 13834 mem# 0: /oradata/1/redo/TOOLS/redo1a.log
Current log# 8 seq# 13834 mem# 1: /oradata/2/redo/TOOLS/redo1b.log
Current log# 8 seq# 13834 mem# 2: /oradata/3/redo/TOOLS/redo1c.log
Wed Oct 28 11:46:03 2009
Thread 1 cannot allocate new log, sequence 13835
Checkpoint not complete
Current log# 8 seq# 13834 mem# 0: /oradata/1/redo/TOOLS/redo1a.log
Current log# 8 seq# 13834 mem# 1: /oradata/2/redo/TOOLS/redo1b.log
Current log# 8 seq# 13834 mem# 2: /oradata/3/redo/TOOLS/redo1c.log
Wed Oct 28 11:46:10 2009
Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED
LGWR: Standby redo logfile selected to archive thread 1 sequence 13835
LGWR: Standby redo logfile selected for thread 1 sequence 13835 for destination LOG_ARCHIVE_DEST_2
Wed Oct 28 11:46:11 2009
Thread 1 advanced to log sequence 13835 (LGWR switch)
Current log# 9 seq# 13835 mem# 0: /oradata/1/redo/TOOLS/redo2a.log
Current log# 9 seq# 13835 mem# 1: /oradata/2/redo/TOOLS/redo2b.log
Current log# 9 seq# 13835 mem# 2: /oradata/3/redo/TOOLS/redo2c.log
Wed Oct 28 11:48:03 2009
Thread 1 cannot allocate new log, sequence 13836
Checkpoint not complete
Current log# 9 seq# 13835 mem# 0: /oradata/1/redo/TOOLS/redo2a.log
Current log# 9 seq# 13835 mem# 1: /oradata/2/redo/TOOLS/redo2b.log
Current log# 9 seq# 13835 mem# 2: /oradata/3/redo/TOOLS/redo2c.log
Wed Oct 28 11:48:06 2009
...

From the standby, as at 2009-10-28, 11:42, when the archiver tried to archive the standby redo logfile, it encountered this error:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 604525 change 10551037679542 time 10/28/2009 11:29:50
ORA-00312: online log 2 thread 1: '/oradata/3/TOOLS/stdby_redo/srl1.log'
Errors in file /tools/oracle/admin/TOOLS/bdump/tools_arc0_2143.trc
The standby's RFS process then retrieved a good copy of the log from the primary, and log apply continued as usual.
The fact that the standby redo log was corrupt, and was identified as corrupt by the ARC process, suggests that some kind of I/O error caused the corruption.
Reviewing the alert.log file, it is clear that the RFS process fetched a new copy of the corrupted file and that the issue was thereby resolved.
This is more an issue for system administration to follow up on, to determine whether there are any problems in the I/O subsystem.
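To double-check this from the standby side, the standby redo log groups and the logs registered with SQL Apply can be queried. The following is only a minimal sketch. It assumes a SYSDBA session on the standby and that the standby is a logical standby (the alert log shows LogMiner registering the archived log). The sequence number 13831 is taken from the alert log above, and the row in DBA_LOGSTDBY_LOG may already have been purged if the log has since been applied.

```sql
-- Status of the standby redo log groups; group 2 (srl1.log) was the one
-- reported as corrupt in the alert log.
select group#, thread#, sequence#, bytes, used, archived, status
from v$standby_log;

-- Assumption: logical standby. Shows whether the re-fetched archived log
-- for sequence 13831 is registered with SQL Apply.
select thread#, sequence#, file_name, timestamp
from dba_logstdby_log
where sequence# = 13831;
```

If the first query keeps showing the same group as unarchived after later log switches, that points back to the storage under the standby redo logs rather than to Data Guard itself.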
Below are some scripts to collect Data Guard diagnostic information. First, the script for the primary site:
Overview
--------
This script is intended to provide an easy method to provide information
necessary to troubleshoot Data Guard issues.

Script Notes
-------------
This script is intended to be run via sqlplus as the SYS or Internal user.

Script
-------
- - - - - - - - - - - - - - - - Script begins here - - - - - - - - - - - - - - - -
-- NAME: dg_prim_diag.sql (Run on PRIMARY with a LOGICAL or PHYSICAL STANDBY)
-- ------------------------------------------------------------------------
-- Copyright 2002, Oracle Corporation
-- LAST UPDATED: 2/23/04
--
-- Usage: @dg_prim_diag
-- ------------------------------------------------------------------------
-- PURPOSE:
-- This script is to be used to assist in collecting information to help
-- troubleshoot Data Guard issues with an emphasis on Logical Standby.
-- ------------------------------------------------------------------------
-- DISCLAIMER:
-- This script is provided for educational purposes only. It is NOT
-- supported by Oracle World Wide Technical Support.
-- The script has been tested and appears to work as intended.
-- You should always run new scripts on a test instance initially.
-- ------------------------------------------------------------------------
-- Script output is as follows:

set echo off
set feedback off
column timecol new_value timestamp
column spool_extension new_value suffix
select to_char(sysdate,'Mondd_hhmi') timecol,
'.out' spool_extension from sys.dual;
column output new_value dbname
select value || '_' output
from v$parameter where name = 'db_name';
spool dg_prim_diag_&&dbname&&timestamp&&suffix
set linesize 79
set pagesize 35
set trim on
set trims on
alter session set nls_date_format = 'MON-DD-YYYY HH24:MI:SS';
set feedback on
select to_char(sysdate) time from dual;

set echo on
-- In the following the database_role should be primary as that is what
-- this script is intended to be run on. If protection_level is different
-- than protection_mode then for some reason the mode listed in
-- protection_mode experienced a need to downgrade. Once the error
-- condition has been corrected the protection_level should match the
-- protection_mode after the next log switch.

column role format a7 tru
column name format a10 wrap

select name,database_role role,log_mode,
protection_mode,protection_level
from v$database;

-- ARCHIVER can be (STOPPED | STARTED | FAILED). FAILED means that the
-- archiver failed to archive a log last time, but will try again within 5
-- minutes. LOG_SWITCH_WAIT is the ARCHIVE LOG/CLEAR LOG/CHECKPOINT event
-- that log switching is waiting for. Note that if ALTER SYSTEM SWITCH
-- LOGFILE is hung, but there is room in the current online redo log, then
-- the value is NULL.

column host_name format a20 tru
column version format a9 tru

select instance_name,host_name,version,archiver,log_switch_wait
from v$instance;

-- The following query gives us information about catpatch.
-- This way we can tell if the procedure doesn't match the image.

select version, modified, status from dba_registry
where comp_id = 'CATPROC';

-- Force logging is not mandatory but is recommended. Supplemental
-- logging must be enabled if the standby associated with this primary is
-- a logical standby. During normal operations it is acceptable for
-- SWITCHOVER_STATUS to be SESSIONS ACTIVE or TO STANDBY.

column force_logging format a13 tru
column remote_archive format a14 tru
column dataguard_broker format a16 tru

select force_logging,remote_archive,
supplemental_log_data_pk,supplemental_log_data_ui,
switchover_status,dataguard_broker
from v$database;

-- This query produces a list of all archive destinations. It shows if
-- they are enabled, what process is servicing that destination, if the
-- destination is local or remote, and if remote what the current mount ID
-- is.

column destination format a35 wrap
column process format a7
column archiver format a8
column ID format 99
column mid format 99

select dest_id "ID",destination,status,target,
schedule,process,mountid mid
from v$archive_dest order by dest_id;

-- This select will give further detail on the destinations as to what
-- options have been set. Register indicates whether or not the archived
-- redo log is registered in the remote destination control file.

set numwidth 8
column ID format 99

select dest_id "ID",archiver,transmit_mode,affirm,async_blocks async,
net_timeout net_time,delay_mins delay,reopen_secs reopen,
register,binding
from v$archive_dest order by dest_id;

-- The following select will show any errors that occurred the last time
-- an attempt to archive to the destination was made. If ERROR is
-- blank and status is VALID then the archive completed correctly.

column error format a55 wrap
select dest_id,status,error from v$archive_dest;
-- The query below will determine if any error conditions have been
-- reached by querying the v$dataguard_status view (view only available in
-- 9.2.0 and above):

column message format a80
select message, timestamp
from v$dataguard_status
where severity in ('Error','Fatal')
order by timestamp;

-- The following query will determine the current sequence number
-- and the last sequence archived. If you are remotely archiving
-- using the LGWR process then the archived sequence should be one
-- higher than the current sequence. If remotely archiving using the
-- ARCH process then the archived sequence should be equal to the
-- current sequence. The applied sequence information is updated at
-- log switch time.

select ads.dest_id,max(sequence#) "Current Sequence",
max(log_sequence) "Last Archived"
from v$archived_log al, v$archive_dest ad, v$archive_dest_status ads
where ad.dest_id=al.dest_id
and al.dest_id=ads.dest_id
group by ads.dest_id;

-- The following select will attempt to gather as much information as
-- possible from the standby. SRLs are not supported with Logical Standby
-- until Version 10.1.

set numwidth 8
column ID format 99
column "SRLs" format 99
column Active format 99

select dest_id id,database_mode db_mode,recovery_mode,
protection_mode,standby_logfile_count "SRLs",
standby_logfile_active ACTIVE,
archived_seq#
from v$archive_dest_status;

-- Query v$managed_standby to see the status of processes involved in
-- the shipping redo on this system. Does not include processes needed to
-- apply redo.

select process,status,client_process,sequence#
from v$managed_standby;

-- The following query is run on the primary to see if SRL's have been
-- created in preparation for switchover.

select group#,sequence#,bytes from v$standby_log;

-- The above SRL's should match in number and in size with the ORL's
-- returned below:

select group#,thread#,sequence#,bytes,archived,status from v$log;

-- Non-default init parameters.
set numwidth 5
column name format a30 tru
column value format a48 wra
select name, value
from v$parameter
where isdefault = 'FALSE';

spool off
- - - - - - - - - - - - - - - - Script ends here - - - - - - - - - - - - - - - -
Another one, for the physical standby site:
Overview
--------
This script is intended to provide an easy method to provide information
necessary to troubleshoot Data Guard issues.

Script Notes
-------------
This script is intended to be run via sqlplus as the SYS or Internal user.

Script
-------
- - - - - - - - - - - - - - - - Script begins here - - - - - - - - - - - - - - - -
-- NAME: DG_phy_stby_diag.sql
-- ------------------------------------------------------------------------
-- AUTHOR:
-- Michael Smith - Oracle Support Services - DataServer Group
-- Copyright 2002, Oracle Corporation
-- ------------------------------------------------------------------------
-- PURPOSE:
-- This script is to be used to assist in collecting information to help
-- troubleshoot Data Guard issues.
-- ------------------------------------------------------------------------
-- DISCLAIMER:
-- This script is provided for educational purposes only. It is NOT
-- supported by Oracle World Wide Technical Support.
-- The script has been tested and appears to work as intended.
-- You should always run new scripts on a test instance initially.
-- ------------------------------------------------------------------------
-- Script output is as follows:

set echo off
set feedback off
column timecol new_value timestamp
column spool_extension new_value suffix
select to_char(sysdate,'Mondd_hhmi') timecol,
'.out' spool_extension from sys.dual;
column output new_value dbname
select value || '_' output
from v$parameter where name = 'db_name';
spool dgdiag_phystby_&&dbname&&timestamp&&suffix
set lines 200
set pagesize 35
set trim on
set trims on
alter session set nls_date_format = 'MON-DD-YYYY HH24:MI:SS';
set feedback on
select to_char(sysdate) time from dual;

set echo on
--
-- ARCHIVER can be (STOPPED | STARTED | FAILED). FAILED means that the archiver failed
-- to archive a log last time, but will try again within 5 minutes. LOG_SWITCH_WAIT
-- is the ARCHIVE LOG/CLEAR LOG/CHECKPOINT event that log switching is waiting for. Note
-- that if ALTER SYSTEM SWITCH LOGFILE is hung, but there is room in the current online
-- redo log, then the value is NULL.

column host_name format a20 tru
column version format a9 tru
select instance_name,host_name,version,archiver,log_switch_wait from v$instance;

-- The following select will give us the generic information about how this standby is
-- set up. The database_role should be standby as that is what this script is intended
-- to be run on. If protection_level is different than protection_mode then for some
-- reason the mode listed in protection_mode experienced a need to downgrade. Once the
-- error condition has been corrected the protection_level should match the protection_mode
-- after the next log switch.

column ROLE format a7 tru
select name,database_role,log_mode,controlfile_type,protection_mode,protection_level
from v$database;

-- Force logging is not mandatory but is recommended. Supplemental logging should be enabled
-- on the standby if a logical standby is in the configuration. During normal
-- operations it is acceptable for SWITCHOVER_STATUS to be SESSIONS ACTIVE or NOT ALLOWED.

column force_logging format a13 tru
column remote_archive format a14 tru
column dataguard_broker format a16 tru
select force_logging,remote_archive,supplemental_log_data_pk,supplemental_log_data_ui,
switchover_status,dataguard_broker from v$database;

-- This query produces a list of all archive destinations and shows if they are enabled,
-- what process is servicing that destination, if the destination is local or remote,
-- and if remote what the current mount ID is. For a physical standby we should have at
-- least one remote destination that points to the primary set, but it should be deferred.

COLUMN destination FORMAT A35 WRAP
column process format a7
column archiver format a8
column ID format 99

select dest_id "ID",destination,status,target,
archiver,schedule,process,mountid
from v$archive_dest;

-- If the protection mode of the standby is set to anything higher than max performance
-- then we need to make sure the remote destination that points to the primary is set
-- with the correct options else we will have issues during switchover.

select dest_id,process,transmit_mode,async_blocks,
net_timeout,delay_mins,reopen_secs,register,binding
from v$archive_dest;

-- The following select will show any errors that occurred the last time an attempt to
-- archive to the destination was made. If ERROR is blank and status is VALID then
-- the archive completed correctly.

column error format a55 tru
select dest_id,status,error from v$archive_dest;

-- Determine if any error conditions have been reached by querying the v$dataguard_status
-- view (view only available in 9.2.0 and above):

column message format a80
select message, timestamp
from v$dataguard_status
where severity in ('Error','Fatal')
order by timestamp;

-- The following query is run to get the status of the SRL's on the standby. If the
-- primary is archiving with the LGWR process and SRL's are present (in the correct
-- number and size) then we should see a group# active.

select group#,sequence#,bytes,used,archived,status from v$standby_log;
-- The above SRL's should match in number and in size with the ORL's returned below:

select group#,thread#,sequence#,bytes,archived,status from v$log;

-- Query v$managed_standby to see the status of processes involved in the
-- configuration.

select process,status,client_process,sequence#,block#,active_agents,known_agents
from v$managed_standby;

-- Verify the last sequence# received and the last sequence# applied to the standby
-- database.

select al.thrd "Thread", almax "Last Seq Received", lhmax "Last Seq Applied"
from (select thread# thrd, max(sequence#) almax
from v$archived_log
where resetlogs_change#=(select resetlogs_change# from v$database)
group by thread#) al,
(select thread# thrd, max(sequence#) lhmax
from v$log_history
where first_time=(select max(first_time) from v$log_history)
group by thread#) lh
where al.thrd = lh.thrd;

-- The V$ARCHIVE_GAP fixed view on a physical standby database only returns the next
-- gap that is currently blocking redo apply from continuing. After resolving the
-- identified gap and starting redo apply, query the V$ARCHIVE_GAP fixed view again
-- on the physical standby database to determine the next gap sequence, if there is
-- one.

select * from v$archive_gap;

-- Non-default init parameters.
set numwidth 5
column name format a30 tru
column value format a50 wra
select name, value
from v$parameter
where isdefault = 'FALSE';

spool off
- - - - - - - - - - - - - - - - Script ends here - - - - - - - - - - - - - - - -
Sir,
I am new to Data Guard. We have a logical standby database in Maximum Availability mode with synchronous log transfer. I see the message below in the alert logs quite frequently.
“Changing standby controlfile to RESYNCHRONIZATION level”
Can you kindly let me know, or point me to a document that explains, under what conditions and situations I get the above message? I have seen that switchover fails while the standby DB is in RESYNCHRONIZATION mode.
Any help will be highly appreciated. You can reach me at gaurav.e15@gmail.com
Thanks in advance,
Regards,
Gaurav
For MAXIMUM AVAILABILITY mode, the documentation says:
“Like maximum protection mode, transactions do not commit until all redo data needed to recover those transactions has been written to the online redo log and to at least one synchronized standby database. Unlike maximum protection mode, the primary database will not shut down if a fault prevents it from writing its redo stream to a synchronized standby database. Instead, the primary database will operate in RESYNCHRONIZATION until the fault is corrected and all log gaps have been resolved. When all log gaps have been resolved, the primary database automatically resumes operating in maximum availability mode.”
See Metalink note TRANSPORT: Data Guard Protection Modes [ID 239100.1], but I am not sure.
Can you run the diag script from http://www.oracledatabase12g.com/archives/script-to-collect-data-guard-diagnostic-information.html
and upload the output, or create a service request?
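Before running the full script, the downgrade described in the quoted passage can be observed directly on the primary. This is only a rough sketch: it assumes a SYSDBA session on the primary, and the 'Warning' severity filter is an addition here (the diag script filters only on 'Error' and 'Fatal').

```sql
-- While the standby is out of sync, PROTECTION_MODE stays at
-- MAXIMUM AVAILABILITY and PROTECTION_LEVEL reports RESYNCHRONIZATION.
select protection_mode, protection_level
from v$database;

-- Data Guard messages around the time of the downgrade, to see which
-- fault on the standby destination triggered it.
select timestamp, severity, message
from v$dataguard_status
where severity in ('Error','Fatal','Warning')
order by timestamp;
```

If protection_level is back to MAXIMUM AVAILABILITY by the time you check, the timestamps in the second query can still be matched against the "Changing standby controlfile to RESYNCHRONIZATION level" lines in the alert log.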