很多朋友一直对DataGuard的fast-start failover持怀疑的态度;通过observer的观察,(Fast-Start Failover)FSFO提供了一种在primary数据库不可用情况下自动故障切换到standby数据库的能力。造成很多朋友不愿意使用FSFO的原因之一是故障切换后不得不重建原primary数据库,不过如果我们能配合使用10g中的闪回数据库特性的话,这一重建数据库的工作便也可以一劳永逸了。
/* 启用fast_start failover的前提之一是在primary和standby库上都启用flashback database */ SQL> select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production PL/SQL Release 11.2.0.2.0 - Production CORE 11.2.0.2.0 Production TNS for Linux: Version 11.2.0.2.0 - Production NLSRTL Version 11.2.0.2.0 - Production /* 当前的primary库情况 */ SQL> select open_mode ,database_role,flashback_on from v$database; OPEN_MODE DATABASE_ROLE FLASHBACK_ON -------------------- ---------------- ------------------ READ WRITE PRIMARY YES SQL> alter session set nls_date_format='YYYY-MM-DD hh24:mi:ss'; Session altered. SQL> select OLDEST_FLASHBACK_TIME from v$flashback_database_log; OLDEST_FLASHBACK_TI ------------------- 2011-02-19 22:40:39 /* 当前的standby库情况 */ SQL> select open_mode ,database_role,flashback_on from v$database; OPEN_MODE DATABASE_ROLE FLASHBACK_ON -------------------- ---------------- ------------------ READ ONLY WITH APPLY PHYSICAL STANDBY YES SQL> alter session set nls_date_format='YYYY-MM-DD hh24:mi:ss'; Session altered. SQL> select OLDEST_FLASHBACK_TIME from v$flashback_database_log; OLDEST_FLASHBACK_TI ------------------- 2011-02-19 22:34:56 DGMGRL> edit configuration set property faststartfailoverthreshold=10; Property "faststartfailoverthreshold" updated DGMGRL> show configuration verbose; Configuration - pro Protection Mode: MaxAvailability Databases: sbdb - Primary database prod - Physical standby database Fast-Start Failover: DISABLED Configuration Status: SUCCESS /* 另开一个终端启动observer */ DGMGRL> start observer Observer started DGMGRL> show fast_start failover Fast-Start Failover: DISABLED Threshold: 10 seconds Target: (none) Observer: rh3.oracle.com Lag Limit: 30 seconds Shutdown Primary: TRUE Auto-reinstate: TRUE Configurable Failover Conditions Health Conditions: Corrupted Controlfile YES Corrupted Dictionary YES Inaccessible Logfile NO Stuck Archiver NO Datafile Offline YES Oracle Error Conditions: (none) /* 启动快速故障切换 */ DGMGRL> enable fast_start failover Enabled. /* 记录实际切换前的时间 */ [maclean@rh3 ~]$ date Sat Feb 19 23:35:44 CST 2011 [maclean@rh2 ~]$ ps -ef|grep pmon|grep -v grep maclean 25165 1 0 22:58 ? 00:00:00 ora_pmon_SBDB maclean 26080 1 0 23:04 ? 00:00:00 ora_pmon_PROD [maclean@rh2 ~]$ kill -9 25165 /* 通过kill pmon进程造成主库crash,引发fast-start failover */ /* observer观察到primary库意外终止后,实施了快速故障切换 */ 23:37:27.92 Saturday, February 19, 2011 Initiating Fast-Start Failover to database "prod"... Performing failover NOW, please wait... Failover succeeded, new primary is "prod" 23:37:33.69 Saturday, February 19, 2011 23:39:17.98 Saturday, February 19, 2011 Initiating reinstatement for database "sbdb"... Reinstating database "sbdb", please wait... Error: ORA-16653: failed to reinstate database Failed. Reinstatement of database "sbdb" failed 23:39:26.30 Saturday, February 19, 2011 23:40:05.00 Saturday, February 19, 2011 Initiating reinstatement for database "sbdb"... Reinstating database "sbdb", please wait... Error: ORA-16653: failed to reinstate database Failed. Reinstatement of database "sbdb" failed 23:40:09.24 Saturday, February 19, 2011 /* 接下来我们通过闪回数据库来将失败切换后的primary库flashback到faliover之前以便重用*/ SQL> startup mount; SQL> flashback database to timestamp to_timestamp('2011-02-19 23:30:44','YYYY-MM-DD hh24:mi:ss'); flashback database to timestamp to_timestamp('2011-02-19 23:35:44','YYYY-MM-DD hh24:mi:ss') * ERROR at line 1: ORA-38754: FLASHBACK DATABASE not started; required redo log is not available ORA-38762: redo logs needed for SCN 1143987 to SCN End-of-Redo ORA-38761: redo log sequence 10 in thread 1, incarnation 3 could not be accessed /* 缺少最近的归档日志而无法完成闪回,需要到目前的primary库中查找 */ SQL> select name from v$archived_log where FIRST_CHANGE#<1143987 and NEXT_CHANGE#>1143987; NAME -------------------------------------------------------------------------------- /s01/fast_recovery_area/PROD/archivelog/2011_02_19/o1_mf_1_10_6ozpzh9c_.arc SQL> alter database register physical logfile '/s01/fast_recovery_area/PROD/archivelog/2011_02_19/o1_mf_1_10_6ozpzh9c_.arc'; Database altered. SQL> flashback database to timestamp to_timestamp('2011-02-19 23:30:44','YYYY-MM-DD hh24:mi:ss'); Flashback complete. /* 成功闪回后,observer将主动去尝试reinstate这个目前可用的standby库,如以下日志*/ 23:46:20.16 Saturday, February 19, 2011 Initiating reinstatement for database "sbdb"... Reinstating database "sbdb", please wait... Operation requires shutdown of instance "SBDB" on database "sbdb" Shutting down instance "SBDB"... ORA-01109: database not open Database dismounted. ORACLE instance shut down. Operation requires startup of instance "SBDB" on database "sbdb" Starting instance "SBDB"... ORACLE instance started. Database mounted. Continuing to reinstate database "sbdb" ... Reinstatement of database "sbdb" succeeded 23:47:13.49 Saturday, February 19, 2011 DGMGRL> show configuration verbose Configuration - pro Protection Mode: MaxAvailability Databases: prod - Primary database sbdb - (*) Physical standby database (*) Fast-Start Failover target Fast-Start Failover: ENABLED Threshold: 10 seconds Target: sbdb Observer: rh3.oracle.com Lag Limit: 30 seconds (not in use) Shutdown Primary: TRUE Auto-reinstate: TRUE Configuration Status: SUCCESS /*可以通过show configuration命令监控到当前primary与standby库都处于可用状态, 且fast_start failover也为启用状态 */