解决DGMGRL Unable to connect to database ORA-12541: TNS:no listener问题

晚上配了一套11.2.0.2 的DataGuard物理备库,准备用DGMGRL做切换测试,结果发现在切换过程中会遇到”Unable to connect to database”的问题,具体日志如下:

 

DGMGRL> switchover to dgogg
Performing switchover NOW, please wait...
New primary database "dgogg" is opening...
Operation requires shutdown of instance "SBDB" on database "sbdb"
Shutting down instance "SBDB"...
ORACLE instance shut down.
Operation requires startup of instance "SBDB" on database "sbdb"
Starting instance "SBDB"...
Unable to connect to database
ORA-12541: TNS:no listener

Failed.
Warning: You are no longer connected to ORACLE.

Please complete the following steps to finish switchover:
        start up instance "SBDB" of database "sbdb"

 

ORA-12541: TNS:no listener显然是因为DGMGRL使用的连接串ConnectIdentifier存在问题,但是在创建DGMGRL的Configuration之前我已经将SERIVCENAME_DGMGLR形式的静态信息添加到listener.ora文件中了,如:

 

DGLSN =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = vrh3.oracle.com)(PORT = 1588))
  )

SID_LIST_DGLSN =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = DGOGG)
      (ORACLE_HOME = /s01/orabase/product/11.2.0/dbhome_1)
      (SID_NAME = DGOGG)
    )
    (SID_DESC =
      (GLOBAL_DBNAME = DGOGG_DGB)
      (ORACLE_HOME = /s01/orabase/product/11.2.0/dbhome_1)
      (SID_NAME = DGOGG)
    )

    (SID_DESC =
      (GLOBAL_DBNAME = DGOGG_DGMGRL)
      (ORACLE_HOME = /s01/orabase/product/11.2.0/dbhome_1)
      (SID_NAME = DGOGG)
    )
  )

 

查了半天最后终于发现时因为监听端口的问题,我创建的LISTENER DGLSN使用1588端口,而dgmgrl中的静态连接串默认使用1521端口,如:

 

DGMGRL> show database dgogg

Database - dgogg

  Role:            PRIMARY
  Intended State:  TRANSPORT-ON
  Instance(s):
    DGOGG

Database Status:
SUCCESS

DGMGRL> show database verbose dgogg

Database - dgogg

  Role:            PRIMARY
  Intended State:  TRANSPORT-ON
  Instance(s):
    DGOGG

  Properties:
    DGConnectIdentifier             = 'dgogg_dgb'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'sync'
    DelayMins                       = '0'
    Binding                         = 'optional'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'MANUAL'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '4'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = ''
    LogFileNameConvert              = ''
    FastStartFailoverTarget         = ''
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    SidName                         = 'DGOGG'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=vrh3.oracle.com)
    (PORT=1521))(CONNECT_DATA=(SERVICE_NAME=DGOGG_DGMGRL)(INSTANCE_NAME=DGOGG)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = '/s01/orabase/arch'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    TopWaitEvents                   = '(monitor)'

 

了解到问题所在后就容易解决了,只需要修改Broker中的StaticConnectIdentifier就可以了:

 

edit database dgogg set property StaticConnectIdentifier='(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = vrh3.oracle.com)(PORT = 1588)) (CONNECT_DATA =
(SERVER = DEDICATED) (SERVICE_NAME = DGOGG_DGMGRL)))';

edit database sbdb set property StaticConnectIdentifier='(DESCRIPTION =
(ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = vrh4.oracle.com)(PORT = 1588))) (CONNECT_DATA =
(SERVICE_NAME = SBDB_DGMGRL)))';

 

再次尝试切换Switchover DataGuard,不再需要手动启动standby 实例了:

 

DGMGRL> switchover to sbdb;

Performing switchover NOW, please wait...
New primary database "sbdb" is opening...
Operation requires shutdown of instance "DGOGG" on database "dgogg"
Shutting down instance "DGOGG"...
ORACLE instance shut down.
Operation requires startup of instance "DGOGG" on database "dgogg"
Starting instance "DGOGG"...
ORACLE instance started.
Database mounted.
Database opened.
Switchover succeeded, new primary is "sbdb"

DGMGRL> show configuration

Configuration - dgogg

  Protection Mode: MaxAvailability
  Databases:
    sbdb  - Primary database
    dgogg - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

GoldenGate实现Live Standby主备库切换(1)

Oracle Goldengate目前支持主被动式的双向配置,换而言之OGG可以将来自于激活的主库的数据变化完全复制到从库中,从库在不断同步数据的同时已经为计划内的和计划外的outages做好了故障切换的准备,也就是我们说的Live Standby。这里我们重点介绍一下配置Oracle Goldengate Live Standby系统的步骤,和具体的故障切换过程。

SQL> conn clinic/clinic
Connected.
SQL> drop table tv;
create table tv (t1 int primary key,t2 int,t3 varchar2(30));
Table dropped.

SQL> 

Table created.

SQL> drop sequence seqt1;

create sequence seqt1 start with 1 increment by 1;
Sequence dropped.

SQL> SQL>
Sequence created.

declare
  rnd number(9,2);
begin
   for i in 1..100000 loop
     insert into tv values(seqt1.nextval,i*dbms_random.value,'MACLEAN IS TESTING');
     commit;
   end loop;
end;
/

/* 以上脚本在primary主库的某个应用账户下创建了测试用的数据,
    接着我们可以使用各种工具将数据初始化到从库中,如果在这个过程中
    希望实时在线数据迁移的话,可以参考《Goldengate实现在线数据迁移》
*/

/* 注意我们在Live Standby的环境中往往需要复制sequence序列,以保证切换到备库时业务可以正常进行  */

/* 初始化备库数据后,确保已与主库完全一致 */
primary :
SQL> select sum(t2) from tv;

   SUM(T2)
----------
2498624495

SQL> select last_number from user_sequences;

LAST_NUMBER
-----------
     100001

standby:
SQL> select sum(t2) from tv;

   SUM(T2)
----------
2498624495

SQL> select last_number from user_sequences;

LAST_NUMBER
-----------
     100001

以上完成准备工作后,我们可以进入到正式配置Goldengate live stanby的阶段,包括以下步骤:

  1. 配置由主库到备库的extract、replicat、data pump,该步骤同普通的单向复制没有太大的区别
  2. 配置由备库到主库的extract、replicat、data pump
  3. 启动由主库到备库的extract、replicat、data pump

接下来我们会实践整个配置过程:

1.
创建由主库到备库的extract、data pump、replicat
GGSCI (rh2.oracle.com) 10> dblogin userid maclean
Password: 
Successfully logged into database.

GGSCI (rh2.oracle.com) 11> add trandata clinic.*
Logging of supplemental redo data enabled for table CLINIC.TV


GGSCI (rh2.oracle.com) 4> add extract extstd1,tranlog,begin now
EXTRACT added.

GGSCI (rh2.oracle.com) 5> add exttrail /d01/ext/cl,megabytes 100,extract extstd1
EXTTRAIL added.

GGSCI (rh2.oracle.com) 7> view params extstd1

-- Identify the Extract group:
EXTRACT extstd1
-- Specify database login information as needed for the database:
userid maclean, password maclean
-- Specify the local trail that this Extract writes to:
EXTTRAIL /d01/ext/cl
-- Specify sequences to be captured:
SEQUENCE clinic.seqt1;
-- Specify tables to be captured:
TABLE clinic.*;
-- Exclude specific tables from capture if needed:
-- TABLEEXCLUDE 

GGSCI (rh2.oracle.com) 17> add extract pumpstd1,exttrailsource /d01/ext/cl,begin now
EXTRACT added.

GGSCI (rh2.oracle.com) 98> add rmttrail /d01/rmt/cl,megabytes 100,extract pumpstd1
RMTTRAIL added.

GGSCI (rh2.oracle.com) 129> view params pumpstd1
-- Identify the data pump group:
EXTRACT pumpstd1
userid maclean, password maclean
-- Specify database login information as needed for the database:
userid maclean, password maclean
RMTHOST rh3.oracle.com, MGRPORT 7809
-- Specify the remote trail on the standby system:
RMTTRAIL /d01/rmt/cl
-- Pass data through without mapping, filtering, conversion:
PASSTHRU
sequence clinic.seqt1;
Table clinic.*;


在备库上配置由主库到备库的replicat:

GGSCI (rh3.oracle.com) 4> add replicat repstd1,exttrail /d01/rmt/cl,begin now
REPLICAT added.

GGSCI (rh3.oracle.com) 49> view params repstd1
-- Identify the Replicat group:
REPLICAT repstd1
-- State that source and target definitions are identical:
ASSUMETARGETDEFS
-- Specify database login information as needed for the database:
userid maclean, password maclean
-- Specify tables for delivery:
MAP clinic.*, TARGET clinic.*;
-- Exclude specific tables from delivery if needed:
-- MAPEXCLUDE 

2.
创建由备库到主库的extract、data pump、replicat

GGSCI (rh3.oracle.com) 51> dblogin userid maclean
Password: 
Successfully logged into database.

GGSCI (rh3.oracle.com) 52> add trandata clinic.*
Logging of supplemental redo data enabled for table CLINIC.TV.

/* 不要忘记在备库端的相关表加上追加日志 */

GGSCI (rh3.oracle.com) 53> add extract extstd2,tranlog,begin now
EXTRACT added.

GGSCI (rh3.oracle.com) 54> add exttrail /d01/ext/cl,megabytes 100,extract extstd2
EXTTRAIL added.

GGSCI (rh3.oracle.com) 58> view params extstd2
-- Identify the Extract group:
EXTRACT extstd2
-- Specify database login information as needed for the database:
userid maclean, password maclean
-- Specify the local trail that this Extract writes to:
EXTTRAIL /d01/ext/cl
-- Specify sequences to be captured:
SEQUENCE clinic.seqt1;
-- Specify tables to be captured:
TABLE clinic.*;
-- Exclude specific tables from capture if needed:
-- TABLEEXCLUDE 

GGSCI (rh3.oracle.com) 59> add extract pumpstd2,exttrailsource /d01/ext/cl,begin now
EXTRACT added.

GGSCI (rh3.oracle.com) 60> add rmttrail /d01/rmt/cl,megabytes 100,extract pumpstd2
RMTTRAIL added.

GGSCI (rh3.oracle.com) 63> view params pumpstd2

-- Identify the data pump group:
EXTRACT pumpstd2
userid maclean, password maclean
-- Specify database login information as needed for the database:
userid maclean, password maclean
RMTHOST rh2.oracle.com, MGRPORT 7809
-- Specify the remote trail on the standby system:
RMTTRAIL /d01/rmt/cl
-- Pass data through without mapping, filtering, conversion:
PASSTHRU
sequence clinic.seqt1;
Table clinic.*;

在主库上配置replicat:


GGSCI (rh2.oracle.com) 136> add replicat repstd2,exttrail /d01/rmt/cl,begin now,checkpointtable maclean.ck
REPLICAT added.

GGSCI (rh2.oracle.com) 138> view params repstd2

-- Identify the Replicat group:
REPLICAT repstd2
-- State that source and target definitions are identical:
ASSUMETARGETDEFS
-- Specify database login information as needed for the database:
userid maclean, password maclean
-- Specify tables for delivery:
MAP clinic.*, TARGET clinic.*;
-- Exclude specific tables from delivery if needed:
-- MAPEXCLUDE 

3.
完成以上OGG配置后,可以启动主库到备库的extract、pump、以及replicat:
GGSCI (rh2.oracle.com) 141> start extstd1
Sending START request to MANAGER ...
EXTRACT EXTSTD1 starting


GGSCI (rh2.oracle.com) 142> start pumpstd1
Sending START request to MANAGER ...
EXTRACT PUMPSTD1 starting



GGSCI (rh3.oracle.com) 70> start repstd1
Sending START request to MANAGER ...
REPLICAT REPSTD1 starting

/* 如果你是在offline状态下配置的话,那么此时可以启用应用了*/

接下来我们尝试做有计划的主备库切换演练:

1.
首先停止一切在主库上的应用,这一点和DataGuard Switchover一样。在保证没有活动事务的情况下,才能切换干净。
2.
在主库端使用LAG等命令了解extract的延迟,若返回如"At EOF, no more records to process"的信息,则说明所有事务均已被抽取。
GGSCI (rh2.oracle.com) 144> lag extstd1
Sending GETLAG request to EXTRACT EXTSTD1 ...
Last record lag: 0 seconds.
At EOF, no more records to process.

在EOF的前提下关闭extract:
GGSCI (rh2.oracle.com) 146> stop extstd1 
Sending STOP request to EXTRACT EXTSTD1 ...
Request processed.

3.
同样对pump使用LAG命令,若返回如"At EOF, no more records to process"的信息,则说明已抽取的数据都被发送到备库了。
GGSCI (rh2.oracle.com) 147> lag pumpstd1
Sending GETLAG request to EXTRACT PUMPSTD1 ...
Last record lag: 3 seconds.
At EOF, no more records to process.

在EOF的前提下,关闭data pump
GGSCI (rh2.oracle.com) 148> stop pumpstd1
Sending STOP request to EXTRACT PUMPSTD1 ...
Request processed.

3.
检查备库端replicat的同步情况,如返回"At EOF, no more records to process.",则说明所有记录均被复制。
GGSCI (rh3.oracle.com) 71> lag repstd1
Sending GETLAG request to REPLICAT REPSTD1 ...
Last record lag: 5 seconds.
At EOF, no more records to process.

在EOF的前提下关闭replicat

GGSCI (rh3.oracle.com) 72> stop repstd1
Sending STOP request to REPLICAT REPSTD1 ...
Request processed.

4.
紧接着我们可以在备库上为业务应用用户赋予必要的insert、update、delete权限,启用各种触发器trigger及cascade delete约束等;
以上手段在主库上对应的操作是收回应用业务的权限,disable掉各种触发器及cascade delete约束,
之所以这样做是为了保证在任何时候扮演备库角色的数据库均不应当接受任何除了OGG外的手动的或者应用驱动的业务数据变更,
以保证主备库间的数据一致。

5.
修改原备库上的extract的启动时间到现在,已保证它不去抽取那些之前的重做日志

GGSCI (rh3.oracle.com) 75> alter extstd2 ,begin now
EXTRACT altered.

GGSCI (rh3.oracle.com) 76> start extstd2

Sending START request to MANAGER ...
EXTRACT EXTSTD2 starting


若之前没有启动由备库到主库的pump和replicat的话可以在此时启动:

GGSCI (rh3.oracle.com) 78> start pumpstd2

Sending START request to MANAGER ...
EXTRACT PUMPSTD2 starting

GGSCI (rh2.oracle.com) 161> start repstd2

Sending START request to MANAGER ...
REPLICAT REPSTD2 starting

6.此时我们可以正式启动在原备库现在的主库上的应用了


接下来我们尝试回切到原主库上:
1.前提步骤与之前的切换相似,首先停止在原备库上的任何应用,
之后使用LAG命令确认extract和replicat的进度,在确认后关闭extract和replicat。
完成在主库上的维护工作:包括赋予权限,启用触发器等等。

2.修改原主库上的extract的开始时间为当前,保证它不去处理之前的重做日志:
GGSCI (rh2.oracle.com) 165> alter extract extstd1,begin now
EXTRACT altered.

3.此时我们已经可以启动在原主库现在的主库上的应用了

4.启动最早配置的由主库到备库的extract、pump、replicat:

GGSCI (rh2.oracle.com) 166> start extstd1

Sending START request to MANAGER ...
EXTRACT EXTSTD1 starting

GGSCI (rh2.oracle.com) 171> start pumpstd1

Sending START request to MANAGER ...
EXTRACT PUMPSTD1 starting

GGSCI (rh3.oracle.com) 86> start repstd1

Sending START request to MANAGER ...
REPLICAT REPSTD1 starting


以上完成了OGG的Live Standby中主备库之间的计划内的切换Switchover,That's Great!

沪ICP备14014813号-2

沪公网安备 31010802001379号