ORACLE dataguard学习笔记

今天在一台机器上模拟了dataguard,主备两个实例从物理上不共享任何归档文件路径。

主要有以下内容:

dataguard Physical standby的创建

protection mode的切换

switch over

 

模拟了两台机器oel1,oel2  主库的归档放在oel1里面,备库的放在oel2里面

创建的路径如下
./oel1:
orcl_pri_arch  orcl_stdby_arch

./oel2:
standby_pri_arch  standby_stdby_arch

--强制logging
SQL> alter database force logging;
alter database force logging
*
ERROR at line 1:
ORA-12920: database is already in force logging mode

SQL> select force_logging from v$database;

FOR
---
YES

--查看归档状态
SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     9
Next log sequence to archive   11
Current log sequence           11

--创建密码文件

[oracle@oel1 dbs]$ orapwd file=orapwstandby password=oracle entries=10
[oracle@oel1 dbs]$
listener.ora
LISTENER1 =
  (ADDRESS_LIST =
    (ADDRESS = (PROTOCOL = tcp)(HOST = oel1.oracle.com)(PORT = 1521))
  )

SID_LIST_LISTENER1 =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = orcl.oracle.com)
      (SID_NAME = orcl)
      (ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1)
    )
   
     (SID_DESC =
      (GLOBAL_DBNAME = standby.oracle.com)
      (ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1)
      (SID_NAME = standby)
    )
  )
tnsnames.ora
ORCL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = tcp)(HOST = oel1.oracle.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = orcl.oracle.com)
    )
  )
standby =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = tcp)(HOST = oel1.oracle.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = stdby.oracle.com)
    )
  )
--创建相应的备库目录
[oracle@oel1 standby]$ mkdir adump bdump cdump udump
[oracle@oel1 standby]$ ll
total 16
drwxr-xr-x 2 oracle dba 4096 Oct 21 15:27 adump
drwxr-xr-x 2 oracle dba 4096 Oct 21 15:27 bdump
drwxr-xr-x 2 oracle dba 4096 Oct 21 15:27 cdump
drwxr-xr-x 2 oracle dba 4096 Oct 21 15:27 udump

 

参数文件
主库参数文件(ORCL)
db_unique_name='orcl'
log_archive_config='dg_config=(orcl,standby)'
log_archive_dest_1='location=/u01/app/oracle/dg/oel1/orcl_pri_arch valid_for=(ONLINE_LOGFILE,PRIMARY_ROLE) db_unique_name=orcl'
log_archive_dest_2='service=standby valid_for=(ONLINE_LOGFILE,PRIMARY_ROLE) db_unique_name=standby'
log_archive_dest_3='location=/u01/app/oracle/dg/oel1/orcl_stdby_arch valid_for=(STANDBY_LOGFILE,STANDBY_ROLE) db_unique_name=orcl'
db_file_name_convert='/u01/app/oracle/oradata/orcl','/u01/app/oracle/oradata/standby'
log_file_name_convert='/u01/app/oracle/oradata/orcl','/u01/app/oracle/oradata/standby'
standby_file_management=auto

备库参数文件(standby)
db_unique_name='standby'
log_archive_config='dg_config=(orcl,standby)'
log_archive_dest_1='location=/u01/app/oracle/dg/oel2/standby_pri_arch valid_for=(ONLINE_LOGFILE,PRIMARY_ROLE) db_unique_name=standby'
log_archive_dest_2='service=orcl valid_for=(ONLINE_LOGFILE,PRIMARY_ROLE) db_unique_name=orcl'
log_archive_dest_3='location=/u01/app/oracle/dg/oel2/standby_stdby_arch valid_for=(STANDBY_LOGFILE,STANDBY_ROLE) db_unique_name=standby'
db_file_name_convert='/u01/app/oracle/oradata/orcl','/u01/app/oracle/oradata/standby'
log_file_name_convert='/u01/app/oracle/oradata/orcl','/u01/app/oracle/oradata/standby'
standby_file_management=auto

 

 


主库运行
SQL> alter database create standby controlfile as '/u01/app/oracle/dg/standby.ctl';

Database altered.

SQL>

--错误的操作来生成控制文件
[oracle@oel1 standby]$ cp /u01/app/oracle/dg/standby.ctl control01.ctl
[oracle@oel1 standby]$ ll
total 21588
-rw-r----- 1 oracle dba 7356416 Oct 21 16:22 control01.ctl
-rw-r----- 1 oracle dba 7356416 Oct 21 16:23 control02.ctl
-rw-r----- 1 oracle dba 7356416 Oct 21 16:22 control03.ctl

--正确的方法
[oracle@oel1 standby]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.1.0 - Production on Sun Oct 21 16:26:01 2012

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount;
ORACLE instance started.

Total System Global Area  524288000 bytes
Fixed Size                  1220336 bytes
Variable Size             142606608 bytes
Database Buffers          373293056 bytes
Redo Buffers                7168000 bytes
SQL>

[oracle@oel1 standby]$ rman target /

Recovery Manager: Release 10.2.0.1.0 - Production on Sun Oct 21 16:27:44 2012

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

connected to target database: orcl (not mounted)

RMAN> restore controlfile from '/u01/app/oracle/dg/standby.ctl';

Starting restore at 21-OCT-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=39 devtype=DISK

channel ORA_DISK_1: copied control file copy
output filename=/u01/app/oracle/oradata/standby/control01.ctl
output filename=/u01/app/oracle/oradata/standby/control02.ctl
output filename=/u01/app/oracle/oradata/standby/control03.ctl
Finished restore at 21-OCT-12

RMAN>
[oracle@oel1 standby]$ ll
total 21588
-rw-r----- 1 oracle dba 7356416 Oct 21 16:28 control01.ctl
-rw-r----- 1 oracle dba 7356416 Oct 21 16:28 control02.ctl
-rw-r----- 1 oracle dba 7356416 Oct 21 16:28 control03.ctl
[oracle@oel1 standby]$

将备库启动到Mount状态

--到此为止要保证网络是互通的,可以检查密码文件,listener等的配置是否合适
[oracle@oel1 admin]$ sqlplus sys/oracle@orcl as sysdba

[oracle@oel1 admin]$ sqlplus sys/oracle@standby as sysdba


-standby库 使用下面命令来apply
SQL> alter database recover managed standby database disconnect from session;

Database altered.

--当然也可以取消apply
SQL> alter database recover managed standby database cancel;

Database altered.

--在取消apply的时候,添加standby logfile,强烈建议在apply之前能够创建几组standby logfile
SQL> alter database add standby logfile;

Database altered.


SQL> select member from v$logfile;


/u01/app/oracle/STANDBY/onlinelog/o1_mf_8_88807p3x_.log

 

--在主库切换日志
SQL> alter system switch logfile;

System altered.

SQL>

如果没有看到归档到指定的目录,查看primary database的log_archive_dest参数,很可能是修改了Pfile 之后没有同步到spfile.

 

 

Error 16047 attaching to destination LOG_ARCHIVE_DEST_2 standby host 'standby'
ORA-16047: DGID mismatch between destination setting and standby
*** 2012-10-21 17:09:46.533 60679 kcrr.c
PING[ARC0]: Heartbeat failed to connect to standby 'standby'. Error is 16047.
*** 2012-10-21 17:09:46.533 58941 kcrr.c
kcrrfail: dest:2 err:16047 force:0 blast:1

查找了半天,原来是在参数文件中
db_unique_name='standb'
log_archive_config='dg_config=(orcl,standby)'


在主库中查看到如下的错误

Errors in file /u01/app/oracle/admin/orcl/bdump/orcl_arc0_16479.trc:
ORA-16047: DGID mismatch between destination setting and standby
Sun Oct 21 17:14:46 2012
PING[ARC0]: Heartbeat failed to connect to standby 'standby'. Error is 16047.
Sun Oct 21 17:18:50 2012
Thread 1 advanced to log sequence 15
  Current log# 2 seq# 15 mem# 0: /u01/app/oracle/oradata/orcl/redo02.log
Sun Oct 21 17:18:50 2012
ARC1: Archivelog destination LOG_ARCHIVE_DEST_2 disabled: Data Guard configuration identifier mismatch
可以查看

SQL> select dest_id,error from v$archive_dest;
SQL> select dest_id,error from v$archive_dest;

   DEST_ID ERROR
---------- -----------------------------------------------------------------
         1
         2 ORA-16047: DGID mismatch between destination setting and standby
         3 ORA-16053: DB_UNIQUE_NAME  is not in the Data Guard Configuration
         4
         5
         6
         7
         8
         9
。。
就能查看到错误的原因了。
 ORA-16053: DB_UNIQUE_NAME  is not in the Data Guard Configuration
原因是在主库的参数配置中,db_unique_name
*.log_archive_dest_3='location=/u01/app/oracle/dg/oel1/orcl_stdby_arch valid_for=(STANDBY_LOGFILE,STANDBY_ROLE) db_unique_name=standby'


dataguard 有一个很好的优点就是如果之前的配置错误,导致文件不能正确apply,会向primary database 索取apply点之前的日志文件
如下:
[oracle@oel1 orcl_pri_arch]$ ll
total 2164
-rw-r----- 1 oracle dba  248832 Oct 21 16:46 1_13_796831768.dbf
-rw-r----- 1 oracle dba 1257984 Oct 21 17:18 1_14_796831768.dbf
-rw-r----- 1 oracle dba  690176 Oct 21 17:31 1_15_796831768.dbf
[oracle@oel1 orcl_pri_arch]$

--在主库的归档目录下只有三个文件

--但是在正确配置了参数之后,重启两个实例后,手动切换日志,
--主库日志如下:
Sun Oct 21 17:31:44 2012
Thread 1 advanced to log sequence 16
  Current log# 3 seq# 16 mem# 0: /u01/app/oracle/oradata/orcl/redo03.log
Sun Oct 21 17:32:10 2012
ARC0: Standby redo logfile selected for thread 1 sequence 15 for destination LOG_ARCHIVE_DEST_2

--在备库,中会发现gap.然后索取之前的日志
Completed: alter database recover managed standby database disconnect from session
Sun Oct 21 17:31:36 2012
Using STANDBY_ARCHIVE_DEST parameter default value as /u01/app/oracle/dg/oel2/standby_stdby_arch
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[1]: Assigned to RFS process 19230
RFS[1]: Identified database type as 'physical standby'
Sun Oct 21 17:31:36 2012
RFS LogMiner: Client disabled from further notification
RFS[1]: Archived Log: '/u01/app/oracle/dg/oel2/standby_stdby_arch/1_1_796831768.dbf'
RFS[1]: Archived Log: '/u01/app/oracle/dg/oel2/standby_stdby_arch/1_2_796831768.dbf'
Sun Oct 21 17:31:37 2012
Fetching gap sequence in thread 1, gap sequence 11-13
FAL[client]: Error fetching gap sequence, no FAL server specified
Sun Oct 21 17:31:37 2012
RFS[1]: Archived Log: '/u01/app/oracle/dg/oel2/standby_stdby_arch/1_3_796831768.dbf'
Sun Oct 21 17:31:47 2012
RFS[1]: Archived Log: '/u01/app/oracle/dg/oel2/standby_stdby_arch/1_4_796831768.dbf'
Sun Oct 21 17:31:58 2012
RFS[1]: Archived Log: '/u01/app/oracle/dg/oel2/standby_stdby_arch/1_5_796831768.dbf'
RFS[1]: Archived Log: '/u01/app/oracle/dg/oel2/standby_stdby_arch/1_6_796831768.dbf'
RFS[1]: Archived Log: '/u01/app/oracle/dg/oel2/standby_stdby_arch/1_7_796831768.dbf'
RFS[1]: Archived Log: '/u01/app/oracle/dg/oel2/standby_stdby_arch/1_8_796831768.dbf'
RFS[1]: Archived Log: '/u01/app/oracle/dg/oel2/standby_stdby_arch/1_9_796831768.dbf'
Sun Oct 21 17:32:07 2012
Fetching gap sequence in thread 1, gap sequence 11-14
FAL[client]: Error fetching gap sequence, no FAL server specified
Sun Oct 21 17:32:07 2012
RFS[1]: Archived Log: '/u01/app/oracle/dg/oel2/standby_stdby_arch/1_10_796831768.dbf'
RFS[1]: Archived Log: '/u01/app/oracle/dg/oel2/standby_stdby_arch/1_11_796831768.dbf'
RFS[1]: Archived Log: '/u01/app/oracle/dg/oel2/standby_stdby_arch/1_12_796831768.dbf'
RFS[1]: Archived Log: '/u01/app/oracle/dg/oel2/standby_stdby_arch/1_13_796831768.dbf'
RFS[1]: Archived Log: '/u01/app/oracle/dg/oel2/standby_stdby_arch/1_14_796831768.dbf'
RFS[1]: Successfully opened standby log 4: '/u01/app/oracle/flash_recovery_area/ORCL/onlinelog/o1_mf_4_85x8vjt5_.log'
Sun Oct 21 17:32:37 2012
Media Recovery Log /u01/app/oracle/dg/oel2/standby_stdby_arch/1_11_796831768.dbf
Media Recovery Log /u01/app/oracle/dg/oel2/standby_stdby_arch/1_12_796831768.dbf
Media Recovery Log /u01/app/oracle/dg/oel2/standby_stdby_arch/1_13_796831768.dbf
Media Recovery Log /u01/app/oracle/dg/oel2/standby_stdby_arch/1_14_796831768.dbf
Media Recovery Log /u01/app/oracle/dg/oel2/standby_stdby_arch/1_15_796831768.dbf
Media Recovery Waiting for thread 1 sequence 16

--在备库的归档路径下
[oracle@oel1 standby_stdby_arch]$ ll
total 242948
-rw-r----- 1 oracle dba      4096 Oct 21 17:32 1_10_796831768.dbf
-rw-r----- 1 oracle dba   1609728 Oct 21 17:32 1_11_796831768.dbf
-rw-r----- 1 oracle dba      2048 Oct 21 17:32 1_12_796831768.dbf
-rw-r----- 1 oracle dba    248832 Oct 21 17:32 1_13_796831768.dbf
-rw-r----- 1 oracle dba   1257984 Oct 21 17:32 1_14_796831768.dbf
-rw-r----- 1 oracle dba    690176 Oct 21 17:32 1_15_796831768.dbf
-rw-r----- 1 oracle dba     60416 Oct 21 17:31 1_1_796831768.dbf
-rw-r----- 1 oracle dba      8704 Oct 21 17:31 1_2_796831768.dbf
-rw-r----- 1 oracle dba      5632 Oct 21 17:31 1_3_796831768.dbf
-rw-r----- 1 oracle dba 100657664 Oct 21 17:31 1_4_796831768.dbf
-rw-r----- 1 oracle dba 100661248 Oct 21 17:31 1_5_796831768.dbf
-rw-r----- 1 oracle dba  16238592 Oct 21 17:32 1_6_796831768.dbf
-rw-r----- 1 oracle dba   1735680 Oct 21 17:32 1_7_796831768.dbf
-rw-r----- 1 oracle dba    450560 Oct 21 17:32 1_8_796831768.dbf

再次手动切换日志,一切正常
--从备库中查看
Sun Oct 21 17:38:38 2012
RFS[1]: Successfully opened standby log 4: '/u01/app/oracle/flash_recovery_area/ORCL/onlinelog/o1_mf_4_85x8vjt5_.log'
Sun Oct 21 17:38:41 2012
Media Recovery Log /u01/app/oracle/dg/oel2/standby_stdby_arch/1_16_796831768.dbf
Media Recovery Waiting for thread 1 sequence 17

 

--change protection mode

orcl>select database_role,protection_mode,protection_level from v$database;

DATABASE_ROLE    PROTECTION_MODE      PROTECTION_LEVEL
---------------- -------------------- --------------------
PRIMARY          MAXIMUM PERFORMANCE  MAXIMUM PERFORMANCE

 

SQL> alter database set standby to maximize protection;
alter database set standby to maximize protection
*
ERROR at line 1:
ORA-01126: database must be mounted in this instance and not open in any
instance


orcl>shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
orcl>startup mount;
ORACLE instance started.

Total System Global Area  524288000 bytes
Fixed Size                  1220336 bytes
Variable Size             213909776 bytes
Database Buffers          301989888 bytes
Redo Buffers                7168000 bytes
Database mounted.
orcl>alter database set standby database to maximize protection;

Database altered.

orcl>alter database open;
alter database open
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel

--主库就突然down了
--查看备库,模式也没有任何改变
SQL> select database_role,protection_mode,protection_level from v$database;

DATABASE_ROLE    PROTECTION_MODE      PROTECTION_LEVEL
---------------- -------------------- --------------------
PHYSICAL STANDBY MAXIMUM PERFORMANCE  MAXIMUM PERFORMANCE


--日志 from primary database orcl

LGWR: Primary database is in MAXIMUM PROTECTION mode
LGWR: Destination LOG_ARCHIVE_DEST_2 is not serviced by LGWR
LGWR: Destination LOG_ARCHIVE_DEST_1 is not serviced by LGWR
LGWR: Minimum of 1 LGWR standby database required
Sun Oct 21 17:53:11 2012
Errors in file /u01/app/oracle/admin/orcl/bdump/orcl_lgwr_21162.trc:
ORA-16072: a minimum of one standby database destination is required
Sun Oct 21 17:53:11 2012
Errors in file /u01/app/oracle/admin/orcl/bdump/orcl_lgwr_21162.trc:
ORA-16072: a minimum of one standby database destination is required
LGWR: terminating instance due to error 16072
Instance terminated by LGWR, pid = 21162


--先修复主库的问题
SQL> alter database set standby database to maximize performance;

Database altered.

SQL> alter database open;

Database altered.

SQL> alter system set log_archive_dest_2='service=standby LGWR SYNC AFFIRM VALID_FOR=(ONLINE_LOGFILE,PRIMARY_ROLE) db_unique_name=standby';

System altered.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.

Total System Global Area  524288000 bytes
Fixed Size                  1220336 bytes
Variable Size             213909776 bytes
Database Buffers          301989888 bytes
Redo Buffers                7168000 bytes
Database mounted.
SQL> show parameter log_archive_dest_2

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_2                   string      service=standby LGWR SYNC AFFI
                                                 RM VALID_FOR=(ONLINE_LOGFILE,P
                                                 RIMARY_ROLE) db_unique_name=st
                                                 andby

SQL> alter database set standby database to maximize protection;

Database altered.

SQL> alter database open;

Database altered.

SQL>

--在open时,有的人遇到了一些问题。
可能是网络不好,需要设置timeout参数,改为
*.log_archive_dest_2='SERVICE=standby LGWR SYNC AFFIRM NET_TIMEOUT=120 VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=standby'
就可以了。

--再次查看备库的状态
SQL> select database_role,protection_mode,protection_level from v$database;--最开始的

DATABASE_ROLE    PROTECTION_MODE      PROTECTION_LEVEL
---------------- -------------------- --------------------
PHYSICAL STANDBY MAXIMUM PERFORMANCE  MAXIMUM PERFORMANCE

SQL> /

DATABASE_ROLE    PROTECTION_MODE      PROTECTION_LEVEL       --修改后的
---------------- -------------------- --------------------
PHYSICAL STANDBY MAXIMUM PROTECTION   MAXIMUM PROTECTION

SQL>

--在主库继续切换模式
SQL> alter database set standby database to maximize performance;

Database altered.
--备库也会相应的变化
SQL> /

DATABASE_ROLE    PROTECTION_MODE      PROTECTION_LEVEL
---------------- -------------------- --------------------
PHYSICAL STANDBY MAXIMUM PERFORMANCE  MAXIMUM PERFORMANCE


SQL> alter database set standby database to maximize availability;

Database altered.
--在备库查看模式
SQL> /

DATABASE_ROLE    PROTECTION_MODE      PROTECTION_LEVEL
---------------- -------------------- --------------------
PHYSICAL STANDBY MAXIMUM AVAILABILITY RESYNCHRONIZATION

 

--在主库查看归档情况

SQL>  select dest_id,status,error from v$archive_dest ;

   DEST_ID STATUS
---------- ---------
ERROR
-----------------------------------------------------------------
         1 VALID


         2 ERROR
ORA-16086: standby database does not contain available standby
log files
         3 BAD PARAM
ORA-16053: DB_UNIQUE_NAME  is not in the Data Guard Configuration


在备库的日志中查看,原来是超时导致的。
ORA-16086: standby database does not contain available standby log files
*** 2012-10-21 22:13:23.151 60679 kcrr.c
LGWR: Error 16086 creating archivelog file 'standby'
*** 2012-10-21 22:13:23.151 58941 kcrr.c
kcrrfail: dest:2 err:16086 force:0 blast:1
Receiving message from LNSb
*** 2012-10-21 22:13:23.176 55444 kcrr.c
Making upidhs request to LNSb (ocis 0x0xb70a58b8). Begin time is and NET_TIMEOUT seconds
NetServer pid:27564
*** 2012-10-21 22:13:27.195 55616 kcrr.c
  upidhs done status 0

在主库尝试加入timeout
SQL> alter system set log_archive_dest_2='service=standby LGWR SYNC AFFIRM timeout=200 valid_for=(online_logfile,primary_role) db_unique_name=standby'
  2  ;
alter system set log_archive_dest_2='service=standby LGWR SYNC AFFIRM timeout=200 valid_for=(online_logfile,primary_role) db_unique_name=standby'
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-16025: parameter LOG_ARCHIVE_DEST_2 contains repeated or conflicting
attributes
--是设置timeout时参数名错误导致的

SQL> alter system set log_archive_dest_2='service=standby LGWR SYNC AFFIRM NET_TIMEOUT=200 valid_for=(online_logfile,primary_role) db_unique_name=standby'
  2  ;

System altered.

 

--switch over and failover

-primary database orcl
SQL> select database_role,switchover_status from v$database;

DATABASE_ROLE    SWITCHOVER_STATUS
---------------- --------------------
PRIMARY          SESSIONS ACTIVE


--standby database standby
SQL> select database_role,switchover_status from v$database;

DATABASE_ROLE    SWITCHOVER_STATUS
---------------- --------------------
PHYSICAL STANDBY NOT ALLOWED


ORA-16416: Switchover target is not synchronized with the primary

-未完待续

 

上一篇:技术人的灵魂3问,阿里工程师如何解答? | 8月28号栖夜读


下一篇:降本、赋能、链接:阿里巴巴全域数据建设 ——云栖大会阿里大数据分论坛精彩演讲1