【OGG】 RAC环境下管理OGG的高可用 (五)
一.1 BLOG文档结构图
一.2 前言部分
一.2.1 导读
各位技术爱好者,看完本文后,你可以掌握如下的技能,也可以学到一些其它你所不知道的知识,~O(∩_∩)O~:
① RAC环境下管理OGG的高可用
注意:本篇BLOG中代码部分需要特别关注的地方我都用黄色背景和红色字体来表示,比如下边的例子中,thread 1的最大归档日志号为33,thread 2的最大归档日志号为43是需要特别关注的地方。
List of Archived Logs in backup set 11
Thrd Seq Low SCN Low Time Next SCN Next Time
---- ------- ---------- ------------------- ---------- ---------
1 32 1621589 2015-05-29 11:09:52 1625242 2015-05-29 11:15:48
1 33 1625242 2015-05-29 11:15:48 1625293 2015-05-29 11:15:58
2 42 1613951 2015-05-29 10:41:18 1625245 2015-05-29 11:15:49
2 43 1625245 2015-05-29 11:15:49 1625253 2015-05-29 11:15:53
本文如有错误或不完善的地方请大家多多指正,ITPUB留言或QQ皆可,您的批评指正是我写作的最大动力。
一.2.2 实验环境介绍
项目 |
source db |
target db |
db 类型 |
rac |
单实例 |
db version |
11.2.0.1 |
11.2.0.1 |
db 存储 |
ASM |
FS type |
ORACLE_SID |
jmrac1/jmrac2 |
orcl |
db_name |
jmrac |
orcl |
主机IP地址: |
192.168.1.31/192.168.1.32 |
192.168.1.128 |
OS版本及kernel版本 |
RHEL5.7 64位,2.6.18-274.el5 |
RHEL6.5 64位,2.6.32-504.16.2.el6.x86_64 |
OGG版本 |
11.2.1.0.1 64位 |
11.2.1.0.1 64位 |
OS hostname |
node1/node2 |
orcltest |
一.2.3 相关参考文章链接
【OGG】OGG的下载和安装篇:http://blog.itpub.net/26736162/viewspace-1693241/
【OGG】OGG的单向DML复制配置(一):http://blog.itpub.net/26736162/viewspace-1696020/
【OGG】OGG的单向复制配置-支持DDL(二):http://blog.itpub.net/26736162/viewspace-1696031/
【OGG】OGG简单配置双向复制(三):http://blog.itpub.net/26736162/viewspace-1699516/
【OGG】RAC环境下配置OGG单向同步 (四):http://blog.itpub.net/26736162/viewspace-1699522/
一.2.4 本文简介
本文基于grid来管理OGG软件,主要参考网址为:http://ylw6006.blog.51cto.com/all/470441/16 ,非常感谢斩月大师。
在本文中将介绍在oracle 11g r2 rac环境下使用grid infrastructure来管理ogg服务,在开始之前,请先按照前文的步骤配置好rac(source)同单实例(target)数据库之间的ogg单向同步,同时确保rac数据库各项服务运行正常,同时启动target端的数据库实例和ogg的mgr,replicat进程,source端ogg的mgr,extract,extract dump进程可以关闭!
一.3 实验部分
一.3.1 查看rac各服务是否online,这里3个gsd服务offline属于正常情况!
[root@node2 ~]# crsstat
Name Type Target State Host
------------------------------ -------------------------- ---------- --------- -------
ora.ARCH.dg ora.diskgroup.type ONLINE ONLINE node1
ora.DATA.dg ora.diskgroup.type ONLINE ONLINE node1
ora.LISTENER.lsnr ora.listener.type ONLINE ONLINE node1
ora.LISTENER_SCAN1.lsnr ora.scan_listener.type ONLINE ONLINE node2
ora.OVDISK.dg ora.diskgroup.type ONLINE ONLINE node1
ora.TEST.dg ora.diskgroup.type ONLINE ONLINE node1
ora.asm ora.asm.type ONLINE ONLINE node1
ora.db.db ora.database.type ONLINE OFFLINE
ora.eons ora.eons.type ONLINE ONLINE node1
ora.gsd ora.gsd.type OFFLINE OFFLINE
ora.jmrac.db ora.database.type ONLINE ONLINE node1
ora.jmrac.haha.svc ora.service.type ONLINE ONLINE node1
ora.net1.network ora.network.type ONLINE ONLINE node1
ora.node1.ASM1.asm application ONLINE ONLINE node1
ora.node1.LISTENER_NODE1.lsnr application ONLINE ONLINE node1
ora.node1.gsd application OFFLINE OFFLINE
ora.node1.ons application ONLINE ONLINE node1
ora.node1.vip ora.cluster_vip_net1.type ONLINE ONLINE node1
ora.node2.ASM2.asm application ONLINE ONLINE node2
ora.node2.LISTENER_NODE2.lsnr application ONLINE ONLINE node2
ora.node2.gsd application OFFLINE OFFLINE
ora.node2.ons application ONLINE ONLINE node2
ora.node2.vip ora.cluster_vip_net1.type ONLINE ONLINE node2
ora.oc4j ora.oc4j.type OFFLINE OFFLINE
ora.ons ora.ons.type ONLINE ONLINE node1
ora.ora11g.db ora.database.type OFFLINE OFFLINE
ora.orastrac.db ora.database.type OFFLINE OFFLINE
ora.registry.acfs ora.registry.acfs.type ONLINE ONLINE node1
ora.scan1.vip ora.scan_vip.type ONLINE ONLINE node2
[root@node2 ~]#
[root@node2 ~]# crsstat | grep OFFLINE
ora.db.db ora.database.type ONLINE OFFLINE
ora.gsd ora.gsd.type OFFLINE OFFLINE
ora.node1.gsd application OFFLINE OFFLINE
ora.node2.gsd application OFFLINE OFFLINE
ora.oc4j ora.oc4j.type OFFLINE OFFLINE
ora.ora11g.db ora.database.type OFFLINE OFFLINE
ora.orastrac.db ora.database.type OFFLINE OFFLINE
[root@node2 ~]#
一.3.2 在source端添加VIP,并赋权限,检查vip是否能正常启动,在本例中运行grid infrastructure的操作系统用户为grid,运行ogg的操作系统用户为oracle
[root@node2 ~]# crsctl stat res -p |grep -ie .network -ie subnet |grep -ie name -ie subnet
NAME=ora.net1.network
USR_ORA_SUBNET=192.168.1.0
[root@node2 ~]# appvipcfg create -network=1 \
> -ip=192.168.1.150 \
> -vipname=oggvip \
> -user=root
Production Copyright 2007, 2008, Oracle.All rights reserved
2015-06-11 17:09:18: Creating Resource Type
2015-06-11 17:09:18: Executing cmd: /u01/grid/bin/crsctl add type app.appvip.type -basetype cluster_resource -file /u01/grid/crs/template/appvip.type
2015-06-11 17:09:18: Create the Resource
2015-06-11 17:09:18: Executing cmd: /u01/grid/bin/crsctl add resource oggvip -type app.appvip.type -attr USR_ORA_VIP=192.168.1.150,START_DEPENDENCIES=hard(ora.net1.network) pullup(ora.net1.network),STOP_DEPENDENCIES=hard(ora.net1.network),ACL='owner:root:rwx,pgrp:root:r-x,other::r--,user:root:r-x'
[root@node2 ~]#
[root@node2 ~]# crsctl setperm resource oggvip -u user:oracle:r-x
[root@node2 ~]# crsctl status resource oggvip
NAME=oggvip
TYPE=app.appvip.type
TARGET=OFFLINE
STATE=OFFLINE
[root@node2 ~]# crsctl start resource oggvip
CRS-2672: Attempting to start 'oggvip' on 'node1'
CRS-2676: Start of 'oggvip' on 'node1' succeeded
[root@node2 ~]# crsctl status resource oggvip
NAME=oggvip
TYPE=app.appvip.type
TARGET=ONLINE
STATE=ONLINE on node1
[root@node2 ~]#
[root@node2 ~]# crsstat | grep oggvip
oggvip app.appvip.type ONLINE ONLINE node1
[root@node2 ~]#
一.3.3 配置action脚本,该脚本必须包含start,stop,check,clean,abort几个函数,才能用于后续的grid infrastructure调用,这里把脚本直接放在acfs文件系统上,以便节点间共享
[oracle@node1 gg11]$ chmod +x $OGG_HOME/11gr2_ogg_action.scr
[oracle@node1 gg11]$ ll $OGG_HOME/11gr2_ogg_action.scr
-rwxr-xr-x 1 oracle oinstall 2695 Jun 11 17:13 /u01/app/acfsmounts/acfsvol1-232/gg11/11gr2_ogg_action.scr
[oracle@node1 gg11]$ cat $OGG_HOME/11gr2_ogg_action.scr
#!/bin/sh
#set the Oracle Goldengate installation directory
export OGG_HOME=/u01/app/acfsmounts/acfsvol1-232/gg11
#set the oracle home to the database to ensure GoldenGate will get the
#right environment settings to be able to connect to the database
export ORACLE_HOME=/u01/app/oracle/product/11.2.0/db1
#specify delay after start before checking for successful start
start_delay_secs=5
#Include the GoldenGate home in the library path to start GGSCI
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:${OGG_HOME}:${LD_LIBRARY_PATH}
#check_process validates that a manager process is running at the PID
#that GoldenGate specifies.
check_process () {
if ( [ -f "${OGG_HOME}/dirpcs/MGR.pcm" ] )
then
pid=`cut -f8 "${OGG_HOME}/dirpcs/MGR.pcm"`
if [ ${pid} = `ps -e |grep ${pid} |grep mgr |cut -d " " -f2` ]
then
#manager process is running on the PID exit success
exit 0
else
if [ ${pid} = `ps -e |grep ${pid} |grep mgr |cut -d " " -f1` ]
then
#manager process is running on the PID exit success
exit 0
else
#manager process is not running on the PID
exit 1
fi
fi
else
#manager is not running because there is no PID file
exit 1
fi
}
#call_ggsci is a generic routine that executes a ggsci command
call_ggsci () {
ggsci_command=$1
ggsci_output=`${OGG_HOME}/ggsci<<EOF
${ggsci_command}
exit
EOF`
}
case $1 in
'start')
#start manager
call_ggsci 'start manager'
#there is a small delay between issuing the start manager command
#and the process being spawned on the OS. wait before checking
sleep ${start_delay_secs}
#check whether manager is running and exit accordingly
check_process
;;
'stop')
#attempt a clean stop for all non-manager processes
#call_ggsci 'stop er *'
#ensure everything is stopped
call_ggsci 'stop er *!'
#call_ggsci 'kill er *'
#stop manager without (y/n) confirmation
call_ggsci 'stop manager!'
#exit success
exit 0
;;
'check')
check_process
;;
'clean')
#attempt a clean stop for all non-manager processes
#call_ggsci 'stop er *'
#ensure everything is stopped
#call_ggsci 'stop er *!'
#in case there are lingering processes
call_ggsci 'kill er *'
#stop manager without (y/n) confirmation
call_ggsci 'stop manager!'
#exit success
exit 0
;;
'abort')
#ensure everything is stopped
call_ggsci 'stop er *!'
#in case there are lingering processes
call_ggsci 'kill er *'
#stop manager without (y/n) confirmation
call_ggsci 'stop manager!'
#exit success
exit 0
;;
esac
[oracle@node1 gg11]$
一.3.4 使用oracle用户添加oggapp,并授权给oracle用户管理
[root@node2 ~]# crsctl add resource oggapp -type cluster_resource \
> -attr "ACTION_SCRIPT=/u01/app/acfsmounts/acfsvol1-232/gg11/11gr2_ogg_action.scr, \
> CHECK_INTERVAL=30, START_DEPENDENCIES='hard(oggvip,ora.asm) \
> pullup(oggvip)', STOP_DEPENDENCIES='hard(oggvip)'"
[root@node2 ~]#
[root@node2 ~]# crsctl status resource oggapp
NAME=oggapp
TYPE=cluster_resource
TARGET=OFFLINE
STATE=OFFLINE
[root@node2 ~]# crsctl setperm resource oggapp -o oracle
[root@node2 ~]#
一.3.5 检查target端的数据库是否可以正常连接,ogg进程是否运行正常
[oracle@orcltest ~]$ sqlplus test/test@orcl
SQL*Plus: Release 11.2.0.1.0 Production on Thu Jun 11 17:19:23 2015
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@orcltest ~]$ cd $OGG_HOME
[oracle@orcltest gg11]$ ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 11.2.1.0.1 OGGCORE_11.2.1.0.1_PLATFORMS_120423.0230_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Apr 23 2012 08:32:14
Copyright (C) 1995, 2012, Oracle and/or its affiliates. All rights reserved.
GGSCI (orcltest) 1> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT RUNNING TESTRPT 00:00:00 00:00:10
GGSCI (orcltest) 2>
一.3.6 检查source端的数据库是否可以正常连接,ogg进程处于未启动状态
[oracle@node1 gg11]$ sqlplus test/test@rac
SQL*Plus: Release 11.2.0.1.0 Production on Thu Jun 11 17:22:08 2015
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> select * from tab;
TNAME TABTYPE CLUSTERID
------------------------------ ------- ----------
RAC_TEST TABLE
T1 TABLE
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
[oracle@node1 gg11]$ cd $OGG_HOME
[oracle@node1 gg11]$ ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 11.2.1.0.1 OGGCORE_11.2.1.0.1_PLATFORMS_120423.0230_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Apr 23 2012 08:32:14
Copyright (C) 1995, 2012, Oracle and/or its affiliates. All rights reserved.
GGSCI (node1) 1> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING TESTEXT 00:00:00 00:00:01
EXTRACT RUNNING TESTPUMP 00:00:00 00:00:03
GGSCI (node1) 2> stop mgr
Manager process is required by other GGS processes.
Are you sure you want to stop it (y/n)? y
Sending STOP request to MANAGER ...
Request processed.
Manager stopped.
GGSCI (node1) 3> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER STOPPED
EXTRACT RUNNING TESTEXT 00:00:00 00:00:00
EXTRACT RUNNING TESTPUMP 00:00:00 00:00:09
GGSCI (node1) 4> stop *
Sending STOP request to EXTRACT TESTEXT ...
Request processed.
Sending STOP request to EXTRACT TESTPUMP ...
Request processed.
GGSCI (node1) 5> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER STOPPED
EXTRACT STOPPED TESTEXT 00:00:00 00:00:01
EXTRACT STOPPED TESTPUMP 00:00:00 00:00:01
GGSCI (node1) 6>
一.3.7 启动oggapp resource,检查source(rac1)上的ogg进程是否成功启动
[root@node2 ~]# crsctl status resource oggapp
NAME=oggapp
TYPE=cluster_resource
TARGET=OFFLINE
STATE=OFFLINE
[root@node2 ~]# crsctl start resource oggapp
CRS-2672: Attempting to start 'oggapp' on 'node1'
CRS-2676: Start of 'oggapp' on 'node1' succeeded
[root@node2 ~]#
[oracle@node1 gg11]$ ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 11.2.1.0.1 OGGCORE_11.2.1.0.1_PLATFORMS_120423.0230_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Apr 23 2012 08:32:14
Copyright (C) 1995, 2012, Oracle and/or its affiliates. All rights reserved.
GGSCI (node1) 1> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING TESTEXT 00:00:00 00:00:00
EXTRACT RUNNING TESTPUMP 00:00:00 00:00:09
GGSCI (node1) 2>
[root@node2 ~]# crs_stat -t -v oggapp
Name Type R/RA F/FT Target State Host
----------------------------------------------------------------------
oggapp clus...esource 0/1 0/0 ONLINE ONLINE node1
[root@node2 ~]#
一.3.8 测试failover
[root@node2 ~]# crsctl relocate resource oggapp -f
CRS-2673: Attempting to stop 'oggapp' on 'node1'
CRS-2677: Stop of 'oggapp' on 'node1' succeeded
CRS-2673: Attempting to stop 'oggvip' on 'node1'
CRS-2677: Stop of 'oggvip' on 'node1' succeeded
CRS-2672: Attempting to start 'oggvip' on 'node2'
CRS-2676: Start of 'oggvip' on 'node2' succeeded
CRS-2672: Attempting to start 'oggapp' on 'node2'
CRS-2676: Start of 'oggapp' on 'node2' succeeded
[root@node2 ~]# crs_stat -t -v oggapp
Name Type R/RA F/FT Target State Host
----------------------------------------------------------------------
oggapp clus...esource 0/1 0/0 ONLINE ONLINE node2
[root@node2 ~]# su - oracle
[oracle@node2 ~]$ cd $OGG_HOME
[oracle@node2 gg11]$ ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 11.2.1.0.1 OGGCORE_11.2.1.0.1_PLATFORMS_120423.0230_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Apr 23 2012 08:32:14
Copyright (C) 1995, 2012, Oracle and/or its affiliates. All rights reserved.
GGSCI (node2) 1> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING TESTEXT 00:00:00 00:00:07
EXTRACT RUNNING TESTPUMP 00:00:00 00:00:08
GGSCI (node2) 2>
可以看到节点到了第二个节点上了。