Oracle控制文件在缺失归档日志的情况下的恢复
众所周知,恢复控制文件时“recover database”命令可能需要使用归档日志。所谓缺失归档日志,是指控制文件从备份还原之后,在执行“recover database”命令恢复时报告找不到相应的日志导致恢复终止的情况。
这种情况下的恢复操作主要步骤如下:
① 首先还原控制文件,方式不限
② 执行“recover database”命令将报RMAN-06054错误,即找不到某归档日志
③ 查看相关的动态性能视图,对问题定位,确认问题与控制文件,而不是数据文件相关(与数据文件相关必须进行不完全恢复)
④ 利用create controlfile 命令重建控制文件
⑤ 再次执行“recover database”命令,还会报RMAN-06054错误,这次是找不到另一个归档日志,其序列号应该大于第二步中的
⑥ 查看v$log视图确定第5步中所要的是哪个日志
⑦ 执行SQLPLUS的”recover database using backup controlfile“命令,等”Specify log:“提示符出现后给出正确的在线日志路径,直到命令成功结束。
⑧ 以resetlogs方式打开数据库
⑨ 由于创建的控制文件内不会有临时数据文件的信息,需要重新将其添加回临时表空间
⑩ 将控制文件内其他丢失的信息用catalog和configure等命令再添加回去。
当前current日志序列号为:5,此时进行控制文件备份
SQL> archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch
Oldest online log sequence 3
Next log sequence to archive 5
Current log sequence 5
SQL>
RMAN> backup current controlfile;
Starting backup at 2015-02-04 16:28:13
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
channel ORA_DISK_1: starting piece 1 at 2015-02-04 16:28:14
channel ORA_DISK_1: finished piece 1 at 2015-02-04 16:28:15
piece handle=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/1vpuel4t_1_1 tag=TAG20150204T162813 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 2015-02-04 16:28:15
RMAN>
多次切换日志后,现在的CURRENT日志是20号,所有控制文件丢失并且第15号归档日志丢失,数据库启动后停留在了nomount状态:
SQL> alter system switch logfile;
。。。。。。。。
System altered.
SQL> archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch
Oldest online log sequence 18
Next log sequence to archive 20
Current log sequence 20
SQL>
RMAN> delete archivelog sequence 15;
released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=257 device type=DISK
List of Archived Log Copies for database with db_unique_name LILOVE
=====================================================================
Key Thrd Seq S Low Time
------- ---- ------- - -------------------
44 1 15 X 2015-02-04 16:29:58
Name: /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_15_870711361.dbf
Do you really want to delete the above objects (enter YES or NO)? yes
deleted archived log
archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_15_870711361.dbf RECID=44 STAMP=870798598
Deleted 1 objects
RMAN>
[root@rhel6_lhr ~]# ll /u01/app/oracle/oradata/utf8test/*
-rw-r----- 1 oracle asmadmin 10076160 Feb 4 16:40 /u01/app/oracle/oradata/utf8test/control01.ctl
-rw-r----- 1 oracle oinstall 10076160 Feb 4 16:40 /u01/app/oracle/oradata/utf8test/control02.ctl
-rw-r----- 1 oracle asmadmin 52429312 Feb 4 16:30 /u01/app/oracle/oradata/utf8test/redo01.log
-rw-r----- 1 oracle asmadmin 52429312 Feb 4 16:40 /u01/app/oracle/oradata/utf8test/redo02.log
-rw-r----- 1 oracle asmadmin 52429312 Feb 4 16:30 /u01/app/oracle/oradata/utf8test/redo03.log
-rw-r----- 1 oracle asmadmin 608182272 Feb 4 16:39 /u01/app/oracle/oradata/utf8test/sysaux01.dbf
-rw-r----- 1 oracle asmadmin 775954432 Feb 4 16:39 /u01/app/oracle/oradata/utf8test/system01.dbf
-rw-r----- 1 oracle asmadmin 10493952 Feb 3 16:15 /u01/app/oracle/oradata/utf8test/tbs_read01.dbf
-rw-r----- 1 oracle asmadmin 20979712 Feb 4 11:15 /u01/app/oracle/oradata/utf8test/temp01.dbf
-rw-r----- 1 oracle asmadmin 52436992 Feb 4 16:39 /u01/app/oracle/oradata/utf8test/undotbs01.dbf
-rw-r----- 1 oracle asmadmin 10493952 Feb 4 16:30 /u01/app/oracle/oradata/utf8test/users01.dbf
[root@rhel6_lhr ~]# rm -rf /u01/app/oracle/oradata/utf8test/control0*
[root@rhel6_lhr ~]# ll /u01/app/oracle/oradata/utf8test/*
-rw-r----- 1 oracle asmadmin 52429312 Feb 4 16:30 /u01/app/oracle/oradata/utf8test/redo01.log
-rw-r----- 1 oracle asmadmin 52429312 Feb 4 16:40 /u01/app/oracle/oradata/utf8test/redo02.log
-rw-r----- 1 oracle asmadmin 52429312 Feb 4 16:30 /u01/app/oracle/oradata/utf8test/redo03.log
-rw-r----- 1 oracle asmadmin 608182272 Feb 4 16:39 /u01/app/oracle/oradata/utf8test/sysaux01.dbf
-rw-r----- 1 oracle asmadmin 775954432 Feb 4 16:39 /u01/app/oracle/oradata/utf8test/system01.dbf
-rw-r----- 1 oracle asmadmin 10493952 Feb 3 16:15 /u01/app/oracle/oradata/utf8test/tbs_read01.dbf
-rw-r----- 1 oracle asmadmin 20979712 Feb 4 11:15 /u01/app/oracle/oradata/utf8test/temp01.dbf
-rw-r----- 1 oracle asmadmin 52436992 Feb 4 16:39 /u01/app/oracle/oradata/utf8test/undotbs01.dbf
-rw-r----- 1 oracle asmadmin 10493952 Feb 4 16:30 /u01/app/oracle/oradata/utf8test/users01.dbf
[root@rhel6_lhr ~]#
SQL> startup force;
ORACLE instance started.
Total System Global Area 501059584 bytes
Fixed Size 2229744 bytes
Variable Size 356518416 bytes
Database Buffers 134217728 bytes
Redo Buffers 8093696 bytes
ORA-00205: error in identifying control file, check alert log for more info
SQL>
告警文件报错:
ALTER DATABASE MOUNT
ORA-00210: cannot open the specified control file
ORA-00202: control file: '/u01/app/oracle/oradata/utf8test/control02.ctl'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
ORA-00210: cannot open the specified control file
ORA-00202: control file: '/u01/app/oracle/oradata/utf8test/control01.ctl'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
ORA-205 signalled during: ALTER DATABASE MOUNT...
下面,我们开始恢复:
RMAN> restore controlfile from '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/1vpuel4t_1_1';
Starting restore at 2015-02-04 16:44:10
using channel ORA_DISK_1
channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
output file name=/u01/app/oracle/oradata/utf8test/control01.ctl
output file name=/u01/app/oracle/oradata/utf8test/control02.ctl
Finished restore at 2015-02-04 16:44:11
RMAN>
查看控制文件的确已经恢复:
[root@rhel6_lhr ~]# ll /u01/app/oracle/oradata/utf8test/con*
-rw-r----- 1 oracle asmadmin 10076160 Feb 4 16:44 /u01/app/oracle/oradata/utf8test/control01.ctl
-rw-r----- 1 oracle asmadmin 10076160 Feb 4 16:44 /u01/app/oracle/oradata/utf8test/control02.ctl
[root@rhel6_lhr ~]#
下面我们挂载数据库:
RMAN> mount database;
database mounted
released channel: ORA_DISK_1
RMAN>
下边恢复数据库将报错,表示找不到15号归档文件:
RMAN> recover database;
Starting recover at 2015-02-04 16:47:55
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=10 device type=DISK
datafile 5 not processed because file is read-only
starting media recovery
archived log for thread 1 with sequence 18 is already on disk as file /u01/app/oracle/oradata/utf8test/redo03.log
archived log for thread 1 with sequence 19 is already on disk as file /u01/app/oracle/oradata/utf8test/redo01.log
archived log for thread 1 with sequence 20 is already on disk as file /u01/app/oracle/oradata/utf8test/redo02.log
archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_5_870711361.dbf thread=1 sequence=5
archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_6_870711361.dbf thread=1 sequence=6
archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_7_870711361.dbf thread=1 sequence=7
archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_8_870711361.dbf thread=1 sequence=8
archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_9_870711361.dbf thread=1 sequence=9
archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_10_870711361.dbf thread=1 sequence=10
archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_11_870711361.dbf thread=1 sequence=11
archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_12_870711361.dbf thread=1 sequence=12
archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_13_870711361.dbf thread=1 sequence=13
archived log file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_14_870711361.dbf thread=1 sequence=14
unable to find archived log
archived log thread=1 sequence=15
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 02/04/2015 16:47:58
RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 15 and starting SCN of 1927288
RMAN>
若此时打开数据库,将报很多的错误:
RMAN> alter database open;
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of alter db command at 02/04/2015 16:50:38
ORA-01589: must use RESETLOGS or NORESETLOGS option for database open
RMAN> alter database open resetlogs;
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of alter db command at 02/04/2015 16:50:49
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/app/oracle/oradata/utf8test/system01.dbf'
分析原因,首先查看目前已知的归档文件最大的日志序列号是多少?
SQL> select max(sequence#) from v$archived_log;
MAX(SEQUENCE#)
--------------
20
SQL> archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch
Oldest online log sequence 3
Next log sequence to archive 5
Current log sequence 5
SQL> select GROUP#,SEQUENCE#,MEMBERS,STATUS,ARCHIVED from v$log;
GROUP# SEQUENCE# MEMBERS STATUS ARC
---------- ---------- ---------- ---------------- ---
1 4 1 INACTIVE YES
3 3 1 INACTIVE YES
2 5 1 CURRENT NO
SQL>
答案为20,如果归档已经是20了,那么current日志一定是大于20的,而我的数据库的在线日志组数量为3个,也就是说在线日志的最小序列号大于17,进而得知所有数据文件的完全检查点必然超过了17号日志的最后一条重做记录。那么结论就是数据文件最多只需要17号之后的日志就能将恢复完成。
那么控制文件是从几号开始恢复的呢?由v$log可知是从5号开始恢复的,恢复到15号日志的时候报错了,所以我们只需要让控制文件放弃17号就可以顺利过关了。这个方法就是使用”create controlfile“创建一个新的控制文件。这个新的控制文件不知道current日志的序列号,不会强制所要任何日志对其恢复。
首先生成创建命令并重启至nomount状态:
SQL> alter database backup controlfile to trace as '/home/oracle/ctl.txt';
Database altered.
SQL> startup force nomount;
ORACLE instance started.
Total System Global Area 501059584 bytes
Fixed Size 2229744 bytes
Variable Size 356518416 bytes
Database Buffers 134217728 bytes
Redo Buffers 8093696 bytes
SQL>
我们在trace文件中得到并执行noresetlogs版本的”create controlfile“命令:
CREATE CONTROLFILE REUSE DATABASE "lilove" NORESETLOGS ARCHIVELOG
MAXLOGFILES 16
MAXLOGMEMBERS 3
MAXDATAFILES 100
MAXINSTANCES 8
MAXLOGHISTORY 292
LOGFILE
GROUP 1 '/u01/app/oracle/oradata/utf8test/redo01.log' SIZE 50M BLOCKSIZE 512,
GROUP 2 '/u01/app/oracle/oradata/utf8test/redo02.log' SIZE 50M BLOCKSIZE 512,
GROUP 3 '/u01/app/oracle/oradata/utf8test/redo03.log' SIZE 50M BLOCKSIZE 512
-- STANDBY LOGFILE
DATAFILE
'/u01/app/oracle/oradata/utf8test/system01.dbf',
'/u01/app/oracle/oradata/utf8test/sysaux01.dbf',
'/u01/app/oracle/oradata/utf8test/undotbs01.dbf',
'/u01/app/oracle/oradata/utf8test/users01.dbf'
CHARACTER SET AL32UTF8
;
将以上命令在sqlplus中执行,等”Control file created.“出现,数据库已经自动mount了。然后再执行recover database命令就将至少从17号日志开始,越过了15号这个阻碍:
RMAN> recover database;
Starting recover at 2015-02-04 17:21:17
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=10 device type=DISK
starting media recovery
unable to find archived log
archived log thread=1 sequence=20
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 02/04/2015 17:21:18
RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 20 and starting SCN of 1927308
RMAN>
从结果得到,15号不用了,但是报20号找不到,而20号归档是存在的,是在线日志,导致此问题的原因是新创建的控制文件有一个缺陷:使用这种控制文件恢复时RMAN通道只会一直地找归档日志,而无视在线日志。所以,恢复到尾声阶段的时候一定会报RMAN-06054错误,此时再查下v$log:
SQL> select GROUP#,SEQUENCE#,MEMBERS,STATUS,ARCHIVED from v$log;
GROUP# SEQUENCE# MEMBERS STATUS ARC
---------- ---------- ---------- ---------------- ---
1 19 1 INACTIVE NO
3 18 1 INACTIVE NO
2 20 1 CURRENT NO
SQL>
原来20号是在线日志,接下来使用sqlplus的”recover database using backup controlfile“命令,可以手动指定恢复过程中所使用的日志,然后resetlogs打开数据库:
SQL> recover database using backup controlfile;
ORA-00279: change 1927308 generated at 02/04/2015 16:30:05 needed for thread 1
ORA-00289: suggestion : /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_20_870711361.dbf
ORA-00280: change 1927308 for thread 1 is in sequence #20
Specify log: {=suggested | filename | AUTO | CANCEL}
/u01/app/oracle/oradata/utf8test/redo02.log
Log applied.
Media recovery complete.
SQL> alter database open resetlogs;
Database altered.
最后根据得到的控制文件trace中的内容执行如下语句:
VARIABLE RECNO NUMBER;
EXECUTE :RECNO := SYS.DBMS_BACKUP_RESTORE.SETCONFIG('BACKUP OPTIMIZATION','ON');
VARIABLE RECNO NUMBER;
EXECUTE :RECNO := SYS.DBMS_BACKUP_RESTORE.SETCONFIG('RETENTION POLICY','TO RECOVERY WINDOW OF 3 DAYS');
ALTER DATABASE ENABLE BLOCK CHANGE TRACKING USING FILE '/u01/app/oracle/changetracking/rman_change_track.ctf' REUSE;
ALTER DATABASE RENAME FILE 'MISSING00005' TO '/u01/app/oracle/oradata/utf8test/tbs_read01.dbf';
ALTER TABLESPACE "TBS_READ" ONLINE;
ALTER TABLESPACE TEMP ADD TEMPFILE '/u01/app/oracle/oradata/utf8test/temp01.dbf' REUSE;
VARIABLE RECNO NUMBER;
EXECUTE :RECNO := SYS.DBMS_BACKUP_RESTORE.SETCONFIG('BACKUP OPTIMIZATION','ON');
VARIABLE RECNO NUMBER;
SQL>
PL/SQL procedure successfully completed.
SQL> SQL> EXECUTE :RECNO := SYS.DBMS_BACKUP_RESTORE.SETCONFIG('RETENTION POLICY','TO RECOVERY WINDOW OF 3 DAYS');
PL/SQL procedure successfully completed.
SQL> ALTER DATABASE ENABLE BLOCK CHANGE TRACKING USING FILE '/u01/app/oracle/changetracking/rman_change_track.ctf' REUSE;
Database altered.
SQL> ALTER DATABASE RENAME FILE 'MISSING00005' TO '/u01/app/oracle/oradata/utf8test/tbs_read01.dbf';
Database altered.
SQL> ALTER TABLESPACE "TBS_READ" ONLINE;
Tablespace altered.
SQL> ALTER TABLESPACE TEMP ADD TEMPFILE '/u01/app/oracle/oradata/utf8test/temp01.dbf' REUSE;
Tablespace altered.
SQL>
最后不要忘记全备数据库。