处理过程参考了:
https://blogs.oracle.com/Database4CN/entry/%E5%A6%82%E4%BD%95%E9%87%8D%E5%BB%BArac%E7%9A%84%E6%8E%A7%E5%88%B6%E6%96%87%E4%BB%B6
问题现象:
现场有学校提报 登录PL/SQL连接数据库是报错“ORA-12541: TNS:无监听程序 ”;排查日志,发现
Tue Nov 25 14:46:58 2014
Thread 2 advanced to log sequence 18098 (LGWR switch)
Current log# 4 seq# 18098 mem# 0: +DATADG/urpdb/onlinelog/group_4.266.821439273
Tue Nov 25 14:46:58 2014
Archived Log entry 50394 added for thread 2 sequence 18097 ID 0xb0471722 dest 1:
Tue Nov 25 14:54:49 2014
Read from controlfile member '+DATADG/urpdb/controlfile/current.260.821439141' has found a corrupted block (blk# 35, cf seq# 0)
Hex dump of (file 0, block 35) in trace file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_lmon_10377.trc
Corrupt block relative dba: 0x00000023 (file 0, block 35)
Completely zero block found during control file block read
Hex dump of (file 0, block 35) in trace file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_lmon_10377.trc
Corrupt block relative dba: 0x00000023 (file 0, block 35)
Completely zero block found during control file block read
Errors in file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_lmon_10377.trc:
ORA-00202: control file: '+DATADG/urpdb/controlfile/current.260.821439141'
Errors in file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_lmon_10377.trc (incident=15289):
ORA-00227: corrupt block detected in control file: (block 35, # blocks 1)
ORA-00202: control file: '+DATADG/urpdb/controlfile/current.260.821439141'
Incident details in: /u01/app/oracle/diag/rdbms/urpdb/urpdb2/incident/incdir_15289/urpdb2_lmon_10377_i15289.trc
Tue Nov 25 14:54:51 2014
Dumping diagnostic data in directory=[cdmp_20141125145451], requested by (instance=2, osid=10377 (LMON)), summary=[incident=15289].
Errors in file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_lmon_10377.trc:
ORA-00227: corrupt block detected in control file: (block 35, # blocks 1)
ORA-00202: control file: '+DATADG/urpdb/controlfile/current.260.821439141'
LMON (ospid: 10377): terminating the instance due to error 227
System state dump requested by (instance=2, osid=10377 (LMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_diag_10357.trc
Instance terminated by LMON, pid = 10377
Thread 2 advanced to log sequence 18098 (LGWR switch)
Current log# 4 seq# 18098 mem# 0: +DATADG/urpdb/onlinelog/group_4.266.821439273
Tue Nov 25 14:46:58 2014
Archived Log entry 50394 added for thread 2 sequence 18097 ID 0xb0471722 dest 1:
Tue Nov 25 14:54:49 2014
Read from controlfile member '+DATADG/urpdb/controlfile/current.260.821439141' has found a corrupted block (blk# 35, cf seq# 0)
Hex dump of (file 0, block 35) in trace file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_lmon_10377.trc
Corrupt block relative dba: 0x00000023 (file 0, block 35)
Completely zero block found during control file block read
Hex dump of (file 0, block 35) in trace file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_lmon_10377.trc
Corrupt block relative dba: 0x00000023 (file 0, block 35)
Completely zero block found during control file block read
Errors in file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_lmon_10377.trc:
ORA-00202: control file: '+DATADG/urpdb/controlfile/current.260.821439141'
Errors in file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_lmon_10377.trc (incident=15289):
ORA-00227: corrupt block detected in control file: (block 35, # blocks 1)
ORA-00202: control file: '+DATADG/urpdb/controlfile/current.260.821439141'
Incident details in: /u01/app/oracle/diag/rdbms/urpdb/urpdb2/incident/incdir_15289/urpdb2_lmon_10377_i15289.trc
Tue Nov 25 14:54:51 2014
Dumping diagnostic data in directory=[cdmp_20141125145451], requested by (instance=2, osid=10377 (LMON)), summary=[incident=15289].
Errors in file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_lmon_10377.trc:
ORA-00227: corrupt block detected in control file: (block 35, # blocks 1)
ORA-00202: control file: '+DATADG/urpdb/controlfile/current.260.821439141'
LMON (ospid: 10377): terminating the instance due to error 227
System state dump requested by (instance=2, osid=10377 (LMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/urpdb/urpdb2/trace/urpdb2_diag_10357.trc
Instance terminated by LMON, pid = 10377
从日志可以看出,控制文件损坏了,导致数据库实例终止。
解决办法:
因为日志文件、数据文件都是好的,只是唯一的一份控制文件损坏了,只能通过重建控制文件方法解决;
在RAC上重建控制文件与单实例有一些小区别:
在重建控制文件前必须设置cluster_database=false,而且只启动一个实例来执行操作,否则会报错
ORA-01503: CREATE CONTROLFILE failed
ORA-12720: operation requires database is in EXCLUSIVE mode
ORA-01503: CREATE CONTROLFILE failed
ORA-12720: operation requires database is in EXCLUSIVE mode
重建完控制文件后,需要再将cluster_database设为true.
处理步骤:
1.首先生成重建控制文件的脚本:
SQL> alter database backup controlfile to trace;
SQL> alter database backup controlfile to trace;
2.数据库的Alert log中会详细输出这个文件的路径和名字:
Tue Nov 25 22:32:04 2014
alter database backup controlfile to trace
Backup controlfile written to trace file /u01/app/oracle/diag/rdbms/urpdb/urpdb1/trace/urpdb1_ora_6233.trc
Completed: alter database backup controlfile to trace
3.启动一个实例
SQL> startup nomount;
4.修改cluster_database属性
SQL> alter system set cluster_database=false scope=spfile;
SQL> shutdown immediate
SQL> startup nomount;
5.从urpdb1_ora_6233.trc 中获取创建控制文件脚本(noresetlogs的)执行
SQL> CREATE CONTROLFILE REUSE DATABASE "URPDB" NORESETLOGS ARCHIVELOG
MAXLOGFILES 192
MAXLOGMEMBERS 3
MAXDATAFILES 1024
MAXINSTANCES 32
MAXLOGHISTORY 18688
LOGFILE
GROUP 1 '+DATADG/urpdb/onlinelog/group_1.261.821439143' SIZE 50M BLOCKSIZE 512,
GROUP 2 '+DATADG/urpdb/onlinelog/group_2.262.821439143' SIZE 50M BLOCKSIZE 512,
GROUP 3 '+DATADG/urpdb/onlinelog/group_3.265.821439273' SIZE 50M BLOCKSIZE 512,
GROUP 4 '+DATADG/urpdb/onlinelog/group_4.266.821439273' SIZE 50M BLOCKSIZE 512
DATAFILE
'+DATADG/urpdb/datafile/system.256.821439061',
'+DATADG/urpdb/datafile/sysaux.257.821439061',
'+DATADG/urpdb/datafile/undotbs1.258.821439061',
'+DATADG/urpdb/datafile/users.259.821439061',
'+DATADG/urpdb/datafile/undotbs2.264.821439205',
'+DATADG/urpdb/datafile/tsp_imp_d.dbf',
'+DATADG/urpdb/datafile/tsp_dpstar.dbf',
'+DATADG/urpdb/datafile/ts_portal_d.dbf',
'+DATADG/urpdb/datafile/ts_w5_d.dbf',
'+DATADG/urpdb/datafile/ts_wap_d.dbf',
'+DATADG/urpdb/datafile/ts_oa_d.dbf',
'+DATADG/urpdb/datafile/tsp_sastar.dbf',
'+DATADG/urpdb/datafile/tsp_portal.dbf'
CHARACTER SET AL32UTF8
;
MAXLOGFILES 192
MAXLOGMEMBERS 3
MAXDATAFILES 1024
MAXINSTANCES 32
MAXLOGHISTORY 18688
LOGFILE
GROUP 1 '+DATADG/urpdb/onlinelog/group_1.261.821439143' SIZE 50M BLOCKSIZE 512,
GROUP 2 '+DATADG/urpdb/onlinelog/group_2.262.821439143' SIZE 50M BLOCKSIZE 512,
GROUP 3 '+DATADG/urpdb/onlinelog/group_3.265.821439273' SIZE 50M BLOCKSIZE 512,
GROUP 4 '+DATADG/urpdb/onlinelog/group_4.266.821439273' SIZE 50M BLOCKSIZE 512
DATAFILE
'+DATADG/urpdb/datafile/system.256.821439061',
'+DATADG/urpdb/datafile/sysaux.257.821439061',
'+DATADG/urpdb/datafile/undotbs1.258.821439061',
'+DATADG/urpdb/datafile/users.259.821439061',
'+DATADG/urpdb/datafile/undotbs2.264.821439205',
'+DATADG/urpdb/datafile/tsp_imp_d.dbf',
'+DATADG/urpdb/datafile/tsp_dpstar.dbf',
'+DATADG/urpdb/datafile/ts_portal_d.dbf',
'+DATADG/urpdb/datafile/ts_w5_d.dbf',
'+DATADG/urpdb/datafile/ts_wap_d.dbf',
'+DATADG/urpdb/datafile/ts_oa_d.dbf',
'+DATADG/urpdb/datafile/tsp_sastar.dbf',
'+DATADG/urpdb/datafile/tsp_portal.dbf'
CHARACTER SET AL32UTF8
;
6.确认redo log,thread#是正确的
SQL> select * from v$log;
SQL> select * from v$log;
7.恢复数据库、启动数据库、恢复临时文件(这些步骤,在urpdb1_ora_6233.trc 写的很清楚)
SQL> recover database;
Media recovery complete.
SQL> alter system archive log all;
SQL> alter database open;
SQL> ALTER TABLESPACE TEMP ADD TEMPFILE '+DATADG/urpdb/tempfile/temp.263.821439145' REUSE;
SQL> ALTER TABLESPACE TSP_IMP_TEMP ADD TEMPFILE '+DATADG/urpdb/tempfile/tsp_imp_temp.dbf' REUSE;
SQL> ALTER TABLESPACE TSP_DPSTAR_TEMP ADD TEMPFILE '+DATADG/urpdb/tempfile/tsp_dpstar_temp.dbf' REUSE;
SQL> ALTER TABLESPACE TS_PORTAL_TEMP ADD TEMPFILE '+DATADG/urpdb/datafile/ts_portal_temp.dbf' REUSE;
SQL> ALTER TABLESPACE TS_W5_TEMP ADD TEMPFILE '+DATADG/urpdb/datafile/ts_w5_temp.dbf' REUSE;
SQL> ALTER TABLESPACE TS_WAP_TEMP ADD TEMPFILE '+DATADG/urpdb/datafile/ts_wap_temp.dbf' REUSE;
8.还原cluster_database属性
SQL> alter system set cluster_database=true scope=spfile;
SQL> shutdown immediate
9.重启集群服务
./crsctl start cluser 问题解决。
10.登录另外一台服务器查看数据库状态,如有问题进行相应处理。
总结:
1.开启数据库控制文件自动备份很重要:
本数据库由于没有打开控制文件自动备份功能,只能采用重建方法。如果开启了,采用该控制文件,进行recover database using backup controlfile会更简单一些(有待测试验证)