A record of recovering an Oracle database after a power failure

2021-08-09 16:25:36

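The laptop was not plugged into power, so the virtual machines shut down abnormally. When the database was started again it failed: the testdb database would not start and reported the following errors: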

$ORACLE_SID: testdb1
$ORACLE_HOME: /oracle/db11g
$GRID_HOME: /grid/grid_home
 
oracle@testdb1[testdb1]/home/oracle$srvctl start database -d testdb
PRCR-1079 : Failed to start resource ora.testdb.db
CRS-5017: The resource action "ora.testdb.db start" encountered the following error: 
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [159], [26797], [26821], [], [], [], [], [], [], []
. For details refer to "(:CLSN00107:)" in "/grid/grid_home/log/testdb2/agent/crsd/oraagent_oracle/oraagent_oracle.log".
CRS-5017: The resource action "ora.testdb.db start" encountered the following error: 
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [159], [26797], [26821], [], [], [], [], [], [], []
. For details refer to "(:CLSN00107:)" in "/grid/grid_home/log/testdb1/agent/crsd/oraagent_oracle/oraagent_oracle.log".
CRS-2674: Start of 'ora.testdb.db' on 'testdb2' failed
CRS-2674: Start of 'ora.testdb.db' on 'testdb1' failed
CRS-2632: There are no more servers to try to place resource 'ora.testdb.db' on that would satisfy its placement policy

Check the alert log:

oracle@testdb1[testdb1]/home/oracle$cd /oracle/diag/rdbms/testdb/testdb1/trace
oracle@testdb1[testdb1]/oracle/diag/rdbms/testdb/testdb1/trace$tail -100 alert_testdb1.log 
lmon registered with NM - instance number 1 (internal mem no 0)
Reconfiguration started (old inc 0, new inc 3)
List of instances:
 1 2 (myinst: 1) 
 Global Resource Directory frozen
* allocate domain 0, invalid = TRUE 
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info 
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
Sun May 16 09:28:21 2021
LCK0 started with pid=28, OS id=5737 
Starting background process RSMN
Sun May 16 09:28:21 2021
RSMN started with pid=29, OS id=5743 
ORACLE_BASE not set in environment. It is recommended
that ORACLE_BASE be set in the environment
Reusing ORACLE_BASE from an earlier startup = /oracle
Sun May 16 09:28:22 2021
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=172.16.0.11)(PORT=1521))' SCOPE=MEMORY SID='testdb1';
ALTER DATABASE MOUNT /* db agent *//* {1:27802:343} */
Sun May 16 09:28:24 2021
Sweep [inc][59045]: completed
Sweep [inc2][59045]: completed
NOTE: Loaded library: System 
SUCCESS: diskgroup DGSYS was mounted
NOTE: dependency between database testdb and diskgroup resource ora.DGSYS.dg is established
Sun May 16 09:28:32 2021
Successful mount of redo thread 1, with mount id 2855981366
Sun May 16 09:28:32 2021
Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
Lost write protection disabled
Completed: ALTER DATABASE MOUNT /* db agent *//* {1:27802:343} */
ALTER DATABASE OPEN /* db agent *//* {1:27802:343} */
This instance was first to open
Sun May 16 09:28:33 2021
SUCCESS: diskgroup DGDATA was mounted
SUCCESS: diskgroup DGARCH was mounted
Block change tracking file is current.
Sun May 16 09:28:33 2021
NOTE: dependency between database testdb and diskgroup resource ora.DGDATA.dg is established
Beginning crash recovery of 2 threads
NOTE: dependency between database testdb and diskgroup resource ora.DGARCH.dg is established
Sun May 16 09:28:34 2021
Reconfiguration started (old inc 3, new inc 5)
List of instances:
 1 (myinst: 1) 
 Global Resource Directory frozen
 * dead instance detected - domain 0 invalid = TRUE 
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Sun May 16 09:28:34 2021
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info 
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
 parallel recovery started with 2 processes
Started redo scan
Completed redo scan
 read 3196 KB redo, 935 data blocks need recovery
Errors in file /oracle/diag/rdbms/testdb/testdb1/trace/testdb1_ora_5745.trc  (incident=60246):
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [159], [26797], [26821], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/testdb/testdb1/incident/incdir_60246/testdb1_ora_5745_i60246.trc
Sun May 16 09:28:35 2021
Dumping diagnostic data in directory=[cdmp_20210516092835], requested by (instance=1, osid=5745), summary=[incident=60246].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Abort recovery for domain 0
Aborting crash recovery due to error 600
Errors in file /oracle/diag/rdbms/testdb/testdb1/trace/testdb1_ora_5745.trc:
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [159], [26797], [26821], [], [], [], [], [], [], []
Abort recovery for domain 0
Errors in file /oracle/diag/rdbms/testdb/testdb1/trace/testdb1_ora_5745.trc:
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [159], [26797], [26821], [], [], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN /* db agent *//* {1:27802:343} */...
NOTE: Deferred communication with ASM instance
NOTE: deferred map free for map id 21
Sun May 16 09:28:35 2021
Shutting down instance (abort)
License high water mark = 3
USER (ospid: 5839): terminating the instance
Instance terminated by USER, pid = 5839
Sun May 16 09:28:35 2021
Instance shutdown complete
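
The alert log repeats the same ORA-00600 several times, so it can help to condense it to the distinct first arguments. The following is only a minimal sketch and was not part of the original session; the function name ora600_summary is made up for illustration, and it assumes nothing beyond the message format shown in the log above.

```shell
#!/bin/sh
# ora600_summary FILE
# Print each distinct ORA-00600 first argument found in FILE, with its count.
# (Hypothetical helper; relies only on the message format seen in the alert log.)
ora600_summary() (
    grep -o 'ORA-00600: internal error code, arguments: \[[^]]*]' "$1" |
        sed 's/.*arguments: \[\(.*\)]$/\1/' |
        sort | uniq -c | sort -rn
)

# Example:
#   ora600_summary /oracle/diag/rdbms/testdb/testdb1/trace/alert_testdb1.log
```

Searching My Oracle Support for the first argument is usually the quickest way to find the matching note for an ORA-00600.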

First, start the database in mount state:

oracle@testdb1[testdb1]/oracle/diag/rdbms/testdb/testdb1/trace$srvctl start database -d testdb -o mount

oracle@testdb1[testdb1]/oracle/diag/rdbms/testdb/testdb1/trace$crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ACFS01.dg
               ONLINE  ONLINE       testdb1                                      
               ONLINE  ONLINE       testdb2                                      
ora.DGARCH.dg
               ONLINE  ONLINE       testdb1                                      
               ONLINE  ONLINE       testdb2                                      
ora.DGDATA.dg
               ONLINE  ONLINE       testdb1                                      
               ONLINE  ONLINE       testdb2                                      
ora.DGSYS.dg
               ONLINE  ONLINE       testdb1                                      
               ONLINE  ONLINE       testdb2                                      
ora.LISTENER.lsnr
               ONLINE  ONLINE       testdb1                                      
               ONLINE  ONLINE       testdb2                                      
ora.OCR_VOTE.dg
               ONLINE  ONLINE       testdb1                                      
               ONLINE  ONLINE       testdb2                                      
ora.asm
               ONLINE  ONLINE       testdb1                  Started             
               ONLINE  ONLINE       testdb2                  Started             
ora.gsd
               OFFLINE OFFLINE      testdb1                                      
               OFFLINE OFFLINE      testdb2                                      
ora.net1.network
               ONLINE  ONLINE       testdb1                                      
               ONLINE  ONLINE       testdb2                                      
ora.ons
               ONLINE  ONLINE       testdb1                                      
               ONLINE  ONLINE       testdb2                                      
ora.registry.acfs
               ONLINE  ONLINE       testdb1                                      
               ONLINE  ONLINE       testdb2                                      
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       testdb1                                      
ora.cvu
      1        ONLINE  ONLINE       testdb1                                      
ora.immall.db
      1        OFFLINE OFFLINE                               Instance Shutdown   
      2        OFFLINE OFFLINE                               Instance Shutdown   
ora.lisy.db
      1        OFFLINE OFFLINE                               Instance Shutdown   
      2        OFFLINE OFFLINE                               Instance Shutdown   
ora.oc4j
      1        ONLINE  ONLINE       testdb1                                      
ora.scan1.vip
      1        ONLINE  ONLINE       testdb1                                      
ora.testdb.db
      1        ONLINE  INTERMEDIATE testdb1                  Mounted (Closed)    
      2        ONLINE  INTERMEDIATE testdb2                  Mounted (Closed)    
ora.testdb1.vip
      1        ONLINE  ONLINE       testdb1                                      
ora.testdb2.vip
      1        ONLINE  ONLINE       testdb2
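
The full resource table is long when only one resource matters. As a rough sketch (the helper name resource_state is hypothetical, and it assumes the tabular layout printed by crsctl stat res -t above), the per-instance lines of a single resource can be filtered out like this:

```shell
#!/bin/sh
# resource_state NAME
# Read `crsctl stat res -t` output on stdin and print the per-instance
# lines that follow resource NAME, with surrounding whitespace trimmed.
resource_state() (
    awk -v res="$1" '
        NF == 1 && $1 == res { found = 1; next }   # name is alone on its line
        found && /^[ \t]/ {                        # indented lines belong to it
            sub(/^[ \t]+/, ""); sub(/[ \t]+$/, "")
            print
            next
        }
        found { exit }                             # next resource starts here
    '
)

# Example:
#   crsctl stat res -t | resource_state ora.testdb.db
```

For ora.testdb.db this should leave just the two instance lines, both showing Mounted (Closed).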

Let's look at the archive log directories:

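I configured two archive destinations: +dgarch, and /arch on each of the two nodes. The online redo logs are under +DGSYS/testdb/. Because the recovery runs on a single node, it is best to gather the /arch files from both nodes into one directory. We will not use those files for this recovery, though; we will use the archives in +dgarch together with the online redo logs. The steps are as follows: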

oracle@testdb1[testdb1]/oracle/diag/rdbms/testdb/testdb1/trace$sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Sun May 16 09:31:38 2021
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> archive log list;
Database log mode       Archive Mode
Automatic archival       Enabled
Archive destination       /arch
Oldest online log sequence     158
Next log sequence to archive   159
Current log sequence       159
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
oracle@testdb1[testdb1]/oracle/diag/rdbms/testdb/testdb1/trace$df
Filesystem           1K-blocks     Used Available Use% Mounted on
/dev/sda2             40648268 13076936  25499832  34% /
tmpfs                  4023188   268284   3754904   7% /dev/shm
/dev/sda1               487652    41355    420697   9% /boot
/dev/mapper/vg00-lv_oracle
                      41153856  4595688  34461016  12% /oracle
/dev/mapper/vg00-lv_grid
                      41153856  5092452  33964252  14% /grid
/dev/mapper/vg00-qd_log
                       5029504  1844488   2922872  39% /arch
/dev/asm/dbbak-23      9437184  1197176   8240008  13% /dbbak
/dev/asm/acfs-23      10485760   397436  10088324   4% /odc
oracle@testdb1[testdb1]/oracle/diag/rdbms/testdb/testdb1/trace$cd /arch/
oracle@testdb1[testdb1]/arch$ls
1_100_971729936.dbf  1_121_971729936.dbf  1_137_971729936.dbf  1_153_971729936.dbf  1_63_971729936.dbf  1_75_971729936.dbf  1_87_971729936.dbf   2_56_971729936.dbf  2_83_971729936.dbf
1_101_971729936.dbf  1_122_971729936.dbf  1_138_971729936.dbf  1_158_971729936.dbf  1_64_971729936.dbf  1_76_971729936.dbf  1_88_971729936.dbf   2_58_971729936.dbf  2_84_971729936.dbf
1_102_971729936.dbf  1_125_971729936.dbf  1_139_971729936.dbf  1_53_971729936.dbf   1_65_971729936.dbf  1_77_971729936.dbf  1_89_971729936.dbf   2_59_971729936.dbf  2_85_971729936.dbf
1_103_971729936.dbf  1_126_971729936.dbf  1_140_971729936.dbf  1_54_971729936.dbf   1_66_971729936.dbf  1_78_971729936.dbf  1_90_971729936.dbf   2_61_971729936.dbf  2_89_971729936.dbf
1_106_971729936.dbf  1_127_971729936.dbf  1_143_971729936.dbf  1_55_971729936.dbf   1_67_971729936.dbf  1_79_971729936.dbf  1_91_971729936.dbf   2_62_971729936.dbf  2_90_971729936.dbf
1_107_971729936.dbf  1_128_971729936.dbf  1_144_971729936.dbf  1_56_971729936.dbf   1_68_971729936.dbf  1_80_971729936.dbf  1_92_971729936.dbf   2_66_971729936.dbf  2_95_971729936.dbf
1_115_971729936.dbf  1_129_971729936.dbf  1_145_971729936.dbf  1_57_971729936.dbf   1_69_971729936.dbf  1_81_971729936.dbf  1_93_971729936.dbf   2_67_971729936.dbf  2_96_971729936.dbf
1_116_971729936.dbf  1_130_971729936.dbf  1_148_971729936.dbf  1_58_971729936.dbf   1_70_971729936.dbf  1_82_971729936.dbf  2_108_971729936.dbf  2_78_971729936.dbf  2_97_971729936.dbf
1_117_971729936.dbf  1_131_971729936.dbf  1_149_971729936.dbf  1_59_971729936.dbf   1_71_971729936.dbf  1_83_971729936.dbf  2_109_971729936.dbf  2_79_971729936.dbf  2_98_971729936.dbf
1_118_971729936.dbf  1_132_971729936.dbf  1_150_971729936.dbf  1_60_971729936.dbf   1_72_971729936.dbf  1_84_971729936.dbf  2_111_971729936.dbf  2_80_971729936.dbf  lost+found
1_119_971729936.dbf  1_133_971729936.dbf  1_151_971729936.dbf  1_61_971729936.dbf   1_73_971729936.dbf  1_85_971729936.dbf  2_112_971729936.dbf  2_81_971729936.dbf
1_120_971729936.dbf  1_136_971729936.dbf  1_152_971729936.dbf  1_62_971729936.dbf   1_74_971729936.dbf  1_86_971729936.dbf  2_55_971729936.dbf   2_82_971729936.dbf
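
Scanning a listing like this by eye for a particular thread and sequence is error-prone. Below is a small sketch, under the assumption that archives follow the THREAD_SEQUENCE_RESETLOGSID.dbf naming seen above; the function name missing_archives is invented for this example.

```shell
#!/bin/sh
# missing_archives DIR THREAD FIRST LAST
# Print every sequence number in FIRST..LAST for which DIR holds no file
# named THREAD_SEQ_*.dbf (the naming pattern visible in the listing above).
missing_archives() (
    dir=$1 thread=$2 seq=$3 last=$4
    while [ "$seq" -le "$last" ]; do
        # If the glob matches nothing it stays literal and the -e test fails.
        set -- "$dir/${thread}_${seq}_"*.dbf
        [ -e "$1" ] || echo "$seq"
        seq=$((seq + 1))
    done
)

# Example:
#   missing_archives /arch 2 108 117
```

Against the listing above, sequence 117 of thread 2 would be reported as missing, since /arch holds no 2_117_971729936.dbf.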


oracle@testdb1[testdb1]/arch$sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Sun May 16 09:33:10 2021
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

Try opening the database once:
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01139: RESETLOGS option only valid after an incomplete database recovery


-- Try a recover and then open the database; the open still fails with ORA-00600 (this time with argument [kcrfr_read_5])
SQL> recover database;
Media recovery complete.

-- Try opening again
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kcrfr_read_5], [159], [26797], [],
[], [], [], [], [], [], [], []

-- Check the actual state of the redo logs: sequences 159 and 117 (groups 1 and 3) are the CURRENT online logs
SQL> select group#,sequence#,status,first_time,next_change# from v$log;
    GROUP#  SEQUENCE# STATUS           FIRST_TIME   NEXT_CHANGE#
---------- ---------- ---------------- ------------ ------------
         1        159 CURRENT          15-MAY-21      2.8147E+14
         2        158 INACTIVE         15-MAY-21         3062533
         3        117 CURRENT          15-MAY-21      2.8147E+14
         4        116 INACTIVE         15-MAY-21         3062527
 
SQL> col member for a30
SQL> /

    GROUP# STATUS  TYPE    MEMBER                         IS_
---------- ------- ------- ------------------------------ ---
         1         ONLINE  +DGSYS/testdb/redo01.log       NO
         2         ONLINE  +DGSYS/testdb/redo02.log       NO
         3         ONLINE  +DGSYS/testdb/redo03.log       NO
         4         ONLINE  +DGSYS/testdb/redo04.log       NO
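
The two listings above have to be matched by group number to see which file holds which sequence, and neither shows the thread. One way to get everything in a single result is to join v$log and v$logfile on GROUP#. This is a sketch rather than what was run in the original session: the wrapper name log_member_query is hypothetical, and it only prints the SQL*Plus script so that it can be piped into sqlplus.

```shell
#!/bin/sh
# log_member_query
# Print a SQL*Plus script that lists thread, group, sequence, status and
# member file side by side by joining v$log and v$logfile on GROUP#.
log_member_query() {
    cat <<'SQL'
set linesize 200 pagesize 100
col member for a40
select l.thread#, l.group#, l.sequence#, l.status, f.member
  from v$log l
  join v$logfile f on f.group# = l.group#
 order by l.thread#, l.group#;
exit
SQL
}

# Example (as the oracle OS user, with the instance mounted):
#   log_member_query | sqlplus -s / as sysdba
```

Matching the two listings above by hand gives the same mapping: sequence 159 is in group 1 (+DGSYS/testdb/redo01.log) and sequence 117 is in group 3 (+DGSYS/testdb/redo03.log).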
	 

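Because the server lost power unexpectedly, LGWR's write to the online redo log failed. On the next startup the database has to perform instance (crash) recovery, but it cannot read the required redo from the online log files, because that last write never completed when the power went out.
-- As the v$log query above shows, the current log sequences are 159 and 117 (groups 1 and 3).
-- Recover the database, specifying the redo0.log file. First I followed the suggested file name and found that /arch/ has no archive for sequence 117. (I had earlier copied the archived logs into /arch on the node where I ran the recovery.)
The test went as follows: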

SQL> recover database until cancel using backup controlfile;
ORA-00279: change 3104281 generated at 05/15/2021 19:00:50 needed for thread 2
ORA-00289: suggestion : /arch/2_117_971729936.dbf
ORA-00280: change 3104281 for thread 2 is in sequence #117
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/arch/2_117_971729936.dbf                                
ORA-00308: cannot open archived log
