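Because the laptop was not plugged into power, the virtual machines shut down abnormally. The next time the database was started it failed: the testdb database would not start and reported the following errors: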
$ORACLE_SID: testdb1
$ORACLE_HOME: /oracle/db11g
$GRID_HOME: /grid/grid_home
oracle@testdb1[testdb1]/home/oracle$srvctl start database -d testdb
PRCR-1079 : Failed to start resource ora.testdb.db
CRS-5017: The resource action "ora.testdb.db start" encountered the following error:
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [159], [26797], [26821], [], [], [], [], [], [], []
. For details refer to "(:CLSN00107:)" in "/grid/grid_home/log/testdb2/agent/crsd/oraagent_oracle/oraagent_oracle.log".
CRS-5017: The resource action "ora.testdb.db start" encountered the following error:
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [159], [26797], [26821], [], [], [], [], [], [], []
. For details refer to "(:CLSN00107:)" in "/grid/grid_home/log/testdb1/agent/crsd/oraagent_oracle/oraagent_oracle.log".
CRS-2674: Start of 'ora.testdb.db' on 'testdb2' failed
CRS-2674: Start of 'ora.testdb.db' on 'testdb1' failed
CRS-2632: There are no more servers to try to place resource 'ora.testdb.db' on that would satisfy its placement policy
Check the alert log:
oracle@testdb1[testdb1]/home/oracle$cd /oracle/diag/rdbms/testdb/testdb1/trace
oracle@testdb1[testdb1]/oracle/diag/rdbms/testdb/testdb1/trace$tail -100 alert_testdb1.log
lmon registered with NM - instance number 1 (internal mem no 0)
Reconfiguration started (old inc 0, new inc 3)
List of instances:
1 2 (myinst: 1)
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Sun May 16 09:28:21 2021
LCK0 started with pid=28, OS id=5737
Starting background process RSMN
Sun May 16 09:28:21 2021
RSMN started with pid=29, OS id=5743
ORACLE_BASE not set in environment. It is recommended that ORACLE_BASE be set in the environment
Reusing ORACLE_BASE from an earlier startup = /oracle
Sun May 16 09:28:22 2021
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=172.16.0.11)(PORT=1521))' SCOPE=MEMORY SID='testdb1';
ALTER DATABASE MOUNT /* db agent *//* {1:27802:343} */
Sun May 16 09:28:24 2021
Sweep [inc][59045]: completed
Sweep [inc2][59045]: completed
NOTE: Loaded library: System
SUCCESS: diskgroup DGSYS was mounted
NOTE: dependency between database testdb and diskgroup resource ora.DGSYS.dg is established
Sun May 16 09:28:32 2021
Successful mount of redo thread 1, with mount id 2855981366
Sun May 16 09:28:32 2021
Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
Lost write protection disabled
Completed: ALTER DATABASE MOUNT /* db agent *//* {1:27802:343} */
ALTER DATABASE OPEN /* db agent *//* {1:27802:343} */
This instance was first to open
Sun May 16 09:28:33 2021
SUCCESS: diskgroup DGDATA was mounted
SUCCESS: diskgroup DGARCH was mounted
Block change tracking file is current.
Sun May 16 09:28:33 2021
NOTE: dependency between database testdb and diskgroup resource ora.DGDATA.dg is established
Beginning crash recovery of 2 threads
NOTE: dependency between database testdb and diskgroup resource ora.DGARCH.dg is established
Sun May 16 09:28:34 2021
Reconfiguration started (old inc 3, new inc 5)
List of instances:
1 (myinst: 1)
Global Resource Directory frozen
* dead instance detected - domain 0 invalid = TRUE
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Sun May 16 09:28:34 2021
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
parallel recovery started with 2 processes
Started redo scan
Completed redo scan
read 3196 KB redo, 935 data blocks need recovery
Errors in file /oracle/diag/rdbms/testdb/testdb1/trace/testdb1_ora_5745.trc (incident=60246):
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [159], [26797], [26821], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/testdb/testdb1/incident/incdir_60246/testdb1_ora_5745_i60246.trc
Sun May 16 09:28:35 2021
Dumping diagnostic data in directory=[cdmp_20210516092835], requested by (instance=1, osid=5745), summary=[incident=60246].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Abort recovery for domain 0
Aborting crash recovery due to error 600
Errors in file /oracle/diag/rdbms/testdb/testdb1/trace/testdb1_ora_5745.trc:
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [159], [26797], [26821], [], [], [], [], [], [], []
Abort recovery for domain 0
Errors in file /oracle/diag/rdbms/testdb/testdb1/trace/testdb1_ora_5745.trc:
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [159], [26797], [26821], [], [], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN /* db agent *//* {1:27802:343} */...
NOTE: Deferred communication with ASM instance
NOTE: deferred map free for map id 21
Sun May 16 09:28:35 2021
Shutting down instance (abort)
License high water mark = 3
USER (ospid: 5839): terminating the instance
Instance terminated by USER, pid = 5839
Sun May 16 09:28:35 2021
Instance shutdown complete
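The same ORA-00600 line appears several times in the alert log, which makes it tedious to compare argument lists by eye. The following is a minimal shell sketch for pulling them out; `ora600_args` is a hypothetical helper name (not an Oracle tool), and it assumes each ORA-00600 message sits on its own line, as it does in an unflattened alert log or trace file.

```shell
# Hypothetical helper (not part of Oracle): print each distinct ORA-00600
# argument list found in an alert log or trace file, without the trailing
# empty "[]" arguments.
ora600_args() {
    # $1: path to the alert log or trace file to scan
    grep -o 'ORA-00600: internal error code, arguments: .*' "$1" \
        | sed -e 's/^.*arguments: //' -e 's/, \[\].*$//' \
        | sort -u
}

# Example call (path taken from the session above):
#   ora600_args /oracle/diag/rdbms/testdb/testdb1/trace/alert_testdb1.log
```

Each distinct argument list is printed once, so a change in the first argument between two startup attempts stands out immediately.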
First, start the database in mount state:
oracle@testdb1[testdb1]/oracle/diag/rdbms/testdb/testdb1/trace$srvctl start database -d testdb -o mount
oracle@testdb1[testdb1]/oracle/diag/rdbms/testdb/testdb1/trace$crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ACFS01.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
ora.DGARCH.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
ora.DGDATA.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
ora.DGSYS.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
ora.LISTENER.lsnr
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
ora.OCR_VOTE.dg
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
ora.asm
               ONLINE  ONLINE       testdb1                  Started
               ONLINE  ONLINE       testdb2                  Started
ora.gsd
               OFFLINE OFFLINE      testdb1
               OFFLINE OFFLINE      testdb2
ora.net1.network
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
ora.ons
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
ora.registry.acfs
               ONLINE  ONLINE       testdb1
               ONLINE  ONLINE       testdb2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       testdb1
ora.cvu
      1        ONLINE  ONLINE       testdb1
ora.immall.db
      1        OFFLINE OFFLINE                               Instance Shutdown
      2        OFFLINE OFFLINE                               Instance Shutdown
ora.lisy.db
      1        OFFLINE OFFLINE                               Instance Shutdown
      2        OFFLINE OFFLINE                               Instance Shutdown
ora.oc4j
      1        ONLINE  ONLINE       testdb1
ora.scan1.vip
      1        ONLINE  ONLINE       testdb1
ora.testdb.db
      1        ONLINE  INTERMEDIATE testdb1                  Mounted (Closed)
      2        ONLINE  INTERMEDIATE testdb2                  Mounted (Closed)
ora.testdb1.vip
      1        ONLINE  ONLINE       testdb1
ora.testdb2.vip
      1        ONLINE  ONLINE       testdb2
Next, let's look at the archive destinations:
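I configured two archive destinations: +dgarch, and a local /arch directory on each of the two nodes. The database's online redo logs live under +DGSYS/testdb/. Because recovery runs on a single node, it is advisable to gather the /arch files from both nodes into one directory. We will not rely on those copies for this recovery, though; instead we will use the archives in +dgarch together with the online redo logs. The steps are shown in the terminal session below.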
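When the archives of two threads are spread over several directories, it helps to check quickly whether the log for a given thread and sequence is on hand. The following is a small sketch, not an Oracle utility: `find_archive` is a hypothetical name, and it assumes the archives are named <thread>_<sequence>_<resetlogs id>.dbf, which is the pattern of the files in /arch in this environment. The directories to search are passed in by the caller.

```shell
# Hypothetical helper: look for the archived log of one thread/sequence
# in the given directories and print the first match.
# Assumes file names of the form <thread>_<sequence>_<resetlogs id>.dbf.
find_archive() {
    thread=$1
    seq=$2
    shift 2
    for dir in "$@"; do
        # If nothing matches, the glob stays literal and the -f test fails.
        for f in "$dir"/"${thread}_${seq}"_*.dbf; do
            if [ -f "$f" ]; then
                printf '%s\n' "$f"
                return 0
            fi
        done
    done
    # Not found in any of the directories.
    return 1
}

# Example call:
#   find_archive 2 117 /arch || echo "thread 2 sequence 117 not in /arch"
```

The function returns a non-zero status when no directory holds a matching file, so it can be used directly in an `if` or with `||`.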
oracle@testdb1[testdb1]/oracle/diag/rdbms/testdb/testdb1/trace$sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Sun May 16 09:31:38 2021
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /arch
Oldest online log sequence     158
Next log sequence to archive   159
Current log sequence           159
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
oracle@testdb1[testdb1]/oracle/diag/rdbms/testdb/testdb1/trace$df
Filesystem                 1K-blocks     Used Available Use% Mounted on
/dev/sda2                   40648268 13076936  25499832  34% /
tmpfs                        4023188   268284   3754904   7% /dev/shm
/dev/sda1                     487652    41355    420697   9% /boot
/dev/mapper/vg00-lv_oracle  41153856  4595688  34461016  12% /oracle
/dev/mapper/vg00-lv_grid    41153856  5092452  33964252  14% /grid
/dev/mapper/vg00-qd_log      5029504  1844488   2922872  39% /arch
/dev/asm/dbbak-23            9437184  1197176   8240008  13% /dbbak
/dev/asm/acfs-23            10485760   397436  10088324   4% /odc
oracle@testdb1[testdb1]/oracle/diag/rdbms/testdb/testdb1/trace$cd /arch/
oracle@testdb1[testdb1]/arch$ls
1_100_971729936.dbf  1_121_971729936.dbf  1_137_971729936.dbf  1_153_971729936.dbf  1_63_971729936.dbf  1_75_971729936.dbf  1_87_971729936.dbf   2_56_971729936.dbf  2_83_971729936.dbf
1_101_971729936.dbf  1_122_971729936.dbf  1_138_971729936.dbf  1_158_971729936.dbf  1_64_971729936.dbf  1_76_971729936.dbf  1_88_971729936.dbf   2_58_971729936.dbf  2_84_971729936.dbf
1_102_971729936.dbf  1_125_971729936.dbf  1_139_971729936.dbf  1_53_971729936.dbf   1_65_971729936.dbf  1_77_971729936.dbf  1_89_971729936.dbf   2_59_971729936.dbf  2_85_971729936.dbf
1_103_971729936.dbf  1_126_971729936.dbf  1_140_971729936.dbf  1_54_971729936.dbf   1_66_971729936.dbf  1_78_971729936.dbf  1_90_971729936.dbf   2_61_971729936.dbf  2_89_971729936.dbf
1_106_971729936.dbf  1_127_971729936.dbf  1_143_971729936.dbf  1_55_971729936.dbf   1_67_971729936.dbf  1_79_971729936.dbf  1_91_971729936.dbf   2_62_971729936.dbf  2_90_971729936.dbf
1_107_971729936.dbf  1_128_971729936.dbf  1_144_971729936.dbf  1_56_971729936.dbf   1_68_971729936.dbf  1_80_971729936.dbf  1_92_971729936.dbf   2_66_971729936.dbf  2_95_971729936.dbf
1_115_971729936.dbf  1_129_971729936.dbf  1_145_971729936.dbf  1_57_971729936.dbf   1_69_971729936.dbf  1_81_971729936.dbf  1_93_971729936.dbf   2_67_971729936.dbf  2_96_971729936.dbf
1_116_971729936.dbf  1_130_971729936.dbf  1_148_971729936.dbf  1_58_971729936.dbf   1_70_971729936.dbf  1_82_971729936.dbf  2_108_971729936.dbf  2_78_971729936.dbf  2_97_971729936.dbf
1_117_971729936.dbf  1_131_971729936.dbf  1_149_971729936.dbf  1_59_971729936.dbf   1_71_971729936.dbf  1_83_971729936.dbf  2_109_971729936.dbf  2_79_971729936.dbf  2_98_971729936.dbf
1_118_971729936.dbf  1_132_971729936.dbf  1_150_971729936.dbf  1_60_971729936.dbf   1_72_971729936.dbf  1_84_971729936.dbf  2_111_971729936.dbf  2_80_971729936.dbf  lost+found
1_119_971729936.dbf  1_133_971729936.dbf  1_151_971729936.dbf  1_61_971729936.dbf   1_73_971729936.dbf  1_85_971729936.dbf  2_112_971729936.dbf  2_81_971729936.dbf
1_120_971729936.dbf  1_136_971729936.dbf  1_152_971729936.dbf  1_62_971729936.dbf   1_74_971729936.dbf  1_86_971729936.dbf  2_55_971729936.dbf   2_82_971729936.dbf
oracle@testdb1[testdb1]/arch$sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Sun May 16 09:33:10 2021
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

-- Try opening the database once
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01139: RESETLOGS option only valid after an incomplete database recovery

-- Try a recover and then open the database; it still fails with ORA-00600
SQL> recover database;
Media recovery complete.

-- Try opening again
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kcrfr_read_5], [159], [26797], [], [], [], [], [], [], [], [], []

-- Check the actual state of the redo logs: sequences 159 and 117 are the CURRENT online logs
SQL> select group#,sequence#,status,first_time,next_change# from v$log;

    GROUP#  SEQUENCE# STATUS           FIRST_TIME   NEXT_CHANGE#
---------- ---------- ---------------- ------------ ------------
         1        159 CURRENT          15-MAY-21      2.8147E+14
         2        158 INACTIVE         15-MAY-21         3062533
         3        117 CURRENT          15-MAY-21      2.8147E+14
         4        116 INACTIVE         15-MAY-21         3062527

SQL> col member for a30
SQL> /

    GROUP# STATUS  TYPE    MEMBER                         IS_
---------- ------- ------- ------------------------------ ---
         1         ONLINE  +DGSYS/testdb/redo01.log       NO
         2         ONLINE  +DGSYS/testdb/redo02.log       NO
         3         ONLINE  +DGSYS/testdb/redo03.log       NO
         4         ONLINE  +DGSYS/testdb/redo04.log       NO

Because the server lost power abnormally, LGWR's write to the online redo log failed. On the next startup the database has to perform instance recovery, but it cannot get the required redo from the online log files, because the write did not complete when the power was cut.

-- As the v$log output above shows, the current log sequences are 159 and 117.
-- Recover the database, specifying the redo0.log log. First follow the file that the recovery suggests: it turns out that /arch/ does not contain the archive for sequence 117. (I had earlier placed the archives in /arch on the node I was working on.) The test is as follows:
SQL> recover database until cancel using backup controlfile;
ORA-00279: change 3104281 generated at 05/15/2021 19:00:50 needed for thread 2
ORA-00289: suggestion : /arch/2_117_971729936.dbf
ORA-00280: change 3104281 for thread 2 is in sequence #117
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/arch/2_117_971729936.dbf
ORA-00308: cannot open archived log