原文博客链接地址:数据库open报错ORA-01555: snapshot too old
今天正在东莞蜜月的时候,一个学生说他管理的测试库出问题了,无法open,我们先来看看是什么问题:
Recovery of Online Redo Log: Thread 1 Group 4 Seq 4 Reading mem 0 Mem# 0: /onlinelog/shr/redo04.log Completed redo application of 0.00MB Completed crash recovery at Thread 1: logseq 4, block 3, scn 7755957 0 data blocks read, 0 data blocks written, 0 redo k-bytes read Thread 1 advanced to log sequence 5 (thread open) Thread 1 opened at log sequence 5 Current log# 5 seq# 5 mem# 0: /onlinelog/shr/redo05.log Successful open of redo thread 1 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Thu Jun 19 13:31:35 2014 SMON: enabling cache recovery ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0000.007658ba): select ctime, mtime, stime from obj$ where obj# = :1 Errors in file /oracle/diag/rdbms/shr/shr/trace/shr_ora_5262.trc: ORA-00704: bootstrap process failure ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_1263032392$" too small Errors in file /oracle/diag/rdbms/shr/shr/trace/shr_ora_5262.trc: ORA-00704: bootstrap process failure ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_1263032392$" too small Error 704 happened during db open, shutting down database USER (ospid: 5262): terminating the instance due to error 704 Instance terminated by USER, pid = 5262 ORA-1092 signalled during: ALTER DATABASE OPEN... opiodr aborting process unknown ospid (5262) as a result of ORA-1092 Thu Jun 19 13:31:37 2014 ORA-1092 : opitsk aborting process |
从上面的错误来看,该数据库之所以open失败,是由于Oracle在bootstrap阶段执行递归SQL时出现ora-01555错误,
这样bootstrap过程无法继续下去,也就导致数据库无法open。我们可以看到报错的SQL语句如下:
select ctime, mtime, stime from obj$ where obj# = :1
这是很熟悉的一个SQL,通过10046 trace跟踪Oracle open的过程你会发现该SQL。
针对该错误,或许有人以为是回滚段的问题,实际上并不是,这种情况下推进下SCN 就可以很顺利的把数据库open。
但是这里有个问题:该兄弟的数据库是Oracle 11.2.0.4,已经不支持传统的10015 event的方式了。
下面我们通过oradebug 来解决该问题:
SQL> conn /as sysdba Connected to an idle instance. SQL> startup mount ORACLE instance started. Total System Global Area 4275781632 bytes Fixed Size 2260088 bytes Variable Size 989856648 bytes Database Buffers 3271557120 bytes Redo Buffers 12107776 bytes Database mounted. SQL> SQL> oradebug poke 0x06001AE70 4 0x859AFA ORA-00074: no process has been specified SQL> oradebug setmypid Statement processed. SQL> oradebug DUMPvar SGA kcsgscn_ kcslf kcsgscn_ [06001AE70, 06001AEA0) = 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 6001AB50 00000000 SQL> oradebug poke 0x06001AE70 4 0x859AFA BEFORE: [06001AE70, 06001AE74) = 00000000 AFTER: [06001AE70, 06001AE74) = 00859AFA SQL> alter database open; Database altered. SQL> |
这里简单解释一下,4 为长度,0x859AFA是16进制,我在原来的v$datafile_header.checkpoint_change#的基础之上
加上上1000000得到该值。
我们可以看到,顺利打开了数据库。最后再出观察下alert log发现居然有ora-00600 4194错误。
Thu Jun 19 14:48:43 2014 Dumping diagnostic data in directory=[cdmp_20140619144843], requested by (instance=1, osid=9140 (MMON)), summary=[incident=132122]. Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x3D6C3836] [PC:0x97F4DF6, kgegpa()+40] [flags: 0x0, count: 1] Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x3D6C3836] [PC:0x97F386A, kgebse()+776] [flags: 0x2, count: 2] Thu Jun 19 14:48:46 2014 Flush retried for xcb 0x159c668c8, pmd 0x15870d270 Errors in file /oracle/diag/rdbms/shr/shr/trace/shr_pmon_9112.trc (incident=132017): ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], [] Incident details in: /oracle/diag/rdbms/shr/shr/incident/incdir_132017/shr_pmon_9112_i132017.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Thu Jun 19 14:48:47 2014 Errors in file /oracle/diag/rdbms/shr/shr/trace/shr_ora_9268.trc (incident=132209): ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], [] ORA-00001: unique constraint (SYSTEM.UNQ_PAIRS) violated Incident details in: /oracle/diag/rdbms/shr/shr/incident/incdir_132209/shr_ora_9268_i132209.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Errors in file /oracle/diag/rdbms/shr/shr/trace/shr_pmon_9112.trc: ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], [] PMON (ospid: 9112): terminating the instance due to error 472 System state dump requested by (instance=1, osid=9112 (PMON)), summary=[abnormal instance termination]. System State dumped to trace file /oracle/diag/rdbms/shr/shr/trace/shr_diag_9122_20140619144848.trc Dumping diagnostic data in directory=[cdmp_20140619144848], requested by (instance=1, osid=9112 (PMON)), summary=[abnormal instance termination]. Instance terminated by PMON, pid = 9112 |
这是一个非常常见的错误了,最简单的方式就是通过重建undo或处理回滚段来解决,这里我们用最简单的方式:
修改undo_management=manual undo_tablespace=‘system‘ SQL> l 1* create pfile=‘/tmp/1.ora‘ from spfile SQL> startup mount pfile=‘/tmp/1.ora‘; ORACLE instance started. Total System Global Area 4275781632 bytes Fixed Size 2260088 bytes Variable Size 989856648 bytes Database Buffers 3271557120 bytes Redo Buffers 12107776 bytes Database mounted. SQL> alter database open; Database altered. SQL> select name from v$datafile; NAME -------------------------------------------------------------------------------- /oracle/oradata/shr/system01.dbf /oracle/oradata/shr/sysaux01.dbf /oracle/oradata/shr/undotbs01.dbf /oracle/oradata/shr/users01.dbf /oradata/shr/jy_shr01.dbf /oradata/shr/jy_shr02.dbf /oradata/shr/jy_shr03.dbf /oradata/shr/eas_d_stand01.dbf /oradata/shr/eas_d_stand02.dbf /oradata/shr/eas_d_stand03.dbf /backup/eas/eas_d_stand04.dbf NAME -------------------------------------------------------------------------------- /backup/eas/eas_d_stand05.dbf /backup/eas/eas_d_stand06.dbf 13 rows selected. SQL> show parameter undo NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ undo_management string MANUAL undo_retention integer 900 undo_tablespace string SYSTEM SQL> create undo tablespace undotbs2 datafile ‘/oracle/oradata/shr/undotbs2_01.dbf‘ size 200m; Tablespace created. SQL> shutdown immediate |
重建undo之后,再停库,修改undo参数即可顺利打开数据库,如下:
SQL> create spfile from pfile=‘/tmp/1.ora‘; File created. SQL> startup mount ORACLE instance started. Total System Global Area 4275781632 bytes Fixed Size 2260088 bytes Variable Size 989856648 bytes Database Buffers 3271557120 bytes Redo Buffers 12107776 bytes Database mounted. SQL> alter database open; Database altered. SQL> |
处理完成之后alert log不再抛出任何错误。