环境:Solaris 10 + Oracle 11.2.0.1
现象:alert告警日志定期出现ORA-600 [13011]错误
1.故障现象
数据库alert 日志:
Fri Jul 13 02:00:00 2018
Clearing Resource Manager plan via parameter
Fri Jul 13 02:00:46 2018
Errors in file /u01/app/oracle/diag/rdbms/prod/prod/trace/prod_j000_1757.trc (incident=42249):
ORA-00600: internal error code, arguments: [13011], [67896], [8421830], [44], [8421830], [17], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/prod/prod/incident/incdir_42249/prod_j000_1757_i42249.trc
Fri Jul 13 02:00:49 2018
Trace dumping is performing id=[cdmp_20180713020049]
Fri Jul 13 02:00:49 2018
Sweep [inc][42249]: completed
Sweep [inc2][42249]: completed
Fri Jul 13 03:00:55 2018
Errors in file /u01/app/oracle/diag/rdbms/prod/prod/trace/prod_j000_2053.trc (incident=42250):
ORA-00600: internal error code, arguments: [13011], [67896], [8421830], [44], [8447731], [0], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/prod/prod/incident/incdir_42250/prod_j000_2053_i42250.trc
Fri Jul 13 03:00:56 2018
Trace dumping is performing id=[cdmp_20180713030056]
Fri Jul 13 03:01:48 2018
Sweep [inc][42250]: completed
Sweep [inc2][42250]: completed
Fri Jul 13 04:00:03 2018
Errors in file /u01/app/oracle/diag/rdbms/prod/prod/trace/prod_j000_2338.trc (incident=42251):
ORA-00600: internal error code, arguments: [13011], [67896], [8421830], [44], [8421830], [0], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/prod/prod/incident/incdir_42251/prod_j000_2338_i42251.trc
Fri Jul 13 04:00:05 2018
Trace dumping is performing id=[cdmp_20180713040005]
Fri Jul 13 04:00:05 2018
Sweep [inc][42251]: completed
Sweep [inc2][42251]: completed
省略部分相似输出..
Fri Jul 13 09:00:50 2018
Errors in file /u01/app/oracle/diag/rdbms/prod/prod/trace/prod_j000_3816.trc (incident=42273):
ORA-00600: internal error code, arguments: [13011], [67896], [8421830], [44], [8421827], [0], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/prod/prod/incident/incdir_42273/prod_j000_3816_i42273.trc
Fri Jul 13 09:00:53 2018
Trace dumping is performing id=[cdmp_20180713090053]
Fri Jul 13 09:01:39 2018
Sweep [inc][42273]: completed
2.初步分析
从/u01/app/oracle/diag/rdbms/prod/prod/trace/prod_j000_1757.trc文件中没有过多信息:
Incident 42249 created, dump file: /u01/app/oracle/diag/rdbms/prod/prod/incident/incdir_42249/prod_j000_1757_i42249.trc
ORA-00600: internal error code, arguments: [13011], [67896], [8421830], [44], [8421830], [17], [], [], [], [], [], []
从/u01/app/oracle/diag/rdbms/prod/prod/incident/incdir_42249/prod_j000_1757_i42249.trc文件中可以看到Current SQL:
Dump continued from file: /u01/app/oracle/diag/rdbms/prod/prod/trace/prod_j000_1757.trc
ORA-00600: internal error code, arguments: [13011], [67896], [8421830], [44], [8421830], [17], [], [], [], [], [], []
========= Dump for incident 42249 (ORA 600 [13011]) ========
*** 2018-07-13 02:00:46.871
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=11p815z8hkfms) -----
DELETE MGMT_SYSTEM_PERFORMANCE_LOG WHERE TIME < :B2 AND ROWNUM <= :B1
3.匹配MOS
搜索MOS可以找到以下两篇文档:
- ORA-00600 [13011] (文档 ID 1547827.1)
- ORA-600 [13011] "Problem occurred when trying to delete a row" (文档 ID 28184.1)
从文档 ID 1547827.1中可以匹配到现象Current SQL一致:
DELETE MGMT_SYSTEM_PERFORMANCE_LOG WHERE TIME < :B2 AND ROWNUM <= :B1
MOS中给出的解决方案,
SOLUTION
1)As a possible solution, please apply patch set 11.2.0.3
- OR -
2)Drop and recreate the index(es) on SYSMAN.MGMT_SYSTEM_PERFORMANCE_LOG.
To drop and recreate the indexes:A. Get the DDL for recreating the indexes:
set heading off
set echo off
set flush off
set pagesize 9999
set linesize 9999
set long 9999
SQL> select dbms_metadata.get_ddl('INDEX', 'MGMT_SYSTEM_PERF_LOG_IDX_01', 'SYSMAN') from dual;
SQL> select dbms_metadata.get_ddl('INDEX', 'MGMT_SYSTEM_PERF_LOG_IDX_02', 'SYSMAN') from dual;B. Drop the index:
drop index sysman.MGMT_SYSTEM_PERF_LOG_IDX_01;
drop index sysman.MGMT_SYSTEM_PERF_LOG_IDX_02;C. Recreate the indexes using the output from step 2A.
3)Run validate against the table and its indexes once more:
SQL> analyze table sysman.MGMT_SYSTEM_PERFORMANCE_LOG validate structure cascade;
4.定位解决
这里选择MOS中第二种解决方案重建MGMT_SYSTEM_PERFORMANCE_LOG索引:
--4.1 查看MGMT_SYSTEM_PERFORMANCE_LOG的索引
SQL> select index_name from dba_indexes where table_name = 'MGMT_SYSTEM_PERFORMANCE_LOG';
MGMT_SYSTEM_PERF_LOG_IDX_01
MGMT_SYSTEM_PERF_LOG_IDX_02
--4.2 获取MGMT_SYSTEM_PERFORMANCE_LOG的2个索引的DDL语句
SQL> select dbms_metadata.get_ddl('INDEX', 'MGMT_SYSTEM_PERF_LOG_IDX_01', 'SYSMAN') from dual;
SQL> select dbms_metadata.get_ddl('INDEX', 'MGMT_SYSTEM_PERF_LOG_IDX_02', 'SYSMAN') from dual;
--4.3 删除之前的2个索引
SQL> drop index sysman.MGMT_SYSTEM_PERF_LOG_IDX_01;
SQL> drop index sysman.MGMT_SYSTEM_PERF_LOG_IDX_02;
--4.4 重新创建2个索引
SQL> CREATE INDEX "SYSMAN"."MGMT_SYSTEM_PERF_LOG_IDX_01" ON "SYSMAN"."MGMT_SYSTEM_PERFORMANCE_LOG" ("JOB_NAME", "TIME", "NAME")
PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS COMPRESS 3
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "SYSAUX";
SQL> CREATE INDEX "SYSMAN"."MGMT_SYSTEM_PERF_LOG_IDX_02" ON "SYSMAN"."MGMT_SYSTEM_PERFORMANCE_LOG" ("TIME")
PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "SYSAUX";
--4.5 重新分析表和索引
SQL> analyze table sysman.MGMT_SYSTEM_PERFORMANCE_LOG validate structure cascade;
Table analyzed.
至此,完成索引重建,后续定期观察alert日志。