平台环境是:solaris操作系统
数据库是:oracle 10.2.0.4
故障现象:2015年10月份以来,一综合网管数据库频频报出ORA-31693、ORA-02354、ORA-02149错误,具体报错信息如下所示:
. . 导出了 "SSMS"."CHECKWHITELIST" 5.882 KB 1 行
ORA-31693: 表数据对象 "SSMS"."RES_SSMSKEYWCONF":"P_20121207000000" 无法加载/卸载并且被跳过, 错误如下:
ORA-02354: 导出/导入数据时出错
ORA-02149: 指定的分区不存在
. . 导出了 "SSMS"."RES_SSMSKEYWCONF":"P_20121208000000" 1.481 MB 17506 行
起初处理,用如下sql语句从数据库查询 "SSMS"."RES_SSMSKEYWCONF":"P_20121207000000"分区,没有查询到,可能相关的分区表的分区已经不存在了。于是,建议业务方手工执行分区表分区清理命令,更新数据库字典信息。
select table_owner,table_name,partition_name from DBA_TAB_PARTITIONS where table_name='RES_SSMSHOLDCONF' and partition_name='P_20121207000000';
处理完后,第二天相关这个分区表的expdp备份的报错果然没有了,但是又出现新的分区表分区备份报错,报错信息跟上述症状一致:
. . 导出了 "SSMS"."BLACK_WHITE_LOG":"P_20151014000000" 3.270 MB 36055 行
ORA-31693: 表数据对象 "SSMS"."RES_SSMSHOLDCONF":"P_20121207000000" 无法加载/卸载并且被跳过, 错误如下:
ORA-02354: 导出/导入数据时出错
ORA-02149: 指定的分区不存在
. . 导出了 "SSMS"."RES_SSMSHOLDCONF":"P_20121208000000" 3.178 MB 29845 行
. . 导出了 "SSMS"."RES_DIM_PROVINCE_DBLINK" 5.835 KB 16 行
ORA-31693: 表数据对象 "SSMS"."RES_SSMSWHITELIST":"P_20121207000000" 无法加载/卸载并且被跳过, 错误如下:
ORA-02354: 导出/导入数据时出错
ORA-02149: 指定的分区不存在
. . 导出了 "SSMS"."RES_SSMSWHITELIST":"P_20121208000000" 129.2 KB 1380 行
看来,这种expdp报错不是简单的数据库字典问题。我怀疑是数据库的expdp备份开始后到结束前,有人或者自动作业清理分区表分区,这样才能每天都报备份分区不存在,而且每天报的分区表分区不同;通过与业务、开发方沟通得知,他们确实有分区表分区清理job且开始时间是每天零点,另经查看数据库expdp备份计划也是每天零点开始,这符合自己的猜测。于是我按照自己的猜测思路进行了模拟实验,具体的步奏如下:
1、将上述报错涉及的一张分区表导出,观察备份过程,备份成功没有报错或异常,然后导入到实验环境
2、在实验环境进行备份
[oracle@orcl11204 ~]$ date
Fri Oct 30 17:46:52 CST 2015
[oracle@orcl11204 ~]$ expdp system/redhat5 tables=RES_SSMSWHITELIST SCHEMAS=ssms directory=TEXPDP dumpfile=exp_20`date +%y
y%m%d`1.dmp filesize=3000000000 logfile=exp_20`date +%y%m%d`.logdirectory=TEXPDP dumpfile=exp_20`date +%y%m%d`full=y directory=TEXPDP dumpfile=exp_20`date +%y%m%d`1.dmp filesize=cd
expdp system/redhat5 tables=ssms.RES_SSMSWHITELIST directory=TEXPDP dumpfile=exp_20`date +%y%m%d`1.
dmp filesize=3000000000 logfile=exp_20`date +%y%m%d`.log
Export: Release 11.2.0.4.0 - Production on Fri Oct 30 17:46:30 2015
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SYSTEM"."SYS_EXPORT_TABLE_01": system/******** tables=ssms.RES_SSMSWHITELIST directory=TEXPDP dumpfile=exp_201510301.dmp filesize=3000000000 logfile=exp_20151030.log
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 2.021 GB
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/INDEX/INDEX
Processing object type TABLE_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
. . exported "SSMS"."RES_SSMSWHITELIST":"P_20131001000000" 37.50 MB 357138 rows
. . exported "SSMS"."RES_SSMSWHITELIST":"P_20130901000000" 36.52 MB 347839 rows
. . exported "SSMS"."RES_SSMSWHITELIST":"P_20150401000000" 33.56 MB 297716 rows
. . exported "SSMS"."RES_SSMSWHITELIST":"P_20150801000000" 33.30 MB 295103 rows
. . exported "SSMS"."RES_SSMSWHITELIST":"P_20150901000000" 33.48 MB 296620 rows
.
.
.
3、另开窗口,进行分区表删除操作
[oracle@orcl11204 ~]$ date
Fri Oct 30 17:49:47 CST 2015
.
.
.
SQL> show user
USER is "SSMS"
SQL> alter table RES_SSMSWHITELIST drop partition P_20130826000000;
Table altered.
SQL> alter table RES_SSMSWHITELIST drop partition P_20150425000000;
Table altered.
SQL> alter table RES_SSMSWHITELIST drop partition P_20150116000000;
Table altered.
4、观察备份过程,果然出现了生产环境中的报错
. . exported "SSMS"."RES_SSMSWHITELIST":"P_20130824000000" 14.24 KB 60 rows
ORA-31693: Table data object "SSMS"."RES_SSMSWHITELIST":"P_20130826000000" failed to load/unload and is being skipped due to error:
ORA-02354: error in exporting/importing data
ORA-02149: Specified partition does not exist
. . exported "SSMS"."RES_SSMSWHITELIST":"P_20130827000000" 8.835 KB 9 rows
. . exported "SSMS"."RES_SSMSWHITELIST":"P_20150115000000" 21.14 KB 120 rows
ORA-31693: Table data object "SSMS"."RES_SSMSWHITELIST":"P_20150116000000" failed to load/unload and is being skipped due to error:
ORA-02354: error in exporting/importing data
ORA-02149: Specified partition does not exist
. . exported "SSMS"."RES_SSMSWHITELIST":"P_20150117000000" 11.27 KB 30 rows
. . exported "SSMS"."RES_SSMSWHITELIST":"P_20150423000000" 9.281 KB 12 rows
ORA-31693: Table data object "SSMS"."RES_SSMSWHITELIST":"P_20150425000000" failed to load/unload and is being skipped due to error:
ORA-02354: error in exporting/importing data
ORA-02149: Specified partition does not exist
. . exported "SSMS"."RES_SSMSWHITELIST":"P_20150426000000" 9.765 KB 17 rows
总结以上实验:oracle数据库expdp备份开始到结束期间,人为的或自动作业删除分区表分区,会导致分区表分区备份不到而报错,这不是数据库字典问题,也不是oracle自身的bug,是业务的分区表分区清理Job任务与数据库expdp备份作业之间的逻辑问题。
需要作出的调整是(二者选一)
1、在备份开始前进行分区表清理并清理结束
2、在备份结束后进行分区表清理