ORA-600 kfdpMetaBlk_pickle troubleshooting -- 惜分飞


Title: ORA-600 kfdpMetaBlk_pickle troubleshooting

Author: 惜分飞 © All rights reserved [May not be reproduced in any form without my consent; I reserve the right to pursue legal liability.]

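A customer reported that the cluster's CRS could not start. Investigation showed that the GMON process was crashing the ASM instance, and testing confirmed that the problem is triggered when mounting the DATA disk group.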

SQL> alter diskgroup data mount;

alter diskgroup data mount

*

ERROR at line 1:

ORA-03113: end-of-file on communication channel

Process ID: 7517

Session ID: 918 Serial number: 5

The corresponding alert log reports the error ORA-600 [kfdpMetaBlk_pickle:01], [4294967295]:

SQL> alter diskgroup data mount

NOTE: cache registered group DATA number=2 incarn=0x3078f05f

NOTE: cache began mount (first) of group DATA number=2 incarn=0x3078f05f

NOTE: Assigning number (2,1) to disk (/dev/rdisk/disk93)

NOTE: Assigning number (2,3) to disk (/dev/rdisk/disk96)

NOTE: Assigning number (2,2) to disk (/dev/rdisk/disk94)

NOTE: Assigning number (2,0) to disk (/dev/rdisk/disk92)

Sat Jul 17 05:21:01 2021

Errors in file /u01/app/crs_base/diag/asm/+asm/+ASM2/trace/+ASM2_gmon_7457.trc  (incident=255833):

ORA-00600: internal error code, arguments: [kfdpMetaBlk_pickle:01], [4294967295], [0], [], [], [], [], [], [], [], [], []

Incident details in: /u01/app/crs_base/diag/asm/+asm/+ASM2/incident/incdir_255833/+ASM2_gmon_7457_i255833.trc

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /u01/app/crs_base/diag/asm/+asm/+ASM2/trace/+ASM2_gmon_7457.trc:

ORA-00600: internal error code, arguments: [kfdpMetaBlk_pickle:01], [4294967295], [0], [], [], [], [], [], [], [], [], []

GMON (ospid: 7457): terminating the instance due to error 493

Sat Jul 17 05:21:03 2021

System state dump requested by (instance=2, osid=7457 (GMON)), summary=[abnormal instance termination].

System State dumped to trace file /u01/app/crs_base/diag/asm/+asm/+ASM2/trace/+ASM2_diag_7429.trc

Instance terminated by GMON, pid = 7457

A search of MOS (My Oracle Support) for ORA-600 [kfdpMetaBlk_pickle:01], [4294967295] turned up no useful information.
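With no MOS note to lean on, the error arguments themselves are the first clue. The following is a minimal Python sketch (not part of the original analysis; the helper names `ora600_args` and `as_signed32` are made up for illustration) that pulls the bracketed arguments out of the alert-log line and reinterprets the second one as a signed 32-bit integer. 4294967295 is 0xFFFFFFFF, that is, -1 in 32 bits. Reading it as an "invalid/unset version" marker is only an assumption, although it is consistent with the `ver=4294967295` and `ver=-1` values in the trace file quoted later in this post.

```python
import re

# Line copied from the alert log above.
ALERT_LINE = (
    "ORA-00600: internal error code, arguments: [kfdpMetaBlk_pickle:01], "
    "[4294967295], [0], [], [], [], [], [], [], [], [], []"
)


def ora600_args(line):
    """Return the non-empty bracketed arguments of an ORA-00600 line."""
    _, _, tail = line.partition("arguments:")
    return [arg for arg in re.findall(r"\[([^\]]*)\]", tail) if arg]


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


args = ora600_args(ALERT_LINE)
print(args)
print(hex(int(args[1])), as_signed32(int(args[1])))
```

The signed reading is only a hint about where to look; it does not by itself identify which metadata structure is damaged.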


The corresponding trace file contains the following:

2021-07-17 03:51:16.277603*:800002A2:KGF:kgfdputl.c@1411:kgfdpMetaSet_getMaxClique():   inc=2 ver=4294967295

2021-07-17 03:51:16.277619 :800002A3:KFDP:kfdp.c@9314:kfdpMetaSet_filterOld(): filtered old meta on disk 2

2021-07-17 03:51:16.277620 :800002A4:KFDP:kfdp.c@9314:kfdpMetaSet_filterOld(): filtered old meta on disk 2

2021-07-17 03:51:16.277992 :800002A5:KFDP:kfdp.c@9417:kfdpMetaSet_readDta():kfdpMetaSet_readDta unpickle upto 6 metablks

2021-07-17 03:51:16.277993 :800002A6:KFDP:kfdp.c@9425:kfdpMetaSet_readDta():kfdpMetaSet_readDta unpickle metablk for disk 3

2021-07-17 03:51:16.278154 :800002A7:KFDP:kfdp.c@9425:kfdpMetaSet_readDta():kfdpMetaSet_readDta unpickle metablk for disk 1

2021-07-17 03:51:16.278268 :800002A8:KFDP:kfdp.c@5851:kfdp_read(): kfdp_read end ok=1

2021-07-17 03:51:16.278277 :800002A9:KFDP:kfdp.c@7073:kfdp_doQuery(): kfdp_doQuery   rewrite_kfdp=1

2021-07-17 03:51:16.278282 :800002AA:KFDP:kfdp.c@12511:kfdpLckValue_pickle(): kfdpLckValue_pickle size=0

                            endian=0xff ndisks=0 lckvalid=0

2021-07-17 03:51:16.278293 :800002AB:db_trace:kfdp.c@12803:kfdpLck_convPriv(): [10499:19:396]

                            kfdpLck_conv: grp=1, type=0, mode=5, line=7155

2021-07-17 03:51:16.278294 :800002AC:KFDP:kfdp.c@12663:kfdpLckValue_unpickle(): kfdpLckValue_unpickle

                            size=28 res=0 ok=0 ver=-1 dcnt=0 lckvalid=0 flags=0x2 inst=0 (I am 2) version=0

2021-07-17 03:51:16.278499*:800002AD:KGF:kgfdputl.c@485:kgfdpDta_getAllDsks(): kgfdpDta_getAllDsks using

                            saved iterator 0x9ffffffffd571220 with 4 disks

2021-07-17 03:51:16.278688 :800002AE:KFDP:kfdp.c@5566:kfdp_write(): kfdp_write: pstDskCnt=3 grow=0 degenerate=0

2021-07-17 03:51:16.278688*:800002AF:KGF:kgfdputl.c@2619:kgfdpTraceSet(): writing pst to disks (n=3): 0 1 3

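From the information above, it is fairly certain that the PST information is abnormal: the PST records only three disks (0, 1 and 3) and treats disk 2 as stale, while the disk group actually has four disks, and this mismatch causes the GMON process to fail. The problem was resolved with a low-level fix, after which the database was recovered successfully.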
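The mismatch can be checked mechanically by comparing the disks assigned during the mount with the disks named in the PST write. Below is a minimal Python sketch under stated assumptions: the log lines are copied from the excerpts in this post, and the helpers `discovered_disks` and `pst_disks` are hypothetical names, not Oracle tooling.

```python
import re

# Disk assignments copied from the alert log excerpt.
ALERT_LOG = """\
NOTE: Assigning number (2,1) to disk (/dev/rdisk/disk93)
NOTE: Assigning number (2,3) to disk (/dev/rdisk/disk96)
NOTE: Assigning number (2,2) to disk (/dev/rdisk/disk94)
NOTE: Assigning number (2,0) to disk (/dev/rdisk/disk92)
"""

# PST write line copied from the GMON trace excerpt.
GMON_TRACE = "kgfdpTraceSet(): writing pst to disks (n=3): 0 1 3"


def discovered_disks(alert_log, group):
    """Map disk number -> device path for the given disk group number."""
    pattern = re.compile(r"Assigning number \((\d+),(\d+)\) to disk \(([^)]+)\)")
    return {
        int(disk): path
        for grp, disk, path in pattern.findall(alert_log)
        if int(grp) == group
    }


def pst_disks(trace):
    """Return the set of disk numbers listed in a 'writing pst to disks' line."""
    match = re.search(r"writing pst to disks \(n=\d+\):((?: \d+)+)", trace)
    return {int(n) for n in match.group(1).split()} if match else set()


found = discovered_disks(ALERT_LOG, group=2)
in_pst = pst_disks(GMON_TRACE)
missing = {n: found[n] for n in sorted(set(found) - in_pst)}
print(missing)  # {2: '/dev/rdisk/disk94'}
```

The result matches the `filtered old meta on disk 2` messages in the trace: disk 2 was discovered during the mount but is absent from the PST disk list.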

SQL> recover database using backup controlfile;

ORA-00279: change 30075814973 generated at 07/17/2021 01:12:08 needed for

thread 2

ORA-00289: suggestion : +FRA

ORA-00280: change 30075814973 for thread 2 is in sequence #120561

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

/tmp/asm/group_16

ORA-00279: change 30075814973 generated at 07/17/2021 01:11:54 needed for

thread 1

ORA-00289: suggestion :

+FRA/xff/archivelog/2021_07_17/thread_1_seq_79949.1543.1078103529

ORA-00280: change 30075814973 for thread 1 is in sequence #79949

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

/tmp/asm/group_13

ORA-00279: change 30075815013 generated at 07/17/2021 01:12:09 needed for

thread 1

ORA-00289: suggestion : +FRA

ORA-00280: change 30075815013 for thread 1 is in sequence #79950

ORA-00278: log file '/tmp/asm/group_13' no longer needed for this recovery

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

/tmp/asm/group_11

Log applied.

Media recovery complete.

SQL> alter database open resetlogs;

Database altered.
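During the recovery above, each ORA-00280 message names the thread and log sequence that the recovery asks for next. The following Python sketch, which is illustrative only (`requested_logs` is a hypothetical helper), lists those requests in order so they can be matched against the log files supplied at each prompt.

```python
import re

# ORA-00280 lines copied from the recovery session above.
RECOVERY_OUTPUT = """\
ORA-00280: change 30075814973 for thread 2 is in sequence #120561
ORA-00280: change 30075814973 for thread 1 is in sequence #79949
ORA-00280: change 30075815013 for thread 1 is in sequence #79950
"""


def requested_logs(output):
    """Return (thread, sequence, change) tuples in the order requested."""
    pattern = re.compile(
        r"ORA-00280: change (\d+) for thread (\d+) is in sequence #(\d+)"
    )
    return [
        (int(thread), int(seq), int(change))
        for change, thread, seq in pattern.findall(output)
    ]


for thread, seq, change in requested_logs(RECOVERY_OUTPUT):
    print(f"thread {thread} sequence {seq} (change {change})")
```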

Fortunately, the recovery from this failure was completed with zero data loss.
