从Oracle数据库官方服务支持生命周期表,我们可以清晰看到Oracle11g已过主支持生命周期,2020年后不再支持,取而代之的是12C及以上版本,大部分客户开始了新一轮的数据库升级工作,常见升级目标版本为Oracle19C。本文主要剖析一个升级后的隐藏“福利(BUG)”,希望对大家有所启发。
[分析过程]
某客户系统基于版本迭代要求,将核心数据库Oracle11.2.0.4升级至Oracle19.5版本,割接后的一天小哥刚刚结束一天工作回到家中,突然接收到一条(有且仅有一条)数据库无法链接的告警,出于职业的敏感性和核心库的重要性,小哥第一时间开启电脑接入环境、检查数据库运行情况,发现数据库实例进行了一次快速重启。
通过检查数据库日志,发现提示ORA-04031报错,相关信息如下:
---》alert.log 部分信息
--- alert.log 部分信息
2020-06-20T02:01:38.420323+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_lms3_99464_99479.trc (incident=5532401):
ORA-04031: unable to allocate 168 bytes of shared memory ("shared pool","unknown object","sga heap(3,0)","gcs dynamic shadows lms")
2020-06-20T02:01:38.443599+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_43189.trc (incident=5542961):
ORA-04031: unable to allocate 40 bytes of shared memory ("shared pool","SELECT * FROM (SELECT A.work...","SQLA","tmp")
2020-06-20T02:01:38.443611+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_114982.trc (incident=5546153):
ORA-04031: ... ... ("shared pool","select 1 from dual","SQLA","tmp")
2020-06-20T02:01:38.443607+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_162835.trc (incident=5536633):
ORA-04031: unable to allocate 40 bytes of shared memory ("shared pool","SELECT EXWORKSHTSN,TRIM(INST...","SQLA","tmp")
2020-06-20T02:01:38.443651+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_21132.trc (incident=5539833):
ORA-04031: unable to allocate 40 bytes of shared memory ("shared pool","select phone_no from (select...","SQLA","tmp")
向上滑动查看更多内容
进一步检查DUMP文件发现:
--》dump文件
--- dump文件
Analysis collection on ORA-04031 in 'orcl1_ora_90488_i5543897.trc'
=================== Begin Dump 1 ===================
Subpool 1: total 7918845952, top 5 components:
Allocation Name Size Percent
___________________________ ____________ ________
"SQLA " 2338161760 29.53
"KGLH0 " 2276830888 28.75
"KGLHD " 569601504 7.19
"gcs resources " 415678808 5.25
"free memory " 400990320 5.06
Subpool 2: total 7784628224, top 5 components:
Allocation Name Size Percent
___________________________ ____________ ________
"SQLA " 2369048496 30.43
"KGLH0 " 2284414544 29.35
"KGLHD " 572064856 7.35
"gcs resources " 416991392 5.36
"free memory " 407643816 5.24
Subpool 3: total 7650410496, top 5 components:
Allocation Name Size Percent
___________________________ ____________ ________
"gcs dynamic shadows lms " 4623872696 60.44
"free memory " 1226407536 16.03
"gcs resources " 415411008 5.43
"gcs shadows " 223176360 2.92
"kglsim object batch " 93756096 1.23
Subpool 4: total 7516192768, top 5 components:
Allocation Name Size Percent
___________________________ ____________ ________
"SQLA " 2313270192 30.78
"KGLH0 " 2261483192 30.09
"KGLHD " 565488808 7.52
"gcs resources " 415938480 5.53
"free memory " 384745336 5.12
Subpool 5: total 7650410496, top 5 components:
Allocation Name Size Percent
___________________________ ____________ ________
"SQLA " 2366846024 30.94
"KGLH0 " 2289281224 29.92
"KGLHD " 572632304 7.48
"gcs resources " 415937464 5.44
"free memory " 397985672 5.20
Subpool 6: total 7784628224, top 5 components:
Allocation Name Size Percent
___________________________ ____________ ________
"SQLA " 2324705160 29.86
"KGLH0 " 2282608664 29.32
"KGLHD " 572510928 7.35
"gcs resources " 415675760 5.34
"free memory " 394786560 5.07
Subpool 7: total 7381975040, top 5 components:
Allocation Name Size Percent
___________________________ ____________ ________
"SQLA " 2296580576 31.11
"KGLH0 " 2261142832 30.63
"KGLHD " 565777688 7.66
"gcs resources " 416201200 5.64
"free memory " 370771864 5.02
向上滑动查看更多内容
--》Trace File
Trace File
=========
Filename = orcl1_ora_36390_i5541393.trc
Version 19.5.0.0.0
Build label: RDBMS_19.3.0.0.0DBRU_LINUX.X64_190417
ORACLE_HOME: /u01/app/oracle/product/19.0.3/dbhome_1
System name: Linux
Node name:
Release: 3.10.0-957.el7.x86_64
Version: #1 SMP Thu Oct 4 20:48:51 UTC 2018
Machine: x86_64
Instance name: orcl1
Redo thread mounted by this instance: 1
Oracle process number: 1154
Unix process pid: 36390, image: oracle@
*** 2020-06-20T02:01:38.742076+08:00
*** SESSION ID:(8786.52180) 2020-06-20T02:01:38.742092+08:00
*** CLIENT ID:() 2020-06-20T02:01:38.742097+08:00
*** SERVICE NAME:(orcl) 2020-06-20T02:01:38.742102+08:00
*** MODULE NAME:(db_in_workmsg01333@xxxxx(TNS V1-V3)) 2020-06-20T02:01:38.742107+08:00
ORA-04031: unable to allocate 120 bytes of shared memory ("shared pool","INSERT INTO ...","SQLA^b8cce596","strdef: qcopCreateStr")
TOP 10 MEMORY USES FOR SGA HEAP SUB POOL 3
----------------------------------------------
"gcs dynamic shadows lms " 4410 MB 60%
"free memory " 1170 MB 16%
"gcs resources " 396 MB 5%
"gcs shadows " 213 MB 3%
"kglsim object batch " 89 MB 1%
"gcs resv res hash bucket " 75 MB 1%
"db_block_hash_buckets " 74 MB 1%
"KJSC rnb slots " 72 MB 1%
"ges resource dynamic " 69 MB 1%
"ges big msg buffers " 69 MB 1%
-----------------------------------------
free memory 1170 MB
memory alloc. 6126 MB
Sub total 7296 MB
==============================================
TOP 10 MAXIMUM MEMORY USES FOR SGA HEAP SUB POOL 3
----------------------------------------------
"gcs dynamic shadows lms " 4410 MB
"SQLA " 3048 MB
"KGLH0 " 2342 MB
"free memory " 1178 MB
"KGLHD " 553 MB
"gcs resources " 396 MB
"KGLS " 306 MB
"KGLDA " 238 MB
"gcs shadows " 213 MB
"KQR X PO " 117 MB
Comment
========
The allocation in sub pool 3 shows that same situation in file 'orcl1_ora_90488_i5543897.trc'.
向上滑动查看更多内容
初步发现“gcsdynamic shadows lms”组件使用率占比较高,疑似异常。
对比新老库参数设置
--- 新库 NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ _PX_use_large_pool boolean TRUE buffer_pool_keep string buffer_pool_recycle string java_pool_size big integer 1G large_pool_size big integer 10G memoptimize_pool_size big integer 0 olap_page_pool_size big integer 0 shared_pool_reserved_size big integer 2560M shared_pool_size big integer 50G streams_pool_size big integer 8G pga_aggregate_limit big integer 150G pga_aggregate_target big integer 75G --- 老库 NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ _PX_use_large_pool boolean TRUE buffer_pool_keep string buffer_pool_recycle string global_context_pool_size string java_pool_size big integer 1G large_pool_size big integer 5G olap_page_pool_size big integer 0 shared_pool_reserved_size big integer 768M shared_pool_size big integer 15G streams_pool_size big integer 1536M pga_aggregate_target big integer 25G |
发现在相同业务情况下新库参数值远大于老库参数,依旧提示ORA-04031,初步怀疑OracleBUG。
BUG排查:
通过查阅OracleMOS支持文档、疑似命中BUG:Bug30223374 - Memory Leak Due to "gcs dynamic shadows lms"(Doc ID 30223374.8),描述信息如下:
Bug30223374已被取代,随即查阅替代BUG:
检查当前数据库应用补丁应用情况:
2020-06-20T02:02:38.171124+08:00
===========================================================
Dumping current patch information
===========================================================
Patch Id: 30122149
Patch Description: OCW RELEASE UPDATE 19.5.0.0.0 (30122149)
Patch Id: 30125133
Patch Description: Database Release Update : 19.5.0.0.191015 (30125133)
(includes) ... 29452936 ...
Patch Id: 30128191
Patch Description: OJVM RELEASE UPDATE: 19.5.0.0.191015 (30128191)
Patch Id: 30396974
Patch Description: REBALANCE ABORTING WITH ORA-15424
Patch Id: 26724511 Patch Description: AUTO OPTIMIZER STATS RUN MULTIPLE JOBS DURING MAINTENANCE WINDOWS
Patch Id: 28589509 Patch Description: ORA 600 [KCL_SHRINK_ANTI_BG_1] AND INSTANCE CRASHED
Patch Id: 28681153 Patch Description: INTERNAL PDCDB ORA-00600 [QOSDEXPSTATREAD EXPCNT MISMATCH]
Patch Id: 28751498
Patch Id: 29182901
Patch Id: 29312889
Patch Id: 29717901
Patch Id: 29965888
Patch Id: 30186706
Patch Id: 30157766
Patch Id: 31190412
Patch Description: MERGE ON DATABASE RU 19.5.0.0.0 OF 28572407 30614411
向上滑动查看更多内容
修复方式:
当前BUG30223374已被BUG30318638取代,可通过如下几种方式来修复:
1. Apply one-off patch 30318638 on top of 19.5 or 2. Upgrade with 19.5.1.0 (Jan 2020) DB Release Update Revision(DB RUR) or 3. Upgrade with 19.6.0.0.200114 (Jan 2020) Database Release Update (DB RU) |
现场决定采取第1种方案进行修复,修复后无重启发生。
[总结]
随着软硬件的不断升级迭代,主机、数据库等重启时间也在不断缩短,作为一线运维小哥要时刻保持对生产的敬畏之心,不能忽略任何一次的告警提示,遇到故障时要及时汇总整理故障诊断信息,保持清晰处理思路,从而达到事半功倍的效果。