APPLIES TO:
Oracle Database - Enterprise Edition - Version 11.2.0.4 to 11.2.0.4 [Release 11.2]
Information in this document applies to any platform.
SYMPTOMS
In this case, the issue is observed in three different RAC databases, each on 11.2.0.4, that were experiencing excessive PRTMV memory allocation during weekly partition maintenance operations. The PRTMV memory allocated by the instances reached 100G in some cases:
Example:SQL> select name, bytes from v$sgastat where pool = 'shared pool' and (bytes > 999999 or name = 'free memory') order by bytes desc;
NAME BYTES
------------------------------ ---------------------------
PRTMV 120913780976 <----112G
free memory 22767772296
gcs resources 4333239880
gcs shadows 2999935320
SQLA 2451326936
This leds to ORA-4031 errors, the instance being unresponsive, and the need to restart the instance to finish the maintenance jobs.
CAUSE
Each partition maintenance operation allocates PRTMV memory, and each cursor referencing the object also allocates PRTMV memory.
DDLs on the partitioned objects invalidates the existing cursors and new cursors are created, and old versions may not be able to be purged if there are still references to them.
Thus the memory grows and grows until ORA-04031 starts to be seen due to the excessive allocation in PRTMV memory structure.
SOLUTION
The following actions were taken over the course of troubleshooting the issue, and the ORA-4031 errors were resolved with all below steps in place:
1) The shared pool size was increased by 20%.
2) Durations were eliminated (_enable_shared_pool_durations=false)
3) Granule size was set to 32M
4) Patches for the following bugs were applied:
Bug 19461270 - ORA-4031 EXECUTING TRUNCATE PARTITION ON INTERVAL PARTITIONED TABLES
Bug 18953287 - UNKEEP USER-KEPT CURSOR WHEN INVALIDATED
Bug 19614585 - QUERY REPORTED ORA 600 [KKSGAGETNOALLOC_INT0] ON ADG AFTER SWITCHOVER
Bug 20754583 - 'LIBRARY CACHE: MUTEX X' HIT AT RUNNING DBMS_STATS, ISSUE OBSERVED ON OTHER INST. Note! This fix has been superseded by the fix in Bug:28891741. Instead of applying patch 20754583, apply patch 28891741.
Bug 20635353 - ORA-4031 WITH PRTMV HEAP AFTER ALTERING PARTITIONED TABLE
Bug 19689979 - ORA-8103 RUNNING SELECT ON PARTITION TABLE ON RAC NODE
5) Event 20635353 was set to level 5 (needs 20635353 and represents the number of memory heap no-wait latch requests made, with '5' being the maximum):
To set the event:
alter system set event = '20635353 level 3' scope=spfile;
and restart the instance.
6) User activity was minimized during maintenance
7) Partition maintenance operations were reduced by:
a) Not issuing a separate 'drop subpartition' statement for each subpartition in a partition, and instead mass dropping all of the subpartitions in a partition by issuing one 'drop partition' statement
b) Not truncating a partition or subpartition before dropping it
8) Spreading the maintenance operations over different sessions so that there was a better chance of the PRTMV memory mapping to different subpools instead of concentrating in one subpool
Ultimately step 7 was probably most effective - there is a direct correlation between the amount of PRTMV memory allocated and the number of DDL operations executed on the maintained objects. However, all of the steps above were helpful.
我当时就调整了下share pool得大小,
然后每次加2000多个的分区,然后在刷新一下share pool,释放PRTMV占用的内存。