library cache lock和cursor: pin S wait on X等待

  1.现象:

客户10.2.0.4 RAC环境,出现大量的library cache lock和cursor: pin S wait on X等待,经分析是由于统计信息收集僵死导致的。
数据库在8点到9点期间,数据库两个节点都存在明显的cursor: pin S wait on X和library cache lock的等待:

  TOP 5 EVENT:

Event

Waits

Time(s)

Avg   Wait(ms)

%   Total Call Time

Wait   Class

cursor:   pin S wait on X

1,573,056

30,651

19

146.2

Concurrency

library   cache lock

31,757

7,009

221

33.4

Concurrency

CPU   time

 

6,416

 

30.6

 

DFS   lock handle

12,381

2,979

241

14.2

Other

latch:   library cache

1,646

1,974

1,199

9.4

Concurrency

Event

Waits

Time(s)

Avg   Wait(ms)

%   Total Call Time

Wait   Class

cursor:   pin S wait on X

17,586,451

342,437

19

812.1

Concurrency

library   cache lock

63,657

30,153

474

71.5

Concurrency

CPU   time

 

3,820

 

9.1

 

db   file sequential read

241,761

1,094

5

2.6

User   I/O

inactive   session

两个节点的等待现象基本一致,而节点1上还存在明显的DFS lock handle等待事件。
通过分析ASH信息,发现library cache lock和cursor: pin S wait on X等待基本上都是6点之后才开始出现:

 2.分析原因

SELECT trunc(sample_time, 'hh24') TIME, COUNT(*)
FROM WRH$_ACTIVE_SESSION_HISTORY ash, wrh$_event_name en
WHERE ash.event_id = en.event_id
AND sample_time>= to_timestamp('', 'yyyymmdd')
AND event_name IN ('cursor: pin S wait on X', 'library cache lock')
GROUP BY trunc(sample_time, 'hh24')
ORDER BY 1;

查询事件发生的情况

而观察出现cursor: pin S wait on X和library cache lock等待会话的BLOCKING SESSION,发现大部分处于等待的会话被会话37锁定:

SELECT to_char(sample_time, 'hh24:mi:ss.ff') TIME, session_id sid,
decode(en.event_name, 'cursor: pin S wait on X', to_number(substr(ltrim(to_char(p2, '0xxxxxxxxxxxxxxx')), 1, 8), 'xxxxxxxx'), blocking_session) b_sid,
en.event_name
FROM WRH$_ACTIVE_SESSION_HISTORY ash, wrh$_event_name en
WHERE ash.event_id = en.event_id(+)
AND instance_number = 1
AND sample_time >= to_timestamp('', 'yyyymmddhh24mi')
AND sample_time <= to_timestamp('', 'yyyymmddhh24mi')
ORDER BY 1;

查询堵塞事件

TIME                      SID      B_SID EVENT_NAME
------------------ ---------- ---------- ------------------------------
08:00:09.569 457 4294967291 latch: library cache
08:00:09.569 52 981 cursor: pin S wait ON X
08:00:09.569 168 73 library cache LOCK
08:00:09.569 1763 4294967291 SQL*Net message FROM client
08:00:09.569 123 4294967295 kksfbc child completion
08:00:09.569 476 4294967291 kst: async disk IO
08:00:09.569 500 4294967292 db file sequential READ
08:00:09.569 327 4294967291 SQL*Net message TO client
08:00:09.569 233 4294967291 db file scattered READ
08:00:09.569 73 73 DFS LOCK handle
08:00:09.569 1889 4294967291 gc CURRENT block 2-way
08:00:09.569 1969 981 cursor: pin S wait ON X
08:00:09.569 857 981 cursor: pin S wait ON X
08:00:09.569 812 168 cursor: pin S wait ON X
08:00:09.569 981 73 library cache LOCK
08:00:19.669 52 981 cursor: pin S wait ON X
08:00:19.669 812 168 cursor: pin S wait ON X
08:00:19.669 857 981 cursor: pin S wait ON X
08:00:19.669 1969 981 cursor: pin S wait ON X
08:00:19.669 233 4294967291 db file scattered READ
08:00:19.669 500 4294967292 db file sequential READ
08:00:19.669 1763 4294967291 db file sequential READ
08:00:19.669 476 4294967291 kst: async disk IO
08:00:19.669 457 4294967291 kksfbc child completion
08:00:19.669 73 73 DFS LOCK handle
08:00:19.669 654 4294967291 SQL*Net message TO client
08:00:19.669 168 73 library cache LOCK
08:00:19.669 981 73 library cache LOCK
08:00:29.769 1969 981 cursor: pin S wait ON X
08:00:29.769 52 981 cursor: pin S wait ON X
08:00:29.769 457 4294967291 kksfbc child completion
08:00:29.769 981 73 library cache LOCK
08:00:29.769 168 73 library cache LOCK
08:00:29.769 500 4294967292 db file sequential READ
08:00:29.769 73 73 DFS LOCK handle
08:00:29.769 654 4294967291 SQL*Net message TO client
08:00:29.769 327 4294967291 SQL*Net message TO client
08:00:29.769 1763 4294967291 db file sequential READ
08:00:29.769 857 981 cursor: pin S wait ON X
08:00:29.769 123 4294967295 latch: library cache
08:00:29.769 196 4294967295 latch: library cache
08:00:29.769 233 4294967295 latch: library cache
08:00:29.769 812 168 cursor: pin S wait ON X
08:00:29.769 476 4294967291 kst: async disk IO
08:00:39.879 476 4294967291 kst: async disk IO
08:00:39.879 658 4294967291 db file sequential READ
08:00:39.879 500 4294967292 db file sequential READ
08:00:39.879 1969 981 cursor: pin S wait ON X
08:00:39.879 857 981 cursor: pin S wait ON X
08:00:39.879 812 168 cursor: pin S wait ON X
08:00:39.879 52 981 cursor: pin S wait ON X
08:00:39.879 981 73 library cache LOCK
08:00:39.879 168 73 library cache LOCK
08:00:39.879 1849 4294967291 SQL*Net message TO client
08:00:39.879 327 4294967291 SQL*Net message TO client
08:00:39.879 235 4294967291 rdbms ipc reply
08:00:39.879 19 4294967291 rdbms ipc reply
08:00:39.879 73 73 DFS LOCK handle
08:00:39.879 123 4294967294 enq: WF - contention
08:00:49.987 476 4294967291 kst: async disk IO
08:00:49.987 500 4294967292 db file sequential READ
08:00:49.987 382 4294967291 gc cr block busy
08:00:49.987 1969 981 cursor: pin S wait ON X
08:00:49.987 857 981 cursor: pin S wait ON X
08:00:49.987 812 168 cursor: pin S wait ON X
08:00:49.987 235 4294967295 kksfbc child completion
08:00:49.987 658 4294967295 latch: library cache
08:00:49.987 19 4294967291 latch: library cache
08:00:49.987 981 73 library cache LOCK
08:00:49.987 168 73 library cache LOCK
08:00:49.987 1889 4294967291 SQL*Net message FROM client
08:00:49.987 73 73 DFS LOCK handle
08:00:49.987 52 981 cursor: pin S wait ON X

查询结果

可以判断,导致数据库实例HANG死的原因是因为SID为73的进程:

SELECT to_char(sample_time, 'hh24:mi:ss'), program, action, event_name
FROM wrh$_active_session_history ash, wrh$_event_name en
WHERE session_id = 73
AND ash.event_id = en.event_id(+)
AND sample_time >= to_date('', 'yyyymmddhh24mi')
AND sample_time <= to_date('', 'yyyymmddhh24mi')
ORDER BY 1;

查询该会话运行的程序

TO_CHAR( PROGRAM                                  ACTION                         EVENT_NAME
-------- ---------------------------------------- ------------------------------ ---------------------------
05:50:07 oracle@orcldb5 (J002) GATHER_STATS_JOB db file sequential READ
05:50:17 oracle@orcldb5 (J002) GATHER_STATS_JOB db file sequential READ
05:50:27 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:50:37 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:50:47 oracle@orcldb5 (J002) GATHER_STATS_JOB gc cr block 2-way
05:50:57 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:51:08 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:51:18 oracle@orcldb5 (J002) GATHER_STATS_JOB db file scattered READ
05:51:28 oracle@orcldb5 (J002) GATHER_STATS_JOB gc cr multi block request
05:51:38 oracle@orcldb5 (J002) GATHER_STATS_JOB direct path WRITE temp
05:51:48 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:51:58 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:52:08 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:52:18 oracle@orcldb5 (J002) GATHER_STATS_JOB db file scattered READ
05:52:28 oracle@orcldb5 (J002) GATHER_STATS_JOB db file sequential READ
05:52:38 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:52:49 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:52:59 oracle@orcldb5 (J002) GATHER_STATS_JOB db file scattered READ
05:53:09 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:53:19 oracle@orcldb5 (J002) GATHER_STATS_JOB db file scattered READ
05:53:29 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:53:39 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:53:49 oracle@orcldb5 (J002) GATHER_STATS_JOB gc CURRENT GRANT 2-way
05:53:59 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:54:09 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:54:20 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:54:30 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:54:40 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:54:50 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:55:00 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:55:10 oracle@orcldb5 (J002) GATHER_STATS_JOB db file sequential READ
05:55:20 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:55:30 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:55:40 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:55:51 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:56:01 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:56:11 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:56:21 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:56:31 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:56:41 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:56:51 oracle@orcldb5 (J002) GATHER_STATS_JOB gc CURRENT GRANT 2-way
05:57:01 oracle@orcldb5 (J002) GATHER_STATS_JOB library cache pin
05:57:11 oracle@orcldb5 (J002) GATHER_STATS_JOB DFS LOCK handle
05:57:21 oracle@orcldb5 (J002) GATHER_STATS_JOB DFS LOCK handle
05:57:32 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:57:42 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:57:52 oracle@orcldb5 (J002) GATHER_STATS_JOB DFS LOCK handle
05:58:02 oracle@orcldb5 (J002) GATHER_STATS_JOB db file sequential READ
05:58:12 oracle@orcldb5 (J002) GATHER_STATS_JOB db file sequential READ
05:58:22 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:58:32 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:58:42 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:58:52 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:59:03 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:59:13 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:59:23 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:59:33 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:59:43 oracle@orcldb5 (J002) GATHER_STATS_JOB db file sequential READ
05:59:53 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
06:00:03 oracle@orcldb5 (J002) GATHER_STATS_JOB DFS LOCK handle
06:00:13 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
06:00:23 oracle@orcldb5 (J002) GATHER_STATS_JOB DFS LOCK handle
06:00:34 oracle@orcldb5 (J002) GATHER_STATS_JOB DFS LOCK handle
06:00:44 oracle@orcldb5 (J002) GATHER_STATS_JOB DFS LOCK handle
06:00:54 oracle@orcldb5 (J002) GATHER_STATS_JOB

查询结果

  根据MOS文档,Bug 6011045 – DBMS_STATS causes deadlock between ‘cursor: pin S wait on X’ and ‘library cache lock’ [ID 6011045.8],这个问题是RAC环境中Oracle收集统计信息的BUG,在自动收集数据字典信息时,可能会出现进程HANG死,并导致大量的library cache lock和cursor: pin S wait on X等待,此外还可能出现DFS lock handle以及row cache lock的等待。
这个BUG确认影响版本包括10.2.0.4和10.2.0.5,这个BUG Oracle在10.2.0.5.5 PSU中进行了修复。

上一篇:深入理解shared pool共享池之library cache的library cache lock系列四


下一篇:rac数据库默认sql tuning advisor,导致大量library cache lock