oracle如何处理行锁

问题现象

##### 统一监控告警平台-DB: [触发]

> `告警集群:` jq-prod-db-prometheus
> `告警描述:` 数据库Oracle堵塞会话超过5分钟,会话ID:6055
> `告警环境:` prod
> `告警类型:` oracle_prod
> `告警实例:` nbadb02
> `告警IP:`   11.11.11.11
> `实例别名:`  nbadb02
> `当前数值:` 2
> `告警级别:` 严重
> `触发时间:` 2021-03-16 14:11:59
> `持续时间:` 10s
> `告警次数:` 1
> `告警应用:`nba_oracle
> `应用负责人:` 

处理过程

  • 登录11.11.11.11进入oracle,并设置列宽(如果使用plsql则不用设置)
[bzops@nbadb02 ~]$ sudo -i
[root@nbadb02 ~]# su - oracle
[oracle@nbadb02 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Tue Mar 2 15:05:36 2021

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> col sample_time for a30
col event for a30
col program for a30
col machine for a40
set lines 200 pages 999
  • 查看当前的活动事务,发现会话60551573是14:07就已经开始了,到现在还没有结束,
    同时last_call_et是0,说明会话还在不停的执行sql中,并没有空闲
    其中6055占用的undo已经达到了11M,说明事务比较大
SQL> select a.inst_id inst,A.sid,a.serial#,a.blocking_session blk,a.last_call_et call,a.status,a.sql_id,a.PLSQL_ENTRY_OBJECT_ID plsql,PREV_SQL_ID,b.start_date,a.event,b.USED_UBLK*8/1024 m,a.seconds_in_wait wait,a.osuser,a.program,a.logon_time,b.xid,a.machine
 from gv$session a,gv$transaction b where a.taddr=b.addr and b.addr is not null and a.inst_id=b.inst_id order by 5 desc;

    INST    SID SERIAL# BLK CALL    STATUS  SQL_ID  PLSQL   PREV_SQL_ID START_DATE  EVENT   M   WAIT    OSUSER  PROGRAM LOGON_TIME  XID MACHINE
1   2   1545    7417        0   ACTIVE  crwj8kgsbsghz       crwj8kgsbsghz   2021/3/16 14:14:59  SQL*Net message to client   0.046875    0   nbauser JDBC Thin Client    2021/3/16 14:11:16  58001500481BE003    nba-service-daemon-58f676dcf6-bk8rp
2   2   6059    34681       0   INACTIVE            2xuyhc81va9jk   2021/3/16 14:11:49  SQL*Net message from client 0.8984375   0   nbauser JDBC Thin Client    2021/3/16 14:03:07  F6000400896A0A01    nba-service-mq-6c654b54df-dzw62
3   2   6055    46555       0   INACTIVE            dxupr3kq3k911   2021/3/16 14:07:10  SQL*Net message from client 11.4609375  0   nbauser JDBC Thin Client    2021/3/16 14:06:26  3F001C004161E604    nba-service-web-64cd6dcc69-w45f2
4   2   1573    24181       0   INACTIVE            2uy2z5g8ky2ab   2021/3/16 14:07:01  SQL*Net message from client 1.578125    0   nbauser JDBC Thin Client    2021/3/16 13:07:09  D3000D00D01C6801    nba-service-mq-6c654b54df-pqgp6
  • 查看会话历史,可以看到很多会话都被6055阻塞了,印证了上面查的6055的事务比较大
SQL> select sample_time,session_id as sid,sql_id,event,plsql_entry_object_id as plsql,sql_plan_hash_value as plan,sql_plan_line_id as line,sql_exec_id as execid,blocking_session blk,program as program,xid,user_id ,machine from gv$active_session_history 
where blocking_session is not null ;

    SAMPLE_TIME SID SQL_ID  EVENT   PLSQL   PLAN    LINE    EXECID  BLK PROGRAM XID USER_ID MACHINE
1   16-MAR-21 02.14.26.440 PM   1731    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49065086    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-w45f2
2   16-MAR-21 02.14.26.440 PM   3742    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064614    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-dtkz6
3   16-MAR-21 02.14.26.440 PM   1773    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064803    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-dtkz6
4   16-MAR-21 02.14.26.440 PM   6108    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064714    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-64v4t
5   16-MAR-21 02.14.26.440 PM   7261    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49065133    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-5h6zf
6   16-MAR-21 02.14.26.440 PM   9541    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064892    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-wcctn
7   16-MAR-21 02.14.26.440 PM   10690   08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064975    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-64v4t
8   16-MAR-21 02.14.26.440 PM   11724   08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49065049    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-76hs8
9   16-MAR-21 02.14.25.440 PM   1773    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064803    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-dtkz6
10  16-MAR-21 02.14.25.440 PM   1731    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49065086    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-w45f2
11  16-MAR-21 02.14.25.440 PM   3742    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064614    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-dtkz6
12  16-MAR-21 02.14.25.440 PM   6108    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064714    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-64v4t
13  16-MAR-21 02.14.25.440 PM   7261    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49065133    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-5h6zf
14  16-MAR-21 02.14.25.440 PM   9541    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064892    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-wcctn
15  16-MAR-21 02.14.25.440 PM   10690   08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064975    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-64v4t
16  16-MAR-21 02.14.25.440 PM   11724   08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49065049    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-76hs8
17  16-MAR-21 02.14.24.430 PM   9541    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064892    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-wcctn
18  16-MAR-21 02.14.24.430 PM   11724   08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49065049    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-76hs8
19  16-MAR-21 02.14.24.430 PM   10690   08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064975    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-64v4t
20  16-MAR-21 02.14.24.430 PM   7261    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49065133    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-5h6zf
21  16-MAR-21 02.14.24.430 PM   6108    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064714    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-64v4t
22  16-MAR-21 02.14.24.430 PM   3742    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064614    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-dtkz6
23  16-MAR-21 02.14.24.430 PM   1773    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064803    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-dtkz6
24  16-MAR-21 02.14.24.430 PM   1731    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49065086    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-w45f2
25  16-MAR-21 02.14.23.430 PM   10690   08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064975    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-64v4t
26  16-MAR-21 02.14.23.430 PM   1731    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49065086    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-w45f2
27  16-MAR-21 02.14.23.430 PM   11724   08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49065049    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-76hs8
28  16-MAR-21 02.14.23.430 PM   1773    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064803    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-dtkz6
29  16-MAR-21 02.14.23.430 PM   3742    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064614    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-dtkz6
30  16-MAR-21 02.14.23.430 PM   6108    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064714    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-64v4t
31  16-MAR-21 02.14.23.430 PM   7261    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49065133    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-5h6zf
32  16-MAR-21 02.14.23.430 PM   9541    08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064892    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-wcctn
33  16-MAR-21 02.14.22.400 PM   10690   08qkr18cnk9rk   enq: TX - row lock contention       585082728   1   49064975    6055    JDBC Thin Client        101 nba-service-web-64cd6dcc69-64v4t
................
  • 既然知道了阻塞者是6055,同时也不仅6055一个会话到现在没结束,会话也不是空闲中,
    接下来只能找出6055正在跑的sql发给开发评估这个事务是不是正常
    oracle中可以通过sql_trace来记录会话一段时间内都跑的什么sql
    (如果sql执行时间大于一秒,其实通过活动会话历史v$active_session_history也可以定位,但是6055last_call_et是0,说明执行的sql都是小于一秒的)
SQL> exec dbms_system.set_sql_trace_in_session(6055,46555,true);

PL/SQL procedure successfully completed.
间隔1分钟
SQL>  exec dbms_system.set_sql_trace_in_session(6055,46555,false);

PL/SQL procedure successfully completed.
  • 收集好sql_trace后,进入oracle的trace目录,按照时间排序可以看到最新的ora_开头的日志
-rw-r----- 1 oracle asmadmin  74K Mar 16 14:15 nba2_ora_149894.trm
-rw-r----- 1 oracle asmadmin 2.3M Mar 16 14:15 nba2_ora_149894.trc
  • 通过tkprof工具格式化一下
[oracle@nbadb02 trace]$ tkprof nba2_ora_149894.trc
output = bk.log

TKPROF: Release 11.2.0.4.0 - Development on Tue Mar 16 14:16:08 2021

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

[oracle@nbadb02 trace]$ sz bk.log
  • 过滤掉bk.log中的sys用户的递归sql,可以看到6055执行的sql逻辑
SQL ID: dzns8cts4u1a4 Plan Hash: 383260619

update tab1 set col1=:1 , col2=:2 , col3=
  :3 , col4=:4 , col5=:5 , col6=:6 , col7=:7 , col8=:8 , 
  col9=:9 , col10=:10 , col11=:11 , col12=:12 , col13=:13  
where
 ID=:14  and VERSION=:15 

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse     1878      0.03       0.05          0          0          0           0
Execute   1878      0.28       0.46        202       9475      17465        1878
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     3756      0.31       0.51        202       9475      17465        1878
....
SQL ID: dxupr3kq3k911 Plan Hash: 743517995

select sysinterfa0_.ID as ID400_, sysinterfa0_.BRAND_ID as BRAND2_400_, 
  sysinterfa0_.CONTENT as CONTENT400_, sysinterfa0_.CREATE_TIME as 
  CREATE4_400_, sysinterfa0_.ERROR_COUNT as ERROR5_400_, sysinterfa0_.OP_TIME 
  as OP6_400_, sysinterfa0_.REMARK as REMARK400_, sysinterfa0_.SHOP_ID as 
  SHOP8_400_, sysinterfa0_.SLIP_CODE as SLIP9_400_, sysinterfa0_.SOURCE_SYS 
  as SOURCE10_400_, sysinterfa0_.STATUS as STATUS400_, 
  sysinterfa0_.TARGET_SYS as TARGET12_400_, sysinterfa0_.TYPE as TYPE400_, 
  sysinterfa0_.VERSION as VERSION400_ 
from
 tab2 sysinterfa0_ where sysinterfa0_.SLIP_CODE=:1  and 
  sysinterfa0_.TYPE=:2  and sysinterfa0_.SOURCE_SYS=:3  and 
  sysinterfa0_.TARGET_SYS=:4  and sysinterfa0_.BRAND_ID=:5 

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse     1913      0.03       0.04          0          0          0           0
Execute   1913      0.02       0.03          0          0          0           0
Fetch     1913      0.04       0.09         55       9854          0        1878
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     5739      0.10       0.17         55       9854          0        1878
....
                        select shop.id
                            from tab1 sku,
                                 tab2 pro,
                                 tab3 br,
                                 tab4 shop
                            where sku.product_id = pro.id
                                 and  pro.brand_id =  br.brand_id
                                 and  br.shop_id =  shop.id
                                 and  sku.code = :1 
                                 and  shop.IS_SYNC_BMS = 1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse      382      0.00       0.00          0          0          0           0
Execute    382      0.00       0.00          0          0          0           0
Fetch      382      0.11       0.14          1      15280          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1146      0.12       0.16          1      15280          0           0
.....

解决办法

  • 把以上sql发给开发分析后,反馈这个事务逻辑有优化空间,现在在做批量库存的操作,有阻塞可以忽略,未来会优化逻辑
上一篇:R语言中的SOM(自组织映射神经网络)对NBA球员聚类分析


下一篇:【开发】【调试】【whistle】