oracle如何处理行锁
16lz
2021-03-16
问题现象
##### 统一监控告警平台-DB: [触发]> `告警集群:` jq-prod-db-prometheus> `告警描述:` 数据库Oracle堵塞会话超过5分钟,会话ID:6055> `告警环境:` prod> `告警类型:` oracle_prod> `告警实例:` nbadb02> `告警IP:` 11.11.11.11> `实例别名:` nbadb02> `当前数值:` 2> `告警级别:` 严重> `触发时间:` 2021-03-16 14:11:59> `持续时间:` 10s> `告警次数:` 1> `告警应用:`nba_oracle> `应用负责人:`
处理过程
- 登录
11.11.11.11
进入oracle,并设置列宽(如果使用plsql则不用设置)
[bzops@nbadb02 ~]$ sudo -i[root@nbadb02 ~]# su - oracle[oracle@nbadb02 ~]$ sqlplus / as sysdbaSQL*Plus: Release 11.2.0.4.0 Production on Tue Mar 2 15:05:36 2021Copyright (c) 1982, 2013, Oracle. All rights reserved.Connected to:Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit ProductionWith the Partitioning, OLAP, Data Mining and Real Application Testing optionsSQL>SQL> col sample_time for a30col event for a30col program for a30col machine for a40set lines 200 pages 999
- 查看当前的活动事务,发现会话
6055
和1573
是14:07就已经开始了,到现在还没有结束,
同时last_call_et
是0,说明会话还在不停的执行sql中,并没有空闲
其中6055
占用的undo已经达到了11M,说明事务比较大
SQL> select a.inst_id inst,A.sid,a.serial#,a.blocking_session blk,a.last_call_et call,a.status,a.sql_id,a.PLSQL_ENTRY_OBJECT_ID plsql,PREV_SQL_ID,b.start_date,a.event,b.USED_UBLK*8/1024 m,a.seconds_in_wait wait,a.osuser,a.program,a.logon_time,b.xid,a.machine from gv$session a,gv$transaction b where a.taddr=b.addr and b.addr is not null and a.inst_id=b.inst_id order by 5 desc; INST SID SERIAL# BLK CALL STATUS SQL_ID PLSQL PREV_SQL_ID START_DATE EVENT M WAIT OSUSER PROGRAM LOGON_TIME XID MACHINE1 2 1545 7417 0 ACTIVE crwj8kgsbsghz crwj8kgsbsghz 2021/3/16 14:14:59 SQL*Net message to client 0.046875 0 nbauser JDBC Thin Client 2021/3/16 14:11:16 58001500481BE003 nba-service-daemon-58f676dcf6-bk8rp2 2 6059 34681 0 INACTIVE 2xuyhc81va9jk 2021/3/16 14:11:49 SQL*Net message from client 0.8984375 0 nbauser JDBC Thin Client 2021/3/16 14:03:07 F6000400896A0A01 nba-service-mq-6c654b54df-dzw623 2 6055 46555 0 INACTIVE dxupr3kq3k911 2021/3/16 14:07:10 SQL*Net message from client 11.4609375 0 nbauser JDBC Thin Client 2021/3/16 14:06:26 3F001C004161E604 nba-service-web-64cd6dcc69-w45f24 2 1573 24181 0 INACTIVE 2uy2z5g8ky2ab 2021/3/16 14:07:01 SQL*Net message from client 1.578125 0 nbauser JDBC Thin Client 2021/3/16 13:07:09 D3000D00D01C6801 nba-service-mq-6c654b54df-pqgp6
- 查看会话历史,可以看到很多会话都被
6055
阻塞了,印证了上面查的6055
的事务比较大
SQL> select sample_time,session_id as sid,sql_id,event,plsql_entry_object_id as plsql,sql_plan_hash_value as plan,sql_plan_line_id as line,sql_exec_id as execid,blocking_session blk,program as program,xid,user_id ,machine from gv$active_session_history where blocking_session is not null ; SAMPLE_TIME SID SQL_ID EVENT PLSQL PLAN LINE EXECID BLK PROGRAM XID USER_ID MACHINE1 16-MAR-21 02.14.26.440 PM 1731 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065086 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-w45f22 16-MAR-21 02.14.26.440 PM 3742 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064614 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz63 16-MAR-21 02.14.26.440 PM 1773 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064803 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz64 16-MAR-21 02.14.26.440 PM 6108 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064714 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t5 16-MAR-21 02.14.26.440 PM 7261 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065133 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-5h6zf6 16-MAR-21 02.14.26.440 PM 9541 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064892 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-wcctn7 16-MAR-21 02.14.26.440 PM 10690 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064975 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t8 16-MAR-21 02.14.26.440 PM 11724 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065049 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-76hs89 16-MAR-21 02.14.25.440 PM 1773 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064803 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz610 16-MAR-21 02.14.25.440 PM 1731 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065086 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-w45f211 16-MAR-21 02.14.25.440 PM 3742 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064614 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz612 16-MAR-21 02.14.25.440 PM 6108 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064714 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t13 16-MAR-21 02.14.25.440 PM 7261 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065133 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-5h6zf14 16-MAR-21 02.14.25.440 PM 9541 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064892 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-wcctn15 16-MAR-21 02.14.25.440 PM 10690 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064975 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t16 16-MAR-21 02.14.25.440 PM 11724 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065049 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-76hs817 16-MAR-21 02.14.24.430 PM 9541 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064892 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-wcctn18 16-MAR-21 02.14.24.430 PM 11724 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065049 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-76hs819 16-MAR-21 02.14.24.430 PM 10690 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064975 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t20 16-MAR-21 02.14.24.430 PM 7261 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065133 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-5h6zf21 16-MAR-21 02.14.24.430 PM 6108 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064714 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t22 16-MAR-21 02.14.24.430 PM 3742 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064614 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz623 16-MAR-21 02.14.24.430 PM 1773 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064803 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz624 16-MAR-21 02.14.24.430 PM 1731 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065086 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-w45f225 16-MAR-21 02.14.23.430 PM 10690 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064975 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t26 16-MAR-21 02.14.23.430 PM 1731 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065086 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-w45f227 16-MAR-21 02.14.23.430 PM 11724 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065049 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-76hs828 16-MAR-21 02.14.23.430 PM 1773 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064803 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz629 16-MAR-21 02.14.23.430 PM 3742 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064614 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz630 16-MAR-21 02.14.23.430 PM 6108 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064714 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t31 16-MAR-21 02.14.23.430 PM 7261 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065133 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-5h6zf32 16-MAR-21 02.14.23.430 PM 9541 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064892 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-wcctn33 16-MAR-21 02.14.22.400 PM 10690 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064975 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t................
- 既然知道了阻塞者是
6055
,同时也不仅6055
一个会话到现在没结束,会话也不是空闲中,
接下来只能找出6055
正在跑的sql发给开发评估这个事务是不是正常
oracle中可以通过sql_trace来记录会话一段时间内都跑的什么sql
(如果sql执行时间大于一秒,其实通过活动会话历史v$active_session_history也可以定位,但是6055
的last_call_et
是0,说明执行的sql都是小于一秒的)
SQL> exec dbms_system.set_sql_trace_in_session(6055,46555,true);PL/SQL procedure successfully completed.间隔1分钟SQL> exec dbms_system.set_sql_trace_in_session(6055,46555,false);PL/SQL procedure successfully completed.
- 收集好sql_trace后,进入oracle的trace目录,按照时间排序可以看到最新的ora_开头的日志
-rw-r----- 1 oracle asmadmin 74K Mar 16 14:15 nba2_ora_149894.trm-rw-r----- 1 oracle asmadmin 2.3M Mar 16 14:15 nba2_ora_149894.trc
- 通过tkprof工具格式化一下
[oracle@nbadb02 trace]$ tkprof nba2_ora_149894.trcoutput = bk.logTKPROF: Release 11.2.0.4.0 - Development on Tue Mar 16 14:16:08 2021Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.[oracle@nbadb02 trace]$ sz bk.log
- 过滤掉bk.log中的sys用户的递归sql,可以看到
6055
执行的sql逻辑
SQL ID: dzns8cts4u1a4 Plan Hash: 383260619update tab1 set col1=:1 , col2=:2 , col3= :3 , col4=:4 , col5=:5 , col6=:6 , col7=:7 , col8=:8 , col9=:9 , col10=:10 , col11=:11 , col12=:12 , col13=:13 where ID=:14 and VERSION=:15 call count cpu elapsed disk query current rows------- ------ -------- ---------- ---------- ---------- ---------- ----------Parse 1878 0.03 0.05 0 0 0 0Execute 1878 0.28 0.46 202 9475 17465 1878Fetch 0 0.00 0.00 0 0 0 0------- ------ -------- ---------- ---------- ---------- ---------- ----------total 3756 0.31 0.51 202 9475 17465 1878....SQL ID: dxupr3kq3k911 Plan Hash: 743517995select sysinterfa0_.ID as ID400_, sysinterfa0_.BRAND_ID as BRAND2_400_, sysinterfa0_.CONTENT as CONTENT400_, sysinterfa0_.CREATE_TIME as CREATE4_400_, sysinterfa0_.ERROR_COUNT as ERROR5_400_, sysinterfa0_.OP_TIME as OP6_400_, sysinterfa0_.REMARK as REMARK400_, sysinterfa0_.SHOP_ID as SHOP8_400_, sysinterfa0_.SLIP_CODE as SLIP9_400_, sysinterfa0_.SOURCE_SYS as SOURCE10_400_, sysinterfa0_.STATUS as STATUS400_, sysinterfa0_.TARGET_SYS as TARGET12_400_, sysinterfa0_.TYPE as TYPE400_, sysinterfa0_.VERSION as VERSION400_ from tab2 sysinterfa0_ where sysinterfa0_.SLIP_CODE=:1 and sysinterfa0_.TYPE=:2 and sysinterfa0_.SOURCE_SYS=:3 and sysinterfa0_.TARGET_SYS=:4 and sysinterfa0_.BRAND_ID=:5 call count cpu elapsed disk query current rows------- ------ -------- ---------- ---------- ---------- ---------- ----------Parse 1913 0.03 0.04 0 0 0 0Execute 1913 0.02 0.03 0 0 0 0Fetch 1913 0.04 0.09 55 9854 0 1878------- ------ -------- ---------- ---------- ---------- ---------- ----------total 5739 0.10 0.17 55 9854 0 1878.... select shop.id from tab1 sku, tab2 pro, tab3 br, tab4 shop where sku.product_id = pro.id and pro.brand_id = br.brand_id and br.shop_id = shop.id and sku.code = :1 and shop.IS_SYNC_BMS = 1call count cpu elapsed disk query current rows------- ------ -------- ---------- ---------- ---------- ---------- ----------Parse 382 0.00 0.00 0 0 0 0Execute 382 0.00 0.00 0 0 0 0Fetch 382 0.11 0.14 1 15280 0 0------- ------ -------- ---------- ---------- ---------- ---------- ----------total 1146 0.12 0.16 1 15280 0 0.....
解决办法
- 把以上sql发给开发分析后,反馈这个事务逻辑有优化空间,现在在做批量库存的操作,有阻塞可以忽略,未来会优化逻辑
更多相关文章
- 企业微信接收 Amazon CloudWatch 告警
- Prometheus+Alertmanager配置邮件报警
- Zabbix监控ELK异常日志告警
- java使用 Cookie 和 JWT 身份验证
- Nginx的Upstream监控及告警
- 初试kafka监控及管理利器之kafka-eagle
- Prometheus配置企业微信告警
- 介绍一款好用的终端工具 Screen
- 第14章 0301-实战会话控制技术,(项目实战:主页登录、用户注册 + 整