oracle故障脚本收集

1、.万能重启方法
如应急情况,需要重启数据库:
tail -100f <对应路径>alert_fgedu.log
alter system switch logfile;
alter system checkpoint;
shutdown immediate;
如果不能正常关机,可以使用shutdown abort强制关机;
startup

2.操作系统性能(通常故障出现时最先检查的内容)
top、topas、vmstat、iostat、free、nmon

3.批量杀进程(数据库挂起时应急恢复)
3.1.kill所有LOCAL=NO进程
ps -ef|grep LOCAL=NO|grep $ORACLE_SID|grep -v grep|awk '{print $2}' |xargs kill -9
3.2.按用户批量杀进程
select 'alter system kill session ''' || s.sid || ',' || s.serial# ||

   '''; -- kill -9  ' || p.spid

from v$session s, v$process p
where s.PADDR = p.addr and s.username='&username'

4.数据库杀会话(应急方法)
4.1.杀某个SID会话
SELECT /+ rule / sid, s.serial#, 'kill -9 '||spid, event, blocking_session b_sess
FROM v$session s, v$process p WHERE sid='&sid' AND s.paddr = p.addr order by 1;
4.2.根据SQL_ID杀会话
SELECT /+ rule / sid, s.serial#, 'kill -9 '||spid, event, blocking_session b_sess
FROM v$session s, v$process p WHERE sql_id='&sql_id' AND s.paddr = p.addr order by 1;
4.3.根据等待事件杀会话
SELECT /+ rule / sid, s.serial#, 'kill -9 '||spid, event, blocking_session b_sess
FROM v$session s, v$process p WHERE event='&event' AND s.paddr = p.addr order by 1;
4.4.根据用户杀会话
SELECT /+ rule / sid, s.serial#, 'kill -9 '||spid, event, blocking_session b_sess
FROM v$session s, v$process p WHERE username='&username' AND s.paddr = p.addr order by 1;

**5.性能报告收集与自动诊断报告(性能分析必备)
5.1.statspack (提示:适合于9i以下版本)**
spcreate.sql, execute statspack.snap
spreport.sql spdrop.sql
5.2.awr性能监控工具的使用方法(提示:10g/11g/12c/18c/19c使用)
性能报告产生方法(支持txt和html格式):
@$ORACLE_HOME/rdbms/admin/awrrpt.sql
或者
--RAC可以指定实例id
@$ORACLE_HOME/rdbms/admin/awrrpti.sql
5.3. addm自动故障诊断报告(提示:10g/11g/12c/18c/19c使用)
@$ORACLE_HOME/rdbms/admin/addmrpt.sql
或者
--RAC可以指定实例id
@$ORACLE_HOME/rdbms/admin/addmrpti.sql

6.定期检查表空间使用情况(表空间100%导致业务异常)
--from:www.fgedu.net.cn/oracle.html
col f.tablespace_name format a15
col d.tot_grootte_mb format a10
col ts-per format a8
select upper(f.tablespace_name) "TS-name",

   d.tot_grootte_mb "TS-bytes(m)",
   d.tot_grootte_mb - f.total_bytes "TS-used (m)",
   f.total_bytes "TS-free(m)",
   to_char(round((d.tot_grootte_mb - f.total_bytes) / d.tot_grootte_mb * 100,
                 2),
           '990.99') "TS-per"
     from (select tablespace_name,
           round(sum(bytes) / (1024 * 1024), 2) total_bytes,
           round(max(bytes) / (1024 * 1024), 2) max_bytes
      from sys.dba_free_space
     group by tablespace_name) f,
   (select dd.tablespace_name,
           round(sum(dd.bytes) / (1024 * 1024), 2) tot_grootte_mb
      from sys.dba_data_files dd
     group by dd.tablespace_name) d

where d.tablespace_name = f.tablespace_name
order by 5 desc;

7.捕获占用CPU利用率过高的SQL语句
set lin 1000
set pagesize 1000
col USERNAME format a16
col MACHINE format a16
col SQL_TEXT format a200
SELECT a.username,a.machine,a.program,a.sid,a.serial#,a.status,c.piece,c.sql_text FROM v$session a,v$process b,v$sqltext c WHERE b.spid='&spid' AND b.addr=a.paddr AND a.sql_address=c.address(+) ORDER BY c.piece;

8.查看等待事件(在数据库中首先要检查的操作)
col event for a45
SELECT inst_id,EVENT, SUM(DECODE(WAIT_TIME, 0, 0, 1)) "Prev", SUM(DECODE(WAIT_TIME, 0, 1, 0)) "Curr", COUNT(*) "Tot" , sum(SECONDS_IN_WAIT) SECONDS_IN_WAIT
FROM GV$SESSION_WAIT
WHERE event NOT
IN ('smon timer','pmon timer','rdbms ipc message','SQL*Net message from client','gcs remote message')

AND event NOT LIKE '%idle%'
AND event NOT LIKE '%Idle%'
AND event NOT LIKE '%Streams AQ%'

GROUP BY inst_id,EVENT
ORDER BY 1,5 desc
提示:数据库中有一些常见异常等待事件,要重点分析,如:row cache lock、buffer busy waits、library cache lock、read by other session、latch:shared pool、gc buffer busy、cursor: pin S on X、direct path read、log file sync、enq: TX - index contention、latch free、enq: TX - row lock contention等等。

9.根据等待事件查会话
得到异常等待事件之后,我们就根据等待事件去查会话详情,也就是查看哪些会话执行哪些SQL在等待,另外还查出来用户名和机器名称,以及是否被阻塞。
SELECT /+rule / sid, s.serial#, spid, event, sql_id, seconds_in_wait ws, row_wait_obj# obj,
s.username, s.machine, BLOCKING_INSTANCE||'.'||blocking_session b_sess
FROM v$session s, v$process p
WHERE event='&event_name' AND s.paddr = p.addr order by 6;
10.查询某个会话详情
得到会话列表之后,可以根据如下SQL查询某个会话的详细信息,如上次个执行的SQL_ID,登录时间等。
SELECT s.sid, s.serial#, spid, event, sql_id, PREV_SQL_ID, seconds_in_wait ws, row_wait_obj# obj,
s.username, s.machine, module,blocking_session b_sess,logon_time
FROM v$session s, v$process p
WHERE sid = '&sid' AND s.paddr = p.addr
11.查询对象信息
从前面两个SQL都可以看到会话等待的对象ID,可以通过如下SQL查询对象的详细信息。
col OBJECT_NAME for a30
select owner,object_name,subobject_name,object_type
from dba_objects
where object_id=&oid
12.根据SQL_ID、HASH_VALUE查询SQL语句
select sql_id,SQL_fullTEXT
from v$sqlarea
where (sql_id='&sqlid' or hash_value=to_number('&hashvale') )
and rownum<2
13..查询会话阻塞情况,某个会话阻塞了多少个会话
select count(*),blocking_session
from v$session
where blocking_session is not null
group by blocking_session;
**14.查询数据库的锁
通过如下SQL查询某个会话的锁,有哪些TM、TX锁,以及会话和锁关联查询的SQL。**
set linesize 180
col username for a15
col owner for a15
col OBJECT_NAME for a30
col SPID for a10
14.1.查询某个会话的锁
select /+rule/SESSION_ID,OBJECT_ID,ORACLE_USERNAME,OS_USER_NAME,PROCESS,LOCKED_MODE
from gv$locked_object where session_id=&sid;
14.2.查询TM、TX锁
select /+rule/* from v$lock
where ctime >100 and type in ('TX','TM') order by 3,9;
14.3.查询数据库中的锁
select /+rule/s.sid,p.spid,l.type,round(max(l.ctime)/60,0) lock_min,s.sql_id,s.USERNAME,b.owner,b.object_type,b.object_name
from v$session s, v$process p,v$lock l,v$locked_object o,dba_objects b
where o.SESSION_ID=s.sid and s.sid=l.sid and o.OBJECT_ID=b.OBJECT_ID
and s.paddr = p.addr and l.ctime >100 and l.type in ('TX','TM','FB')
group by s.sid,p.spid,l.type,s.sql_id,s.USERNAME,b.owner,b.object_type,b.object_name
order by 9,1,3
15.故障信息收集
提示:数据库hang住了之后,需要详细分析原因,或者提供给二线支持的信息,可使用下面脚本,收集systemstate dump和hanganalyze信息,如果有sqlplus无法登陆的情况,可以加-prelim参数。
--systemstate dump
sqlplus -prelim / as sysdba
oradebug setmypid
oradebug unlimit;
oradebug dump systemstate 266;
--wait for 1 min
oradebug dump systemstate 266;
--wait for 1 min
oradebug dump systemstate 266;
oradebug tracefile_name;
--hanganalyze
oradebug setmypid
oradebug unlimit;
oradebug dump hanganalyze 3
--wait for 1 min
oradebug dump hanganalyze 3
--wait for 1 min
oradebug dump hanganalyze 3
oradebug tracefile_name

上一篇:Android-来填写一个验证码吧!(一)


下一篇:postgresql 重要参数汇总