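Collecting Diagnostic Information with oradebug

1. Common Collection Methods

Open two terminal windows: in one, connect with sqlplus / as sysdba; in the other, connect as the ordinary database user HR.

From the SYSDBA session, look up the operating-system process ID (spid) of the HR user's server process. The subquery assumes HR has exactly one session; with more than one it fails with ORA-01427.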

SQL> select spid
  2  from v$process
  3  where addr=(select paddr from v$session where username='HR');

SPID
------------------------
2104

Attach oradebug to that OS process:

SQL> oradebug setospid 2104;

Oracle pid: 30, Unix process pid: 2104, image: oracle@Oracle11g (TNS V1-V3)

Remove the trace file size limit:

oradebug unlimit

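Enable event 10046

To trace the process attached with setospid, set the event through oradebug (level 4 adds bind variable values; level 12 adds both bind values and wait events):

SQL> oradebug event 10046 trace name context forever,level 4

To trace your own session instead, use ALTER SESSION. This statement affects only the session that issues it, not the attached process:

SQL> alter session set events '10046 trace name context forever,level 12';

Session altered.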

Turn off event 10046:

SQL> oradebug event 10046 trace name context off
Statement processed.
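The levels 4 and 12 used above are combinations of bit flags. The following is a minimal Python sketch of that reading, covering only the widely documented bits (4 for binds, 8 for waits); the function name is made up for illustration, and later releases define further bits that are not modelled here.

```python
# Widely documented 10046 level bits (an illustrative subset, not a full list).
BINDS = 4   # include bind variable values
WAITS = 8   # include wait events


def describe_10046_level(level):
    """Return the kinds of information a 10046 level adds to the trace."""
    parts = ["basic SQL trace"]  # every level >= 1 includes the basic trace
    if level & BINDS:
        parts.append("bind variable values")
    if level & WAITS:
        parts.append("wait events")
    return parts


for lvl in (1, 4, 8, 12):
    print(lvl, describe_10046_level(lvl))
```

Level 12 is simply 4 + 8, which is why it is the usual choice when both bind values and wait events are needed.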

Show the name and path of the trace file:

SQL> oradebug tracefile_name

/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2104.trc
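When scripting the collection, it can help to predict where the file will land. The sketch below rebuilds the path from its parts, assuming the layout visible in the example output above (diag/rdbms/<database>/<instance>/trace/<instance>_ora_<spid>.trc). The function and parameter names are hypothetical, and settings such as a trace file identifier can change the real name, so oradebug tracefile_name remains the authoritative source.

```python
import posixpath


def expected_trace_path(diag_dest, db_name, instance_name, spid):
    """Compose the trace file path, following the layout seen in the example.

    Assumption: the directory layout matches the example output in the text;
    this is not read from the database.
    """
    trace_dir = posixpath.join(
        diag_dest, "diag", "rdbms", db_name, instance_name, "trace"
    )
    file_name = "{}_ora_{}.trc".format(instance_name, spid)
    return posixpath.join(trace_dir, file_name)


print(expected_trace_path("/u01/app/oracle", "orcl", "orcl", 2104))
```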

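2. Collecting Diagnostics for a Database Hang

Overview of Hanganalyze and Systemstate dumps

Hanganalyze and systemstate dumps capture information about the database processes at a specific point in time. Hanganalyze reports the processes involved in a hang chain, whereas a systemstate dump covers every process in the instance.
When investigating a potential hang, you need to determine whether a process is truly stuck or just progressing slowly. Taking two dumps a short interval apart makes this visible.
If a process is stuck, the traces provide the information needed to start a deeper diagnosis and to work towards a solution.
Hanganalyze is a summary: it confirms whether the database is really hung or merely slow, and it provides a consistent snapshot.
A systemstate dump shows what each process in the database is doing.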
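The reasoning behind taking two samples can be sketched in Python. This is only an illustration under stated assumptions: the input format (session id mapped to a wait event and a wait sequence number) is hypothetical and would have to be filled in by hand from the two dumps or from a view such as v$session, and the function name is invented.

```python
def classify_progress(first, second):
    """Compare two samples of {session id: (wait event, wait sequence number)}.

    A session that shows the same event and the same sequence number in both
    samples has not started a new wait in between, so it is treated as stuck.
    """
    result = {}
    for sid, (event1, seq1) in first.items():
        if sid not in second:
            result[sid] = "gone"    # session ended between the two samples
            continue
        event2, seq2 = second[sid]
        if event1 == event2 and seq1 == seq2:
            result[sid] = "stuck"   # no progress between the samples
        else:
            result[sid] = "moving"  # slow perhaps, but still progressing
    return result


# Hypothetical sample values, for illustration only.
sample_1 = {101: ("enq: TX - row lock contention", 45),
            102: ("db file sequential read", 900)}
sample_2 = {101: ("enq: TX - row lock contention", 45),
            102: ("db file sequential read", 1420)}
print(classify_progress(sample_1, sample_2))
```

A single sample cannot separate the two cases, since both sessions look like they are waiting; only the second sample shows that session 102 keeps starting new waits.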

Collecting Hanganalyze and Systemstate dumps

Log in to the database server and connect with:

sqlplus / as sysdba

If that connection does not work (for example, because the instance is hung), use a SQL*Plus preliminary connection instead:

[oracle@Oracle11g ~]$ sqlplus -prelim / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Mon Feb 11 18:47:19 2019

SQL>

Collection commands for a non-RAC database

Hanganalyze

sqlplus / as sysdba
oradebug setmypid
oradebug unlimit
oradebug hanganalyze 3
-- wait about one minute before taking the second hanganalyze
oradebug hanganalyze 3
oradebug tracefile_name
exit

Systemstate

sqlplus / as sysdba
oradebug setmypid
oradebug unlimit
oradebug dump systemstate 266
oradebug tracefile_name
exit

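Three new trace files are generated in the directory given by user_dump_dest: one contains both hanganalyze dumps, and the other two each contain one systemstate dump. Three files result when the Systemstate session above is run twice, a short interval apart, because each SQL*Plus session writes to its own trace file.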
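To avoid timing the pause by hand, the Hanganalyze session above can be generated as a script. The Python sketch below only builds the text; the function name is hypothetical, and the use of the SQL*Plus HOST command with sleep for the pause is an assumption that presumes a Unix-like client.

```python
def hanganalyze_script(wait_seconds=60, level=3):
    """Build the non-RAC hanganalyze command sequence as one text block.

    Assumption: "host sleep N" is used for the pause between the two dumps;
    the original procedure simply says to wait about a minute.
    """
    lines = [
        "oradebug setmypid",
        "oradebug unlimit",
        "oradebug hanganalyze {}".format(level),
        "host sleep {}".format(wait_seconds),
        "oradebug hanganalyze {}".format(level),
        "oradebug tracefile_name",
        "exit",
    ]
    return "\n".join(lines) + "\n"


script = hanganalyze_script()
print(script)
```

One way to use it, untested here, is to feed the text to sqlplus -S / as sysdba on standard input. Both hanganalyze dumps then come from the same session and end up in the same trace file.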

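Collection commands for a RAC database

Two bugs affect RAC when the relevant patches have not been applied; they make systemstate levels 266 and 267 very costly. Unless both patches are in place, these levels are not recommended.

For details of the patches, see: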

Document 11800959.8 Bug 11800959 - A SYSTEMSTATE dump with level >= 10 in RAC dumps huge BUSY GLOBAL CACHE ELEMENTS - can hang/crash instances

Document 11827088.8 Bug 11827088 - Latch 'gc element' contention, LMHB terminates the instance

Note: both bugs are fixed in 11.2.0.3.
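The level numbers are easier to reason about when split into two parts. As these levels are commonly described, values above 256 are a base level plus 256, where the extra 256 requests short stacks. The Python sketch below works under that assumption, and its function name is made up for illustration.

```python
SHORT_STACKS = 256  # assumed flag value that adds short stacks to the dump


def split_systemstate_level(level):
    """Split a systemstate level into (base level, short stacks requested)."""
    with_short_stacks = bool(level & SHORT_STACKS)
    base_level = level & ~SHORT_STACKS
    return base_level, with_short_stacks


for lvl in (258, 266, 267):
    print(lvl, split_systemstate_level(lvl))
```

Read this way, 266 and 267 have base levels 10 and 11, which fall under the "level >= 10" wording in the title of bug 11800959, while 258 has base level 2 and does not.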

When bug 11800959 and bug 11827088 are fixed, collect the RAC hanganalyze and systemstate dumps as follows:

For 11g:

sqlplus / as sysdba
oradebug setorapname reco
oradebug unlimit
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 266
-- wait two minutes, then run the dumps again
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 266
exit

When bug 11800959 and bug 11827088 are not fixed, collect the RAC hanganalyze and systemstate dumps with level 258 instead:

sqlplus / as sysdba
oradebug setorapname reco
oradebug unlimit
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 258
-- wait two minutes, then run the dumps again
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 258
exit

For 10g, use oradebug setmypid instead of oradebug setorapname reco:

sqlplus / as sysdba
oradebug setmypid
oradebug unlimit
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 258
-- wait two minutes, then run the dumps again
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 258
exit

The dumps are written to the trace file of the DIAG background process, in the background_dump_dest directory on each node.
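The three RAC variants above differ in only two places: the command used to attach, and the systemstate level. The Python sketch below captures that choice. The function and parameter names are hypothetical, and it only builds the list of commands without running anything.

```python
def rac_collection_commands(is_10g, bugs_fixed):
    """Return the RAC command sequence for the given release and patch state.

    The text's 10g example uses level 258; calling this with is_10g=True and
    bugs_fixed=True goes beyond what the text shows.
    """
    attach = "oradebug setmypid" if is_10g else "oradebug setorapname reco"
    level = 266 if bugs_fixed else 258
    dumps = [
        "oradebug -g all hanganalyze 3",
        "oradebug -g all dump systemstate {}".format(level),
    ]
    pause = "-- wait two minutes, then run the dumps again"
    return [attach, "oradebug unlimit"] + dumps + [pause] + dumps + ["exit"]


for line in rac_collection_commands(is_10g=False, bugs_fixed=True):
    print(line)
```

Keeping the level choice in one place makes it harder to run level 266 by mistake on a cluster that still has the two bugs.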
