hanganalyze以及ssd使用测试

2022-01-10 00:00:00 执行 操作 节点 会话 阻塞


********************************测试
如果数据库依然可以登录
Using SQL*Plus connect as SYSDBA using the following command:

sqlplus '/ as sysdba'
如果无法连接, 使用"preliminary connection" 连接 :

sqlplus -prelim '/ as sysdba'
注意这种连接存在问题
>>>>>>NOTE: From 11.2.0.2 onwards, hanganalyze will not produce output under a sqlplus "preliminary connection" since it requires a process state object and a session state object. If a hanganalyze is attempted, although the hanganalyze will appear to be successful:

SQL> oradebug hanganalyze 3
Statement processed.

>>>>>>>the tracefile will contain the following output:
>>>>>>>HANG ANALYSIS:
ERROR: Can not perform hang analysis dump without a process state object and a session state object.
( process=(nil), sess=(nil) )
>>>>>>>>As a workaround connect to an existing process id, for example PMON/SMON, before performing the dump :
sqlplus - prelim / as sysdba
oradebug setospid < use an existing process id, for example PMON/SMON>
oradebug hanganalyze 3
oradebug tracefile_name
>>>>>>>>>Hanganalyze

sqlplus '/ as sysdba'
oradebug setmypid
oradebug unlimit
oradebug hanganalyze 3
-- Wait one minute before getting the second hanganalyze
oradebug hanganalyze 3
oradebug tracefile_name
exit
>>>>>>>>>Systemstate

sqlplus '/ as sysdba'
oradebug setmypid
oradebug unlimit
oradebug dump systemstate 258
oradebug dump systemstate 258
oradebug tracefile_name
exit

>>>>>>同时产生hanganalyze和ssd (RAC环境)
For 10g (and above)
sqlplus '/ as sysdba'
oradebug setmypid
oradebug unlimit
oradebug -g all hanganalyze 3
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 258
oradebug -g all dump systemstate 258
oradebug tracefile_name
exit


通过v$session查等待链

col level1 for a10;
select lpad('-',level,'-')||'>'
level1,inst_id,sid,serial#,event,blocking_session,seconds_in_wait
from gv$session
start with event like 'enq%' connect by sid = prior blocking_session;

测试
节点1 会话执行如下操作
SQL> update lin.test set object_name='TEST' where object_id=3000;

1 row updated.
节点2 会话执行如下操作
SQL> delete from lin.test where object_id=3000;

此时节点1操作阻塞节点2操作.

我们看下阻塞链
SQL> select lpad('-',level,'-')||'>'
level1,inst_id,sid,serial#,event,blocking_session,seconds_in_wait
from gv$session
start with event like 'enq%' connect by sid = prior blocking_session

LEVEL1 INST_ID SID SERIAL# EVENT BLOCKING_SESSION SECONDS_IN_WAIT
---------- ---------- ---------- ---------- ------------------------------ ---------------- ---------------
-> 2 27 22329 enq: TX - row lock contention 268 413
--> 2 268 65076 SQL*Net message from client 459
--> 1 268 59575 GCR sleep 246129
268会话阻塞了会话27,会话27在等待TX行锁, 而会话268什么也没做,空等待,下面从hanganalyze看看这个阻塞关系.

SQL> oradebug setmypid
Statement processed.
SQL> oradebug unlimit
Statement processed.
SQL> oradebug -g all hanganalyze 3
Hang Analysis in /oracle/db/diag/rdbms/prod/prod1/trace/prod1_diag_25774.trc
SQL> oradebug -g all hanganalyze 3
Hang Analysis in /oracle/db/diag/rdbms/prod/prod1/trace/prod1_diag_25774.trc
SQL> oradebug tracefile_name
/oracle/db/diag/rdbms/prod/prod1/trace/prod1_ora_16893.trc
SQL> exit
Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

cat /oracle/db/diag/rdbms/prod/prod1/trace/prod1_diag_25774.trc
下面是有阻塞链的信息,注意被阻塞会话有SQL,而造成阻塞的会话因为SQL执行完了,所以在current sql部分没有展示,如何抓取可以
参考我之前的文章从log miner抓取.这里很清楚的说明来自实例prod1的会话27 正在执行delete,它被实例prod2的会话268阻塞
*** 2022-01-10T10:25:25.178191+08:00 (CDB$ROOT(1))
===============================================================================
HANG ANALYSIS:
instances (db_name.oracle_sid): prod.prod1, prod.prod2
oradebug_node_dump_level: 3
analysis initiated by oradebug
os thread scheduling delay history: (sampling every 1.000000 secs)
0.000000 secs at [ 10:25:25 ]
NOTE: scheduling delay has not been sampled for 0.056720 secs
0.000000 secs from [ 10:25:21 - 10:25:26 ], 5 sec avg
0.000000 secs from [ 10:24:25 - 10:25:26 ], 1 min avg
0.000007 secs from [ 10:20:25 - 10:25:26 ], 5 min avg
vktm time drift history
===============================================================================

Chains most likely to have caused the hang:
[a] Chain 1 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 1 Signature Hash: 0x38c48850
[b] Chain 2 Signature: 'REPL Capture/Apply: RAC AQ qmn coordinator'
Chain 2 Signature Hash: 0x38f1e28b
[c] Chain 3 Signature: 'REPL Capture/Apply: RAC AQ qmn coordinator'
Chain 3 Signature Hash: 0x38f1e28b

===============================================================================
Non-intersecting chains:

-------------------------------------------------------------------------------
Chain 1:
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 2 (prod.prod2)
os id: 3541
process id: 64, oracle@rac2
session id: 27
session serial #: 22329
module name: 3 (SQL*Plusrac2 (TNS V1-V3))
pdb id: 3 (PDB)
}
is waiting for 'enq: TX - row lock contention' with wait info:
{
p1: 'name|mode'=0x54580006
p2: 'usn<<16 | slot'=0x50003
p3: 'sequence'=0x31e
px1: 'pdb uid'=0xafcfbdd1
px2: 'master hint'=0x2
time in wait: 56.639076 sec
timeout after: never
wait id: 44
blocking: 0 sessions
current sql_id: 2674808191
current sql: delete from lin.test where object_id=3000
short stack: ksedsts()+426<-ksdxfstk()+58<-ksdxcb()+872<-sspuser()+200<-__sighandler()<-semtimedop()+10<-skgpwwait()+187<-ksliwat()
+2218<-kslwaitctx()+188<-kjusuc()+4689<-ksipgetctxia()+2081<-ksqcmi()+2757<-ksqgtlctx()+6620<-ksqgelctx()+838<-ktuGetTxForXid()+230<-ktcwit1()+3
67<-kdddgb()+5478<-kdddel()+508<-kaudel()+95<-delrow()+1480<-qerdlDelRow()+508<-qerdlRopRowsets()+434<-kdstf110110100001000km()+12357<-kdsttgr()
+2160<-qertbFetch()+1089<-qerdlFetch()+1376<-delexe()+1179<-opiexe()+11716<-kpoal8()+2387<-opiodr()+1202<-ttcpip(
wait history:
* time between current wait and wait #1: 0.003351 sec
1. event: 'SQL*Net message from client'
time waited: 18.832907 sec
wait id: 43 p1: 'driver id'=0x74637000
p2: '#bytes'=0x1
* time between wait #1 and #2: 0.000024 sec
2. event: 'SQL*Net message to client'
time waited: 0.000002 sec
wait id: 42 p1: 'driver id'=0x74637000
p2: '#bytes'=0x1
* time between wait #2 and #3: 0.000006 sec
3. event: 'SQL*Net break/reset to client'
time waited: 0.000273 sec
wait id: 41 p1: 'driver id'=0x74637000
p2: 'break?'=0x0
}
and is blocked by
=> Oracle session identified by:
{
instance: 2 (prod.prod2)
os id: 3257
process id: 58, oracle@rac2
session id: 268
session serial #: 65076
module name: 3 (SQL*Plusrac1 (TNS V1-V3))
pdb id: 3 (PDB)
}
which is waiting for 'SQL*Net message from client' with wait info:
{
p1: 'driver id'=0x74637000
p2: '#bytes'=0x1
time in wait: 1 min 42 sec
timeout after: never
wait id: 92
blocking: 1 session
current sql_id: 0
current sql: <none>
short stack: ksedsts()+426<-ksdxfstk()+58<-ksdxcb()+872<-sspuser()+200<-__sighandler()<-read()+14<-snttread()+16<-nttfprd()+354<-ns
basic_brc()+399<-nioqrc()+438<-opikndf2()+999<-opitsk()+905<-opiino()+936<-opiodr()+1202<-opidrv()+1094<-sou2o()+165<-opimai_real()+422<-ssthrdm
ain()+417<-main()+256<-__libc_start_main()+245
wait history:
* time between current wait and wait #1: 0.000007 sec
1. event: 'SQL*Net message to client'
time waited: 0.000002 sec
wait id: 91 p1: 'driver id'=0x74637000
p2: '#bytes'=0x1
* time between wait #1 and #2: 0.000009 sec
2. event: 'PGA memory operation'
time waited: 0.000062 sec
wait id: 90
* time between wait #2 and #3: 0.010887 sec
3. event: 'db file sequential read'
time waited: 0.038361 sec
wait id: 89 p1: 'file#'=0xe
p2: 'block#'=0x82
p3: 'blocks'=0x1
}

Chain 1 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 1 Signature Hash: 0x38c48850
-------------------------------------------------------------------------------


相关文章