Hung Operations
If database operations are 'hanging' it is difficult to tell
what is happening or what to do about it.
This article helps DBA's to identify whether the process is hanging (where
nothing is moving) or spinning (when a process gets into a (possibly
infinite) loop).
Diagnostics
-----------
If an operation is taking significantly more time than expected or is
compromising the performance of other operations, then the best place
to check is v$session_wait. This view shows information about what each
session in the system is waiting for at the current moment in time.
The following SQL*Plus script gathers and formats the required information:
column sid format 990
column seq# format 99990
column wait_time heading 'WTime' format 99990
column event format a30
column p1 format 9999999990
column p2 format 9999999990
column p3 format 9990
select sid,event,seq#,p1,p2,p3,wait_time
from V$session_wait
order by sid
/
This select should be repeated at least 3 times and the results compared.
Column meanings:
sid System IDentifier of the session
seq# Sequence number. This increments each time a new event is waited
for by a particular session. It can be used to tell if a session is
moving along or not.
event Operation that the session is waiting for or last waited for.
p1 p2 p3 These columns have different meanings for different event values.
wait_time Zero values indicate that the session is waiting for the event.
Non-zero values indicate that this was the last event that the
session waited for and that the session is currently using cpu.
Sample output:
SID EVENT SEQ# P1 P2 P3 WTime
---- ------------------------------ ------ ----------- ----------- ----- ------
1 pmon timer 335 300 0 0 0
2 rdbms ipc message 779 300 0 0 0
6 smon timer 74 300 0 0 0
9 Null event 347 0 300 0 0
16 SQL*Net message from client 1064 1650815315 1 0 -1
If the above script reveals about an ENQUEUE wait then you will need to check
for any locks related to your hanging session:
column sid format 990
column type format a2
column id1 format 9999999990
column id2 format 9999999990
column Lmode format 990
column request format 990
select * from v$lock
/
Spinning
--------
In the case of a Spin situation the session events would normally be static
and the session would not be waiting for an event - rather it would be on CPU.
(note in rare circumstances, the event may or may not be static
depending on where in the code the spinning is taking place). It would be
expected that the session would be utilizing resources heavily such as CPU
and memory.
For a Spin situation it is important to determine which area of the code the
session is spinning in. Some indication of this may be derived from the event
however it is usually necessary to produce an errorstack of the process a
few times for analysis by support:
connect sys/sys as sysdba
oradebug setospid <SPID>
oradebug unlimit
oradebug dump errorstack 3
oradebug dump errorstack 3
oradebug dump errorstack 3
where SPID is the operating system process identifier, you can get it from
v$process. Support recommend collecting at least 3 errorstack dumps for each
spin occurrence.
If there is some reason you cannot login to the DB to get a errorstack from
oradebug you can use OS utilities such as the PSTACK command on SUN SOLARIS
/usr/proc/bin/pstack <pid>
See Note:70609.1 on this
Hanging
-------
In a normal situation it would be expected that v$session_wait column
values would change with the different operations being performed by each
session.
In a hang situation it would be expected that all system events gathered
for a particular session or group of sessions would stay static and no
additional resources are being consumed by the process like cpu and memory
is not incrementing. Given that the session(s) is not requesting to lock
any resource according to the above v$lock query then this situation is called
hanging.
This happens when the server process is waiting on some event to occur to
allow the session to continue its work but If for some reason this event does
not happen, this may then cause a hang.
If for some reason this event does not happen, this may then cause a hang.
The next step is to examine what event is being waited for and then determine
the best course of action from this. For example if the session was waiting
for a write to disk to complete then investigate why the write is taking so
long.
If you are sure that you are facing a Hang situation and you could not fix
the root cause of it then you will need to contact Oracle support services to
help you analyze and solve the Hang problem.
Note that significantly more detailed information can be found by dumping systemstate
information for the instance:
ALTER SESSION SET EVENTS 'IMMEDIATE TRACE NAME SYSTEMSTATE LEVEL XX';
where XX is 266 if the oracle version is 9.2.0.6 or greater or 10.1.0.4 or greater
in other versions use 10. note:3797523.8
a systemstate tracefile will be created in your USER_DUMP_DEST directory.
Support Recommend collecting 3 systemstate dumps for each hang occurrence.
Get the Process ID of the problem session from the V$PROCESS
SELECT pid FROM v$process
WHERE addr =
(SELECT paddr FROM v$session
WHERE sid = sid_of_problem_session);
The systemstate dump includes information for each process.
Find details for an individual process by searching for 'PROCESS <id>'
Find details of currently waiting wait event by doing a search on 'waiting for'.
For other diagnostics see:
Note:402983.1 Database Performance FAQ
No comments:
Post a Comment