Wednesday, January 18, 2012

AWR snapshot creation hangs

Oracle 10g R1

I encountered a problem with the AWR snapshot in which the creation would hang. This phenomenon is due to an Oracle bug #7284976. The Metalink bug report indicates that it is applicable to DB version Enterprise Edition and doesn't list any other affected versions, however I seem to be encountering this on DB version Standard Edition running on IBM AIX platform.

I first noticed the problem when I ran the AWR report and noticed that there were no new snapshots for the day. The snapshots were supposed to be generated every hour, but were not. I thought it was odd, so I checked the DB parameters which all seemed to be in order:
dayneo@RMSP> show parameters statistics

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------------------------
statistics_level                     string      TYPICAL
timed_os_statistics                  integer     0
timed_statistics                     boolean     TRUE
I then checked the AWR settings, and they were set to snapshot every hour. Lastly, I tried creating a snapshot using SQL/PLUS at which point the snapshot creation hung, never to return.

Checking the session activity, I found that my SQL/PLUS session was waiting on "enq: WF - contention". I decided to check the locks just in case:
dayneo@RMSP> COLUMN blocking_sid     FORMAT 999    HEADING 'Blocking SID'    JUSTIFY right
dayneo@RMSP> COLUMN blocking_user     FORMAT a20    HEADING 'Blocking User'   JUSTIFY left
dayneo@RMSP> COLUMN blocked_sid      FORMAT 999    HEADING 'Blocked SID'     JUSTIFY right
dayneo@RMSP> COLUMN blocked_user     FORMAT a20    HEADING 'Blocked User'    JUSTIFY left
dayneo@RMSP> select s1.sid blocking_sid,
  2       nvl2(s1.username, s1.username || '@' || s1.machine, s1.program) blocking_user,
  3       s2.sid blocked_sid,
  4       nvl2(s2.username, s2.username || '@' || s2.machine, s2.program) blocked_user
  5    from v$lock l1, v$session s1,
  6       v$lock l2, v$session s2
  7   where s1.sid=l1.sid
  8   and s2.sid=l2.sid
  9   and l1.id1 = l2.id1
 10   and l2.id2 = l2.id2
 11   and l1.block = 1
 12   and l2.request > 0
 13  /

Blocking SID Blocking User        Blocked SID Blocked User
------------ -------------------- ----------- --------------------
         119 oracle@wcuss6 (m000)          21 DPOSYS@WORKGROUP\FBO

         133 oracle@wcuss6 (m001)         119 oracle@wcuss6 (m000)
         119 oracle@wcuss6 (m000)         133 oracle@wcuss6 (m001)
The 21 SID is my SQL/PLUS session. Note that SID 119 is blocking it. Also note that SIDs 133 and 119 are blocking each other. This is where the real problem lies and is why the DB is not generating snapshots.

The two SIDs that are blocking each other are both MMON slave processes. One of them is an "Auto-flush slave action" and the other is an "Auto-purge slave action". At this time, I do not have a solution for this. Some forums suggest bouncing the DB. I have not yet tried this as it does not seem to critical at this point.

It would be interesting to know what sort of impact this problem has on DB performance. Will Oracle continue to "auto-tune" if the stats are not being recorded?

1 comment:

dayneo said...

The reason the bug is listed for Enterprise Edition only, is because AWR can only be licensed for Enerprise Edition.

If you are using AWR with Standard Edition, you are doing so in violation of the license agreement.

See AWR and Oracle Licensing for my rant on this.