Oracle database internals by Riyaj

Discussions about Oracle performance tuning, RAC, Oracle internal & E-business suite.

What is ‘rdbms ipc message’ wait event?

Posted by Riyaj Shamsudeen on February 10, 2012

Introduction

There was a question about the wait event ‘rdbms ipc message’ in Oracle-l list. Short answer is that ‘rdbms ipc message’ event means that a process is waiting for an IPC message to arrive. Usually, this wait event can be ignored, but there are few rare scenarios this wait event can’t be completely ignored.

What is ‘rdbms ipc message’ wait means?

It is typical of Oracle Database background processes to wait for more work. For example, LGWR will wait for more work until another (foreground or background ) process request LGWR to do a log flush. In UNIX platforms, wait mechanism is implemented as a sleep on a specific semaphore associated with that process. This wait time is accounted towards database wait events ‘rdbms ipc message’.

Also note that, semaphore based waits are used in other wait scenarios too, not just ‘rdbms ipc message’ waits.

Time to Trace

We will use UNIX utility TRUSS to trace system calls from LGWR; We will enable sql trace on LGWR process. Using the output of these two methods, we will explore this wait event.

-- First we will identify UNIX PID of LGWR
$ ps -ef|grep ora_lgwr_${ORACLE_SID}
  oracle  1508     1   0 13:27:33 ?           0:00 ora_lgwr_solrac1

-- Next we will map that to Oracle session sid. 
select sid,b.serial#,b.program,b.username from v$process a,v$session b
where a.addr=b.paddr
and a.spid=&proc_id
/
Enter value for proc_id: 1508
  SID    SERIAL# PROGRAM                                          USERNAME
----- ---------- ------------------------------------------------ ------------------------------
   19          1 oracle@solrac1 (LGWR)

-- Let's enable sqltrace on LGWR session

SYS@solrac1:1>EXEC DBMS_MONITOR.session_trace_enable(session_id =>19, serial_num=>1, waits=>TRUE, binds=>FALSE);

PL/SQL procedure successfully completed.

Note, Enabling sql trace on LGWR is not a grand idea in a production environment, actually, even in environments where the performance is vital, so, try this test case only at home. Next section below prints few lines from the trace file of that LGWR process.

*** 2012-02-10 13:42:31.333
WAIT #0: nam='rdbms ipc message' ela= 3000112 timeout=300 p2=0 p3=0 obj#=-1 tim=1269283430

*** 2012-02-10 13:42:34.333
WAIT #0: nam='rdbms ipc message' ela= 3000349 timeout=300 p2=0 p3=0 obj#=-1 tim=1272283936 -- line #1

*** 2012-02-10 13:42:37.334
WAIT #0: nam='rdbms ipc message' ela= 3000397 timeout=300 p2=0 p3=0 obj#=-1 tim=1275284461 -- line #2

*** 2012-02-10 13:42:40.334
WAIT #0: nam='rdbms ipc message' ela= 3000185 timeout=300 p2=0 p3=0 obj#=-1 tim=1278284833 -- line #3

*** 2012-02-10 13:42:41.157
WAIT #0: nam='rdbms ipc message' ela= 822820 timeout=300 p2=0 p3=0 obj#=-1 tim=1279107848 -- line #4
WAIT #0: nam='log file parallel write' ela= 604 files=2 blocks=2 requests=2 obj#=-1 tim=1279108889 -- line #5

I tagged the lines with comment to improve readability and refer to those line numbers while reading this paragraph. There are few important points to understand:

  1. As you may know, these trace lines are printed after the completion of waits. In Line #1, LGWR process completed a wait of 3 seconds and woken up at 2012-02-10 13:42:34.333 time. Essentially, LGWR was sleeping for 3 seconds between 13:42:31.333 and 13:42:34.333.
  2. At Line #2 and Line #3, LGWR was sleeping for full three seconds.
  3. But, in Line #4, elapsed time of ‘rdbms ipc message’ is 0.8 seconds, meaning, LGWR slept for 0.8 seconds only (not full three seconds). Line #5 shows that LGWR wrote a log block to the log file.

In a nutshell, LGWR is sleeping on ‘rdbms ipc message’ for full 3 seconds if there is no work to be done. Another process can wakeup LGWR process and trigger a log write as indicated in line #4 and line #5. I will disable SQL trace on LGWR at this point and then truss LGWR process.

LGWR truss

Let’s truss the LGWR process and match the pattern with the trace file. Truss command I am printing here will work in Solaris/hp platform. In Linux, you would use strace -ttT command to trace the UNIX process.

$ truss -d -E -p 1508
Base time stamp:  1328904448.3033  [ Fri Feb 10 14:07:28 CST 2012 ]
....
/1:     12.2573  0.0000 times(0xFFFFFD7FFFDFD790)                       = 276585
/1:     semtimedop(7, 0xFFFFFD7FFFDFD178, 1, 0xFFFFFD7FFFDFD190) (sleeping...)
/1:     15.2576  0.0000 semtimedop(7, 0xFFFFFD7FFFDFD178, 1, 0xFFFFFD7FFFDFD190) Err#11 EAGAIN
/1:     15.2578  0.0000 times(0xFFFFFD7FFFDFD790)                       = 276885
/1:     15.2578  0.0000 times(0xFFFFFD7FFFDFD790)                       = 276885
/1:     semtimedop(7, 0xFFFFFD7FFFDFD178, 1, 0xFFFFFD7FFFDFD190) (sleeping...)
/1:     18.2582  0.0000 semtimedop(7, 0xFFFFFD7FFFDFD178, 1, 0xFFFFFD7FFFDFD190) Err#11 EAGAIN
/1:     18.2586  0.0000 times(0xFFFFFD7FFFDFD790)                       = 277185
/1:     18.2587  0.0000 times(0xFFFFFD7FFFDFD790)                       = 277185
/1:     semtimedop(7, 0xFFFFFD7FFFDFD178, 1, 0xFFFFFD7FFFDFD190) (sleeping...)
/1:     21.2589  0.0000 semtimedop(7, 0xFFFFFD7FFFDFD178, 1, 0xFFFFFD7FFFDFD190) Err#11 EAGAIN
/1:     21.2590  0.0000 times(0xFFFFFD7FFFDFD790)                       = 277485
/1:     21.2591  0.0000 times(0xFFFFFD7FFFDFD790)                       = 277485
/1:     semtimedop(7, 0xFFFFFD7FFFDFD178, 1, 0xFFFFFD7FFFDFD190) (sleeping...)
/1:     23.8895  0.0000 semtimedop(7, 0xFFFFFD7FFFDFD178, 1, 0xFFFFFD7FFFDFD190) = 0
/1:     23.8898  0.0001 kaio(AIOWRITE, 261, 0x6023D000, 6144, 0xFC73DB480CADD400) = 0
/1:     23.8903  0.0000 kaio(AIOWRITE, 263, 0x6023D000, 6144, 0xFC73DDD00C7DD400) = 0
/1:     23.8908  0.0001 kaio(AIOWAIT, 0xFFFFFFFFFFFFFFFF)               = 0
/1:     23.8917  0.0001 kaio(AIOWAIT, 0xFFFFFD7FFFDFC250)               = -2748838584880
/1:     23.8919  0.0000 kaio(AIOWAIT, 0xFFFFFD7FFFDFC250)               = -2748838585528

From the output of truss command above, we can understand LGWR behavior better. Second column prints the time offset from the base timestamp.

  1. At time offset 12.2573, LGWR went in to sleep calling the semtimedop call. I will explain about semtimedop system call shortly.
  2. At time offset 15.2576, LGWR was woken up from the sleep (meaning, that semtimedop call returned after three seconds of timer expiry).
  3. LGWR went to sleep again for three seconds immediately after( possibly because no work to be done) and woken up at 18.2582 time offset. Similar 3 seconds timer expiry occured at 21.2589 time offset also.
  4. It gets interesting at 23.8895 time offset. LGWR was (rudely!) woken up at 23.8895 time offset without allowing the LGWR to sleep for full three seconds ( 2.6 seconds). You can see few kaio calls (kaio kernalized async i/o calls) later and LGWR was submitting asynchronous I/O calls to the redo log file.

semtimedop

UNIX system call semtimedop is used by Oracle processes to sleep with a timer. In this example, LGWR called semtimedop system call with a 3 second timeout. Calling semtimedop system call with 3 seconds timer will suspend the process. Kernel will schedule the UNIX process in CPU (LGWR in this example) if one of these two conditions occur (a) Timer expired as requested by the process or (b) another process modifies this specific semaphore.

In our example, if there is no work, then, the LGWR process will sleep for full 3 seconds until the expiry of semaphore timer; If there is work to be done prior to that 3 seconds timer expiry, then another process will modify the semaphore associated with LGWR and so, Kernel will wakeup LGWR and schedule the LGWR in the CPU. LGWR will perform a log flush sync subsequently.

Advantage with this approach of sleep using semaphore with a timer expiry is that process will not consume any CPU while suspended.

This semaphore based sleep is analogous to a parent sleeping on a Saturday morning. Parent will wakeup if the timer goes off at 7:30 AM or if the kids wakeup prior to that alarm :)

Evidently, you can see that when the LGWR process is sleeping on semaphore, then the time is accounted towards the wait event ‘rdbms ipc message’. You should also remember that semaphores are created as semaphore sets and semctl system call allows the code to modify a specific semaphore in a semaphore set.

semtimedop is used in numerous places

Wait event ‘rdbms ipc message’ is not the only wait event associated with semtimedop system call. There are numerous places semtimedop call is used. Let’s examine a row level locking contention with two processes.

-- In session #1, we update a row.
SYS@solrac1:1>@mypid

SPID
------------------------
3100
SYS@solrac1:1>update rs.t_one set v1=v1 where n1=100;

1 row updated.
----------------
-- In session #2, we will find our PID and then, and then try updating the same row leading to locking contention.
----------------
SYS@solrac1:1>@mypid

SPID
------------------------
3048

SYS@solrac1:1>update rs.t_one set v1=v1 where n1=100;

At this time, PID 3048 is waiting for a lock. We will TRUSS PID 3048, to explore this concept further.

-- In another UNIX window
$ truss -d -E -p 3048
...
11.4971  0.0000 semtimedop(7, 0xFFFFFD7FFFDF69E8, 1, 0xFFFFFD7FFFDF6A00) Err#11 EAGAIN
11.9974  0.0000 semtimedop(7, 0xFFFFFD7FFFDF69E8, 1, 0xFFFFFD7FFFDF6A00) Err#11 EAGAIN
12.4979  0.0000 semtimedop(7, 0xFFFFFD7FFFDF69E8, 1, 0xFFFFFD7FFFDF6A00) Err#11 EAGAIN
12.9982  0.0000 semtimedop(7, 0xFFFFFD7FFFDF69E8, 1, 0xFFFFFD7FFFDF6A00) Err#11 EAGAIN
12.9984  0.0000 times(0xFFFFFD7FFFDF6F50)                       = 613374
12.9985  0.0000 times(0xFFFFFD7FFFDF6F50)                       = 613374
13.4987  0.0000 semtimedop(7, 0xFFFFFD7FFFDF69E8, 1, 0xFFFFFD7FFFDF6A00) Err#11 EAGAIN
13.9992  0.0000 semtimedop(7, 0xFFFFFD7FFFDF69E8, 1, 0xFFFFFD7FFFDF6A00) Err#11 EAGAIN
14.4995  0.0000 semtimedop(7, 0xFFFFFD7FFFDF69E8, 1, 0xFFFFFD7FFFDF6A00) Err#11 EAGAIN
15.0008  0.0000 semtimedop(7, 0xFFFFFD7FFFDF69E8, 1, 0xFFFFFD7FFFDF6A00) Err#11 EAGAIN
15.0010  0.0000 times(0xFFFFFD7FFFDF6F50)                       = 613575
15.0010  0.0000 times(0xFFFFFD7FFFDF6F50)                       = 613575
15.5012  0.0000 semtimedop(7, 0xFFFFFD7FFFDF69E8, 1, 0xFFFFFD7FFFDF6A00) Err#11 EAGAIN
16.0016  0.0000 semtimedop(7, 0xFFFFFD7FFFDF69E8, 1, 0xFFFFFD7FFFDF6A00) Err#11 EAGAIN
16.5023  0.0000 semtimedop(7, 0xFFFFFD7FFFDF69E8, 1, 0xFFFFFD7FFFDF6A00) Err#11 EAGAIN
17.0030  0.0000 semtimedop(7, 0xFFFFFD7FFFDF69E8, 1, 0xFFFFFD7FFFDF6A00) Err#11 EAGAIN
17.0032  0.0000 times(0xFFFFFD7FFFDF6F50)                       = 613775
17.0032  0.0000 times(0xFFFFFD7FFFDF6F50)                       = 613775
17.2466  0.0000 semtimedop(7, 0xFFFFFD7FFFDF69E8, 1, 0xFFFFFD7FFFDF6A00) = 0

Under the cover, PID 3048 is waiting on a semaphore (its own semaphore) in a tight loop with a 0.5 seconds timer expiry. So, essentially, process is sleeping on the semaphore with a 0.5 seconds timer, wakesup after 0.5 seconds of timer expiry, and sleeps again in a loop. At this time, From session #1, I rolled back the changes with a rollback command (commit also will behave exactly the same). Process associated with session #1, identified next waiter in the waiting queue, woke up the waiting process by modifying the semaphore of the waiting process, at 17.2466 time offset. You can see that sleep time for the last semtimedop call is just 0.24 seconds and the semtimedop call returned with a return code of zero. If the timer expired, then EAGAIN is returned as a return code, if not, then 0 is returned. Remember that EAGAIN is not an error and just a timer expiry.

But, in the case of enqueue contention, even though semaphore based sleeps were used, time is accounted towards the wait event ‘enq: TX – row lock contention’. As you see, semaphore based sleeps can be used for waits for many wait evens.

nam='enq: TX - row lock contention' ela= 13930545 name|mode=1415053318 usn<<16 

mypid.sql:

select spid from v$process where
  addr = (select  paddr from v$session where sid=(select sid from v$mystat where rownum =1))
;

Summary

In Summary, this wait event can be ignored. In rare cases, this used to be a platform bug in 7.0 database version, semtimedop will not return even if another process modifies the semaphore. So, generally, this event is an idle event and as such, should be ignored.

7 Responses to “What is ‘rdbms ipc message’ wait event?”

  1. srikumar83 said

    Good one !

  2. Baskar.l said

    Very Nice!

  3. Great post, as always .
    Could You elaborate about ‘reliable message’ which looks similiar but there are a lot of bugs related to this wait .
    Most annoying I’ve found was ‘DBMS_SERVER_ALERT.SET_THRESHOLD HANGS FOREVER AT RELIABLE MESSAGE [ID 794589.1]’.
    Proces waiting on such wait even after being killed with -9 signal is still present in Oracle, and PMON is unable to deal with that .
    I had to restart whole rac to make it stable .
    Looks like PMON is unable to clear such hanging processes .
    Regards
    GregG

  4. chris stephens said

    Great post! I’m very much looking forward to your Hotsos presentation.

  5. Walter said

    you da man!l

  6. Thank you for detailed explanation :) Great post.

  7. miquel said

    Thanks Riyaj, I needed an explanation of the “rdbms ipc message” and this one is great!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
Follow

Get every new post delivered to your Inbox.

Join 215 other followers

%d bloggers like this: