Oracle database internals by Riyaj

Discussions about Oracle performance tuning, RAC, Oracle internal & E-business suite.

Posts Tagged ‘dtruss’

Log file synch tuning #2

Posted by Riyaj Shamsudeen on November 24, 2008

After reading my earlier blog about log file sync tuning a reader sent an interesting issue, worth blogging about it.


Excessive wait time for ‘log file sync’ event while wait time for ‘log file parallel write’ is minimal. See statspack top events below


Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                     % Total
Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time                                                       18,435    31.06
PL/SQL lock timer                                   4,370      12,137    20.45
log file sync                                      57,939       8,565    14.43  <-----
db file sequential read                         1,303,477       7,030    11.85
db file scattered read                            505,147       3,992     6.73

                                                     Total Wait   wait    Waits
Event                               Waits   Timeouts   Time (s)   (ms)     /txn
---------------------------- ------------ ---------- ---------- ------ --------
log file sync                      57,939      6,571      8,565    148      0.5
log file parallel write           215,112          0      1,138      5      1.9

Average wait time for ‘log file sync’ event is 148ms and it is 5ms for ‘log file parallel write’ event. There is an oddity here. Wait time for ‘log file sync’ event seems bit excessive compared to ‘log file parallel write’ event. Ratio between number of waits for ‘log file sync’ and ‘log file parallel write’ is approximately 4. In an average case scenario, if each commit resulted in 4 log writes, then I would expect an Average wait time of 20ms for ‘log file sync’ event. Wait time of 148ms? Needs further analysis.

pstack, pmap and more

Database is running in a Solaris 10 server and database version Since this is a Solaris operating environment, we can use proc utilities to debug this further. We used a script to dump pstack and pmap of LGWR in a loop. This wasn’t much help, but we could guess that LGWR was waiting for I/O since there were many lines in the stack with kaio and aio wait calls. Few lines from pstack output printed below.

6924:	ora_lgwr_ERPPRD2
-----------------  lwp# 1 / thread# 1  --------------------
 ffffffff7ddd4838 semsys   (4, 3, ffffffff7fffd5cc, 1, ffffffff7fffd5b8)
 00000001005b2694 ksliwat (0, 3c0, 77e8a7790, 1037c0b30, 3, 0) + 274
-----------------  lwp# 2 / thread# 2  --------------------
 ffffffff7ddd3a34 kaio     (6, 0, ffffffff7def7f80, 10, ffffffff7def9c18, ffffffff7f700a00)
-----------------  lwp# 3 / thread# 3  --------------------
 ffffffff7ddd2f1c lwp_park (0, 0, 0)
 ffffffff7ddcc748 cond_wait_queue (1038e5a70, 1038e5a80, 0, 0, ffffffff7def9d08, 0) + 28
 ffffffff7ddccca8 cond_wait (1038e5a70, 1038e5a80, 0, 0, 1c00, 0) + 10
 ffffffff7db03a60 _aio_idle (1038e5a00, 1, 1038e5a80, 0, ffffffff7def9c18, 0) + 28
 ffffffff7db034fc _aio_send_sigev (1038e5a00, 0, 104b60, ffffffff7ddd2cc8, ffffffff7db03498, ffffffff7dc08000) + 64


I/O statistics did not provide much clarity either. We need to find if LGWR is suffering from a performance issue. To see if LGWR is suffering from any OS related issues, we need to trace system calls from LGWR and Truss utility provides such a facility. Suffering from truss related paranoia, we didn’t want to run truss against LGWR since that can affect performance, more importantly database stability.

Fortunately, dtruss came handy. dtruss is based upon dtrace utility and by design, dtrace is safe in Solaris Sparc platform.

Keep Reading

Posted in Oracle database internals, Performance tuning | Tagged: , , , , , | 14 Comments »