Oracle database internals by Riyaj

Discussions about Oracle performance tuning, RAC, Oracle internals & E-Business Suite.

Log file synch tuning #2

Posted by Riyaj Shamsudeen on November 24, 2008

After reading my earlier blog entry about log file sync tuning, a reader sent me an interesting issue that is worth blogging about.

Problem

Excessive wait time for the ‘log file sync’ event while the wait time for ‘log file parallel write’ is minimal. See the statspack top events below:

 

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                     % Total
Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time                                                       18,435    31.06
PL/SQL lock timer                                   4,370      12,137    20.45
log file sync                                      57,939       8,565    14.43  <-----
db file sequential read                         1,303,477       7,030    11.85
db file scattered read                            505,147       3,992     6.73

                                                                   Avg
                                                     Total Wait   wait    Waits
Event                               Waits   Timeouts   Time (s)   (ms)     /txn
---------------------------- ------------ ---------- ---------- ------ --------
...
log file sync                      57,939      6,571      8,565    148      0.5
...
log file parallel write           215,112          0      1,138      5      1.9
...

Average wait time for the ‘log file sync’ event is 148ms, while it is only 5ms for the ‘log file parallel write’ event. There is an oddity here: the wait time for ‘log file sync’ seems a bit excessive compared to ‘log file parallel write’. For every ‘log file sync’ wait there are approximately four ‘log file parallel write’ waits. So, in an average case scenario, if each commit resulted in 4 log writes, I would expect an average wait time of around 20ms for ‘log file sync’. A wait time of 148ms? That needs further analysis.
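As a side note, the same comparison can be made outside a statspack window straight from v$system_event. A minimal sqlplus sketch follows (run as a DBA user; TIME_WAITED is reported in centiseconds in this release, hence the factor of 10 to get milliseconds):

sqlplus -s "/ as sysdba" <<'EOF'
-- Compare the two events directly; avg_ms = centiseconds * 10 / waits
column event format a30
select event, total_waits,
       round(time_waited * 10 / total_waits, 1) avg_ms
from   v$system_event
where  event in ('log file sync', 'log file parallel write');
EOF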

pstack, pmap and more

The database is version 9.2.0.8, running on a Solaris 10 server. Since this is a Solaris operating environment, we can use the proc utilities to debug this further. We used a script to dump the pstack and pmap of LGWR in a loop (see the sketch after the stack output below). This wasn’t much help on its own, but we could guess that LGWR was waiting for I/O, since there were many lines in the stack with kaio and aio wait calls. A few lines from the pstack output are printed below:

6924:	ora_lgwr_ERPPRD2
-----------------  lwp# 1 / thread# 1  --------------------
 ffffffff7ddd4838 semsys   (4, 3, ffffffff7fffd5cc, 1, ffffffff7fffd5b8)
 00000001005b2694 ksliwat (0, 3c0, 77e8a7790, 1037c0b30, 3, 0) + 274
..
-----------------  lwp# 2 / thread# 2  --------------------
 ffffffff7ddd3a34 kaio     (6, 0, ffffffff7def7f80, 10, ffffffff7def9c18, ffffffff7f700a00)
-----------------  lwp# 3 / thread# 3  --------------------
 ffffffff7ddd2f1c lwp_park (0, 0, 0)
 ffffffff7ddcc748 cond_wait_queue (1038e5a70, 1038e5a80, 0, 0, ffffffff7def9d08, 0) + 28
 ffffffff7ddccca8 cond_wait (1038e5a70, 1038e5a80, 0, 0, 1c00, 0) + 10
 ffffffff7db03a60 _aio_idle (1038e5a00, 1, 1038e5a80, 0, ffffffff7def9c18, 0) + 28
 ffffffff7db034fc _aio_send_sigev (1038e5a00, 0, 104b60, ffffffff7ddd2cc8, ffffffff7db03498, ffffffff7dc08000) + 64
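For reference, the wrapper we used was essentially a small shell loop along these lines (a sketch only; the PID argument, interval and output file are illustrative):

#!/bin/ksh
# Snapshot the LGWR call stack and address map every few seconds.
# Pass the PID of ora_lgwr_<SID> as the first argument.
LGWR_PID=$1
while true
do
        date
        pstack  $LGWR_PID
        pmap -x $LGWR_PID
        sleep 5
done >> /tmp/lgwr_snap_$LGWR_PID.log 2>&1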

dtruss

I/O statistics did not provide much clarity either. To see whether LGWR is suffering from any OS-related issue, we need to trace its system calls, and the truss utility provides such a facility. But, suffering from truss-related paranoia, we didn’t want to run truss against LGWR, since that can affect performance and, more importantly, database stability.

Fortunately, dtruss came in handy. dtruss is based upon the dtrace utility and, by design, dtrace is safe on the Solaris SPARC platform.

Test setup

As a test setup:

  1. We connected to the database locally and identified the foreground process associated with our session.
  2. We started dtruss on LGWR and the foreground process concurrently.
  3. We also turned on sqltrace on this foreground process.

Then, we generated a couple of MB of redo from our session, followed by a commit.
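In terms of commands, the setup looked roughly like this (a sketch only; the PIDs are placeholders and the dtruss flags may vary slightly between dtruss versions):

# 1. Identify the processes: LGWR from the process list, the foreground
#    (shadow) process from v$process.spid for our session.
pgrep -lf ora_lgwr_ERPPRD2

# 2. Start dtruss on LGWR and on the foreground process, with relative (-d)
#    and elapsed (-e) timestamps, capturing the output to files.
dtruss -d -e -p <lgwr_pid> > /tmp/dtruss_lgwr.out 2>&1 &
dtruss -d -e -p <fg_pid>   > /tmp/dtruss_fg.out   2>&1 &

# 3. In the foreground session, turn on sqltrace (event 10046, level 8),
#    then generate a couple of MB of redo and commit:
#      alter session set events '10046 trace name context forever, level 8';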

A few lines from the dtruss output for the foreground process are printed below:

...

 13934/1:     27767      10 write(0x5, "\n", 0x1)		 = 1 0
 13934/1:     27803       8 semsys(0x0, 0x3, 0x9)		 = 0 0    .........(1)
 13934/1:     27996  376193 semsys(0x4, 0x3, 0xFFFFFFFF7FFFAD0C) = 0 0 .........(2)
 13934/1:     28105      55 write(0x5, "WAIT #1: nam='log file sync' ela= 367609 p1=5354 p2=0 p3=0", 0x3A)		 = 58 0 .........(3)
...

From the output above:

  1. Line #1: The foreground process posted LGWR using a semsys call.
  2. Line #2: The user process then went to sleep for 376.1 ms on its own semaphore, waiting for the post from LGWR (i.e. the ‘log file sync’ event).
  3. Line #3: The user process wrote the event wait timing into the sqltrace trace file.

dtruss output of LGWR

Let’s look at the dtruss output of LGWR. A few lines from the dtruss output for the LGWR process are printed below:

PID/LWP    RELATIVE  ELAPSD SYSCALL(args) 		 = return
...
  6924/5:    240044  230587 pwrite(0x103, "", 0x100000)		 = 1048576 0 ..........(1)
  6924/5:    240091      39 kaio(0x4, 0x0, 0x1038E9DD0)		 = 0 0
  6924/1:   2447794  232035 kaio(0x2, 0xFFFFFFFF7FFFD400, 0x0)		 = 1 0
  6924/7:    226276  354665 pwrite(0x103, "", 0x6DC00)		 = 449536 0
  6924/7:    226307      26 kaio(0x4, 0x0, 0x1038EB1F0)		 = 0 0
  6924/1:   2447839  123214 kaio(0x2, 0xFFFFFFFF7FFFD400, 0x0)		 = 1 0
  6924/1:   2457909      60 semsys(0x2, 0x3, 0xFFFFFFFF7FFFCEA4)		 = 0 0
  6924/1:   2468590      48 lwp_park(0x1, 0x9, 0xFFFFFFFF7F700200)		 = 0 0
  6924/1:   2468606      12 lwp_park(0x1, 0xB, 0xFFFFFFFF7F700200)		 = 0 0
  6924/1:   2468632      10 kaio(0x2, 0xFFFFFFFFFFFFFFFF, 0x1)		 = -1 Err#22
  6924/9:    249884  226754 pwrite(0x103, "", 0x100000)		 = 1048576 0
  6924/1:   2468669  228315 kaio(0x2, 0xFFFFFFFF7FFFD400, 0x0)		 = 1 0
  6924/9:    249923      25 kaio(0x4, 0x0, 0x1038EB1F0)		 = 0 0
  6924/1:   2468720  124856 kaio(0x2, 0xFFFFFFFF7FFFD400, 0x0)		 = 1 0
  6924/11:    217949  352556 pwrite(0x103, "", 0x7FE00)		 = 523776 0
  6924/11:    217977      23 kaio(0x4, 0x0, 0x1038E9DD0)		 = 0 0

The output above [Line #1] shows that LGWR wrote 1MB of redo using a pwrite call, and it took approximately 230ms for that write call to complete. So LGWR really did take 230ms for log file writes. But the ‘log file parallel write’ average is only 5ms. Is the average misleading, hiding longer write times?
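One quick way to answer that from the dtruss capture itself is to bucket the elapsed times of the pwrite calls. A rough awk sketch (field positions assume the dtruss layout shown above, with the elapsed time in microseconds in the third column):

# Histogram of LGWR pwrite elapsed times, in 10ms buckets.
awk '/pwrite/ { b = int($3 / 10000) * 10; count[b]++ }
     END      { for (b in count) printf "%5d-%5d ms : %d\n", b, b + 10, count[b] }' \
    /tmp/dtruss_lgwr.out | sort -n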

Searching for ‘log file parallel write’ in Google, I hit a Jonathan Lewis blog entry. Of course, this is a bug, and Jonathan has documented it much better. In summary, the wait time for ‘log file parallel write’ is not reported correctly.

Back to our issue: simply put, it is an I/O problem that needs further review. It is unfortunate that the ‘log file parallel write’ wait time was confusing the reader. The reader is planning to work with the storage group to review and improve I/O performance.
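For anyone facing a similar situation, a dtrace one-liner along the following lines can print a latency histogram of LGWR’s write calls directly, without the concerns that come with truss (a sketch only; depending on the platform and settings the writes may show up as pwrite, pwrite64 or kaio, so the probe names may need adjusting):

# Quantize the latency of LGWR pwrite calls, in microseconds.
# Attach with -p so that $target is the LGWR PID; stop with Ctrl-C to print.
dtrace -p <lgwr_pid> -n '
syscall::pwrite*:entry  /pid == $target/ { self->ts = timestamp; }
syscall::pwrite*:return /self->ts/
{
        @["pwrite latency (us)"] = quantize((timestamp - self->ts) / 1000);
        self->ts = 0;
}'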

12 Responses to “Log file synch tuning #2”

  1. [...] of skill of the people doing the trouble – and I see that Riyaj Shamsudeen recently published a note about the same problem. Approaching it from a completely different angle, using exactly the right tool for the task to [...]

  2. Hans-Peter Sloot said

    Hi Riyaj,

    We had the filesystems remounted with forcedirectio.
    The result was that the average log file sync dropped from 70ms to about 7ms.
    I dropped a post at opensolaris.org and was told by Dave Miller that there is a bug, fixed in 10.2.0.4, which causes directio not to be used.

    Just to let you know.

    regards Hans-Peter

  3. Hi Hans-Peter
    Thanks for providing more information here.
    I saw your excellent dtrace script and the discussion in that OpenSolaris thread. Excellent work. As visible in the thread, Hans-Peter, Jim Mauro and others were able to pinpoint that the problem is really due to the ufs single writer lock and a read-modify-write issue.
    Further, David Miller identified this as bug 5752399, and there is a mandatory patch for version 10.2.0.3 for Veritas or Solstice DiskSuite. One of the lines in the bug text clearly indicates:
    “You may also observe that direct I/O is not working, despite having the parameter filesystemio_options set to an appropriate value in your init.ora or spfile.ora file”

    Essentially, file systems need to be (re)mounted with forcedirectio for direct I/O to work correctly.
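    For example, something along these lines for a UFS file system (the device and mount point are illustrative):

    # remount a UFS file system with forcedirectio (illustrative paths)
    mount -F ufs -o remount,forcedirectio /dev/md/dsk/d60 /u01/oradata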

    Interesting issue.

    Cheers
    Riyaj

  4. Darshan said

    Hi Riyaj,

    I have read a couple of your articles. All are well composed, with lots of good information.

    I have a similar issue on one of the DBs on Solaris 10, Oracle 10.2.0.3.

    From ASH details I see that ‘log file sync’ time sometimes approaches more than a second, even though there is not much load on the system, nor lots of redo generation. At the same time I do not see “log file parallel write” taking much time. I do not want to try kernel tracing tools like truss or dtrace on a very critical production DB. Are there any other ways to go further in debugging? Here is the ASH for that spike, with LGWR and other session waits.

    SAMPLE_TIME                SID,SERIAL#   SEQ#   EVENT                     P1   P2   P3   WAIT_TIME   TIME_WAITED (microsec)
    29-JAN-10 08.01.27.685 AM 4401,1 44462 log file parallel write 2 6 2 0 767
    29-JAN-10 08.01.29.705 AM 4401,1 44904 0 0 0 1104 0
    29-JAN-10 08.01.39.855 AM 4401,1 47010 0 0 0 1250 0
    29-JAN-10 08.01.41.875 AM 4401,1 47457 0 0 0 775 0
    29-JAN-10 08.01.43.905 AM 1297,356 52089 log file sync 5116 0 0 0 983094
    29-JAN-10 08.01.43.905 AM 1301,551 23265 log file sync 5116 0 0 0 977351
    29-JAN-10 08.01.43.905 AM 1327,60 16461 log file sync 5116 0 0 0 985489
    29-JAN-10 08.01.43.905 AM 1329,1945 34813 log file sync 5116 0 0 0 979476
    29-JAN-10 08.01.43.905 AM 1330,4 32679 log file sync 5116 0 0 0 979009
    29-JAN-10 08.01.43.905 AM 1345,314 23435 log file sync 5116 0 0 0 982434
    29-JAN-10 08.01.43.905 AM 1357,3 59819 log file sync 5116 0 0 0 986190
    29-JAN-10 08.01.43.905 AM 1369,114 58771 log file sync 5116 0 0 0 982554
    29-JAN-10 08.01.43.905 AM 1377,199 42979 log file sync 5116 0 0 0 983632
    29-JAN-10 08.01.43.905 AM 1437,20533 17258 log file sync 5116 0 0 0 982394
    29-JAN-10 08.01.43.905 AM 1468,54740 44957 log file sync 5116 0 0 0 985416
    29-JAN-10 08.01.43.905 AM 1881,30949 51016 log file sync 5116 0 0 0 985110
    29-JAN-10 08.01.43.905 AM 2105,63152 30301 log file sync 5116 0 0 0 979693
    29-JAN-10 08.01.43.905 AM 2136,54766 60438 log file sync 5116 0 0 0 977066
    29-JAN-10 08.01.43.905 AM 3500,298 34960 log file sync 5116 0 0 0 979268
    29-JAN-10 08.01.43.905 AM 3514,2490 12034 log file sync 5109 0 0 0 986379
    29-JAN-10 08.01.43.905 AM 3534,1 2637 log file sync 5116 0 0 0 978855
    29-JAN-10 08.01.43.905 AM 3591,796 32231 log file sync 5116 0 0 0 986208
    29-JAN-10 08.01.43.905 AM 3597,296 19271 log file sync 5116 0 0 0 977616
    29-JAN-10 08.01.43.905 AM 3894,35906 21366 log file sync 5116 0 0 0 981770
    29-JAN-10 08.01.43.905 AM 4401,1 47716 LGWR wait for redo copy 9 0 0 0 19655
    29-JAN-10 08.01.45.545 AM 1297,356 52089 log file sync 5116 0 0 0 986282
    29-JAN-10 08.01.45.545 AM 1301,551 23265 log file sync 5116 0 0 0 943796
    29-JAN-10 08.01.45.545 AM 1324,15 65205 log file sync 5116 0 0 0 977774
    29-JAN-10 08.01.45.545 AM 1327,60 16461 log file sync 5116 0 0 0 777708
    29-JAN-10 08.01.45.545 AM 1329,1945 34813 log file sync 5116 0 0 0 914354
    29-JAN-10 08.01.45.545 AM 1330,4 32679 log file sync 5116 0 0 0 787733
    29-JAN-10 08.01.45.545 AM 1345,314 23435 log file sync 5116 0 0 0 357618
    29-JAN-10 08.01.45.545 AM 1357,3 59819 log file sync 5116 0 0 0 986153
    29-JAN-10 08.01.45.545 AM 1359,101 7231 log file sync 5116 0 0 0 980302
    29-JAN-10 08.01.45.545 AM 1369,114 58771 log file sync 5116 0 0 0 337973
    29-JAN-10 08.01.45.545 AM 1377,199 42979 log file sync 5116 0 0 0 986484
    29-JAN-10 08.01.45.545 AM 1410,405 31493 log file sync 5116 0 0 0 985918
    29-JAN-10 08.01.45.545 AM 1437,20533 17258 log file sync 5116 0 0 0 533481
    29-JAN-10 08.01.45.545 AM 1468,54740 44957 log file sync 5116 0 0 0 494599
    29-JAN-10 08.01.45.545 AM 1878,971 3871 log file sync 5116 0 0 0 866648
    29-JAN-10 08.01.45.545 AM 1881,30949 51016 log file sync 5116 0 0 0 866072
    29-JAN-10 08.01.45.545 AM 2105,63152 30301 log file sync 5116 0 0 0 581489
    29-JAN-10 08.01.45.545 AM 2136,54766 60438 log file sync 5116 0 0 0 986590
    29-JAN-10 08.01.45.545 AM 3260,11235 24180 log file sync 5116 0 0 0 619846
    29-JAN-10 08.01.45.545 AM 3495,632 38628 log file sync 5116 0 0 0 442214
    29-JAN-10 08.01.45.545 AM 3500,298 34960 log file sync 5116 0 0 0 986723
    29-JAN-10 08.01.45.545 AM 3501,34 63119 log file sync 5116 0 0 0 480076
    29-JAN-10 08.01.45.545 AM 3502,218 42923 log file sync 5116 0 0 0 978969
    29-JAN-10 08.01.45.545 AM 3514,2490 12034 log file sync 5109 0 0 0 442955
    29-JAN-10 08.01.45.545 AM 3534,1 2637 log file sync 5116 0 0 0 318910
    29-JAN-10 08.01.45.545 AM 3556,69 2417 log file sync 5116 0 0 0 592775
    29-JAN-10 08.01.45.545 AM 3591,796 32231 log file sync 5116 0 0 0 406718
    29-JAN-10 08.01.45.545 AM 3596,444 26541 log file sync 5116 0 0 0 651510
    29-JAN-10 08.01.45.545 AM 3597,296 19271 log file sync 5116 0 0 0 495818
    29-JAN-10 08.01.45.545 AM 3602,1017 42070 log file sync 5116 0 0 0 771974
    29-JAN-10 08.01.45.545 AM 3622,232 51276 log file sync 5116 0 0 0 872151
    29-JAN-10 08.01.45.545 AM 3813,61056 37110 log file sync 5116 0 0 0 601072
    29-JAN-10 08.01.45.545 AM 3894,35906 21366 log file sync 5116 0 0 0 826675
    29-JAN-10 08.01.45.545 AM 4401,1 47716 LGWR wait for redo copy 9 0 0 0 19473
    29-JAN-10 08.01.48.575 AM 4401,1 48286 log file parallel write 2 2 2 0 772
    29-JAN-10 08.01.57.725 AM 4401,1 50746 0 0 0 592 0
    29-JAN-10 08.01.58.735 AM 4401,1 51050 0 0 0 472 0
    29-JAN-10 08.02.02.775 AM 4401,1 51917 log file parallel write 2 10 2 0 1002

  5. Darshan
    Sorry for the late reply. It looks like replying to the automatic email from WordPress does not show up in the comments!

    Have you checked for bugs with ‘log file sync’ event waits in 10.2.0.3? There are a couple of bugs, such as bug 8842553, and it is quite possible that you are hitting one of them.
    Also, is LGWR getting enough CPU? Is the file system properly mounted with direct I/O? There are many things that can go wrong, so it may be essential to dtrace LGWR to see why it is not performing. What is LGWR waiting on, as per v$active_session_history?
    dtrace is, by design, safe, so there is no downside to it. Let me know if you want a script to dtrace LGWR writes.
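    For the ASH question, a quick check along these lines (a rough sketch; it picks up LGWR’s SID from v$session) shows what LGWR itself has been waiting on:

    # What has LGWR itself been waiting on, per ASH? (rough sketch)
    echo "select sample_time, event, time_waited
          from v\$active_session_history
          where session_id = (select sid from v\$session where program like '%LGWR%')
          order by sample_time;" | sqlplus -s "/ as sysdba"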
    Cheers
    Riyaj

  6. Darshan said

    Riyaj,

    Thanks for your response. We do not have a CPU issue; utilization is only around 10%.
    We are using VxFS with Veritas Volume Manager on NetApp. I provided the v$active_session_history waits in my first comment, where we do not see it spending much time on I/O (“log file parallel write”).

    Could you please provide the dtrace script for LGWR? That should help figure out where it is spending time.

  7. Darshan
    Try this script:

    http://www.orainternals.com/scripts_perf1.php

    Look for “dtrace script to trace Oracle processes”.

    http://www.orainternals.com/viewfile.php?id=46

    Beware that, in kernel land, functions are not well documented and can be misleading if you don’t understand them properly. Many function calls are overloaded.

    Cheers
    Riyaj

  8. Darshan said

    Riyaj,

    Thanks for the fast response. I will try it and let you know what we see.

    Regards,
    Darshan

  9. [...] Riyaj Shamsudeen – Log file sync tuning [...]

  10. qihua said

    For your following dtruss output
    6924/5: 240044 230587 pwrite(0x103, "", 0x100000) = 1048576 0 ..........(1)
    6924/5: 240091 39 kaio(0x4, 0x0, 0x1038E9DD0) = 0 0
    6924/1: 2447794 232035 kaio(0x2, 0xFFFFFFFF7FFFD400, 0x0) = 1 0

    We can’t see whether the pwrite and kaio are on the same file or not. When I trussed LGWR on my database, I found that pwrite was for the controlfile, and kaio was for the online redo log.

    Can you also add the file descriptor information to pwrite and kaio so it is clear which file was written?

  11. qihua said

    I also have a question about the redo write.

    The “redo writes” statistic is the number of times LGWR writes the data, right? For our database, the number of writes at the OS level (vxstat) is about 6 times the “redo writes”. We didn’t multiplex the redo, and the redo file is a Quick I/O file. The redo volume is separate from the other data volumes, so the writes vxstat reports on the redo volume should be equal to the write I/O from LGWR. I can’t understand why one “redo write” was broken into 6 OS I/Os.

  12. [...] Riyaj Shamsudeen – Log file sync tuning [...]
