Oracle database internals by Riyaj

Discussions about Oracle performance tuning, RAC, Oracle internal & E-business suite.


Tuning ‘log file sync’ wait events

Posted by Riyaj Shamsudeen on July 7, 2008

In this blog entry, we will discuss strategies and techniques to resolve ‘log file sync’ waits. This entry is intended to show an approach based upon scientific principles, not necessarily a step-by-step guide. Let’s first understand how LGWR is involved in implementing the commit mechanism.

Commit mechanism and LGWR internals

At commit time, the process creates a redo record [containing commit opcodes] and copies that redo record into the log buffer. Then that process signals LGWR to write the contents of the log buffer. LGWR writes from the log buffer to the log file and signals the user process back, completing the commit. A commit is considered successful only after the LGWR write is successful.

Of course, there are minor deviations from this general concept, such as latching, commits from PL/SQL blocks, or IMU-based commit generation. But the general philosophy remains the same.
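The handshake described above can be sketched with two threads and two semaphores. This is only a toy model under stated assumptions: the names `lgwr`, `commit`, `log_buffer` and `log_file` are hypothetical, Python lists stand in for the log buffer and the redo log file, and only a single committing session is modelled. It is not how Oracle implements any of this; it just shows the ordering of copy, post, write and signal back.

```python
import threading

log_buffer = []          # in-memory redo, not yet durable
log_file = []            # stands in for the online redo log file
buffer_lock = threading.Lock()

lgwr_sem = threading.Semaphore(0)   # the log writer sleeps on this
user_sem = threading.Semaphore(0)   # the committing session sleeps on this
stop = threading.Event()


def lgwr():
    """Hypothetical log writer loop: sleep, flush the buffer, post the waiter."""
    while not stop.is_set():
        # Sleep until posted; wake on timeout only to re-check the stop flag.
        if not lgwr_sem.acquire(timeout=0.1):
            continue
        with buffer_lock:
            pending = log_buffer[:]
            log_buffer.clear()
        log_file.extend(pending)     # the "write" to the log file
        user_sem.release()           # signal the committing session


def commit(txn_id):
    """Hypothetical session-side commit: copy redo, post LGWR, wait."""
    with buffer_lock:
        log_buffer.append(("commit", txn_id))
    lgwr_sem.release()               # post LGWR
    user_sem.acquire()               # sleep until LGWR signals back
    return ("commit", txn_id) in log_file


writer = threading.Thread(target=lgwr, daemon=True)
writer.start()
durable = commit(42)
stop.set()
writer.join()
print("commit durable:", durable)
```

The point of the sketch is that `commit` cannot return before the writer thread has moved the commit record into `log_file`, which is the property the paragraph above describes.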

Signals, semaphore and LGWR

The following section introduces the internal workings of the commit and LGWR interaction on Unix platforms. There are minor implementation differences between a few Unix flavors and platforms like NT/XP, such as the use of post-wait drivers instead of semaphores. This section is meant to introduce the internals, not necessarily to dive deep into them. Truss is used to trace LGWR and a user process for the explanation here.

truss command used: truss -rall -wall -fall -vall -d -o /tmp/truss.log -p 22459

[ Word of caution: don't truss LGWR or any background process unless it is absolutely necessary. You can accidentally cause performance issues or, worse yet, shut down the database. ]

1. Initially, LGWR is sleeping on its semaphore using a semtimedop or semop call.

22459/1: semtimedop(9, 0xFFFFFD7FFFDFE648, 1, 0xFFFFFD7FFFDFE488) EAGAIN
22459/1:         semnum=15    semop=-1    semflg=0
22459/1:         timeout: 2.060000000 sec
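    In the above call,
     the first argument, 9, is the semaphore set id visible through the ipcs command, and semnum=15 is the semaphore for the LGWR process in that set.
     the second argument is a pointer to an array of sembuf structures:
	struct sembuf {
	  unsigned short sem_num;  /* semaphore number */
	  short          sem_op;   /* semaphore operation */
	  short          sem_flg;  /* operation flags */
	};
     the third argument is the number of sembuf structures in that array.
     the fourth argument is the timeout, 2.06 seconds here.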

2. When a session commits, a redo record is created and copied into the log buffer. Then that process posts the LGWR semaphore using a semctl call, if LGWR is not already active. The process then goes to sleep with a semtimedop call on its own semaphore. The semaphore set id is again 9, but semnum is 118, which is the semaphore for the user process I was tracing.

The first semctl call posts LGWR. The process then sleeps on the semtimedop call.

27396/1: semctl(9, 15, SETVAL, 1)        = 0
27396/1: semtimedop(9, 0xFFFFFD7FFFDFC128, 1, 0xFFFFFD7FFFDFBF68) = 0
27396/1:        semnum=118   semop=-1    semflg=0
27396/1:        timeout: 1.000000000 sec

3. The waiting log writer gets a 0 return code from semtimedop and writes redo records to the current redo log file. The kaio calls are kernelized asynchronous I/O calls on the Solaris platform.

22459/7: pwrite(262, "01 "9E0E i ?".., 1024, 1915904) = 1024
22459/9: pwrite(263, "01 "9E0E i ?".., 1024, 1915904) = 1024
22459/1: kaio(AIOWAIT, 0xFFFFFD7FFFDFE310)               = 1
22459/1:         timeout: 600.000000 sec
22459/9: kaio(AIONOTIFY, 0)                         = 0
22459/7: kaio(AIONOTIFY, 0)                         = 0

4. After successful completion of the write(s), LGWR posts the semaphore of the waiting process using a semctl call.

22459/1: semctl(9, 118, SETVAL, 1)                   = 0

5. The user process/session continues after receiving a 0 return code from the semtimedop call shown in step 2.
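When reading a longer truss log, it can help to pull out just the posting calls to see who is waking whom. Here is a minimal sketch that parses the two semctl lines from the trace above; the regular expression and the `parse_post` helper are my own illustration and assume the simple `pid/lwp: syscall(args) = ret` layout seen in this output, which may not cover every line truss can emit.

```python
import re

# Lines copied from the trace output above.
TRACE = [
    "27396/1: semctl(9, 15, SETVAL, 1)        = 0",
    "22459/1: semctl(9, 118, SETVAL, 1)                   = 0",
]

# Assumed layout: pid/lwp: syscall(args) = return value
LINE_RE = re.compile(r"^(\d+)/(\d+):\s*(\w+)\((.*)\)\s*=\s*(-?\d+)$")


def parse_post(line):
    """Return (pid, semid, semnum) for a semctl SETVAL line, else None."""
    m = LINE_RE.match(line.strip())
    if not m or m.group(3) != "semctl":
        return None
    args = [a.strip() for a in m.group(4).split(",")]
    if len(args) < 3 or args[2] != "SETVAL":
        return None
    return int(m.group(1)), int(args[0]), int(args[1])


posts = [parse_post(line) for line in TRACE]
for pid, semid, semnum in posts:
    print(f"process {pid} posted semaphore {semnum} in set {semid}")
```

Read against the steps above, the first tuple is the user process (27396) posting LGWR's semaphore 15, and the second is LGWR (22459) posting the user process's semaphore 118.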


So, what exactly is a ‘log file sync’ wait?

A commit is not complete until LGWR writes the log buffers, including the commit redo records, to the log files. In a nutshell, after posting LGWR to write, the user or background process waits for LGWR to signal back, with a 1-second timeout. The user process charges this wait time to the ‘log file sync’ event.

In the prior section, the ‘log file sync’ wait starts at step 2, after the semctl call, and completes after step 5.
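To make the accounting concrete, here is a small sketch of a timed wait that accumulates the elapsed time until the post arrives. It assumes the session simply goes back to sleep after each timeout, which is my simplification and not something traced above; `wait_for_post` is a hypothetical helper, and a timer thread stands in for LGWR finishing its write.

```python
import threading
import time


def wait_for_post(sem, timeout=1.0):
    """Sleep on `sem` until posted; return (seconds_waited, timeouts)."""
    start = time.monotonic()
    timeouts = 0
    # Each acquire mirrors one timed semaphore sleep; on timeout, sleep again.
    while not sem.acquire(timeout=timeout):
        timeouts += 1
    return time.monotonic() - start, timeouts


session_sem = threading.Semaphore(0)

# Hypothetical LGWR stand-in: "finishes the write" 0.25 s later and posts.
threading.Timer(0.25, session_sem.release).start()

# A short timeout is used here only to keep the demo fast.
waited, timeouts = wait_for_post(session_sem, timeout=0.1)
print(f"charged to wait: {waited:.3f}s over {timeouts} timeouts")
```

Everything between the post to the writer and the wake-up is charged to the waiter, so a slow write on the LGWR side shows up directly as a longer wait on the session side.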

Root causes of ‘log file sync’ waits

The root causes of ‘log file sync’ waits essentially boil down to a few scenarios. The following is not an exhaustive list, by any means!

1. LGWR is unable to complete writes fast enough for one of the following reasons: