Oracle database internals by Riyaj

Discussions about Oracle performance tuning, RAC, Oracle internal & E-business suite.

Posts Tagged ‘pstack’

Open World 2012 – My Sunday presentation on truss, pstack etc.

Posted by Riyaj Shamsudeen on August 18, 2012

Just a quick note, I will be presenting on “Truss, pstack, pmap, and more” talking about advanced UNIX utilities and how it can be utilized to understand inner working of an application or even Oracle Database Engine.

My timeslot is between 2:15 and 3:15 in Room 2016.

http://blogs.ioug.org/2012/08/15/ioug-at-oracle-openworld-2012-the-sunday-technical-sessions-9302012/

Uploading presentation files. Thanks for attending at OOW12.
pstack_truss_etc

Posted in Oracle database internals, Performance tuning | Tagged: , , , | 1 Comment »

My COLLABORATE 12-IOUG sessions

Posted by Riyaj Shamsudeen on April 19, 2012

If you are attending Collaborate 2012, you might be interested in my content-rich sessions below :

Session Number: 326
Session Title: SCAN, VIP, HAIP, and other RAC acronyms
Session Date/Time/Room: Tue, Apr 24, 2012 (10:45 AM – 11:45 AM) : Surf C

Session Number: 327
Session Title: Internals and Performance Boot Camp: Truss, pstack, pmap, and more
Session Date/Time/Room: Wed, Apr 25, 2012 (03:00 PM – 04:00 PM) : Palm A

Hope to see you there!

Update: I am uploading presentation files. Presentations are much more recent than the document :-)

pstack_truss_etc
2012_327_Riyaj_pstack_truss_doc
SCAN_VIP_HAIP_etc
2012_326_Riyaj_scan_vip_haip_doc

Thanks for attending!

Posted in Oracle database internals, Performance tuning, Presentations, RAC | Tagged: , , , , , , , , , , , , | 1 Comment »

My sessions in RMOUG 2012

Posted by Riyaj Shamsudeen on February 10, 2012

I will be leaving to Denver in few days to talk about the following presentations in RMOUG 2012. Stop by and say hello to me if you intend to attend RMOUG training days.

My sessions in RMOUG 2012 are

  1. Room 402:Session 2: Parallel Execution in RAC – Wednesday 10:45 AM to 11:45AM
  2. Room 4f: Session 10: Troubleshooting RAC background processes – Thursday 1:30PM to 2:30PM
  3. Room 4f: Session 11: A kind and Gentle introduction to RAC – Thursday 2:45 PM to 3:45 PM

Hope to see you there.

Posted in Performance tuning, Presentations, RAC | Tagged: , , , | 2 Comments »

RAC hack session – Tuesday – July 11 2011

Posted by Riyaj Shamsudeen on July 11, 2011

I will be conducting a 1-hour deep dive session about RAC LMS process (and about LGWR processes too if time permits) using advanced UNIX utilities. Read Tanel’s blog entry for details:
RAC hack session

See you there!

Posted in Performance tuning, Presentations, RAC | Tagged: , , , , , , , | 5 Comments »

Group by Hash aggregation

Posted by Riyaj Shamsudeen on September 30, 2010

So, Here I was merrily enjoying OpenWorld 2010 presentations in SFO, I got a call from a client about a performance issue. Client recently upgraded from Version 9i to Version 10g in an E-Business environment. I had the privilege of consulting before the upgrade, so we setup the environment optimally, and upgrade itself was seamless. Client did not see much regression except One query: That query was running for hours in 10g compared to 15 minutes in 9i.

Review and Analysis

Reviewed the execution plan in the development database and I did not see any issues with the plan. Execution plan in development and production looked decent enough. I wasn’t able to reproduce the issue in the development database either. So, the client allowed me to trace the SQL statement in the production database. Since the size of data in few tables is different between production and development databases, we had to analyze the problem in production environment.

I had to collect as much data possible as the tracing was a one-time thing. I setup a small script to get process stack and process memory area of that Unix dedicated server process to collect more details, in addition to tracing the process with waits => true.

Execution plan from the production database printed below. [ Review the execution plan carefully, it is giving away the problem immediately.] One execution of this statement took 13,445 seconds and almost all of it spent in the CPU time. Why would the process consume 13,719 seconds of CPU time?. Same process completed in just 15 minutes in 9i, as confirmed by Statspack reports. [ As a side note, We collected enormous amount of performance data in 9i in the Production environment before upgrading to 10g, just so that we can quickly resolve any performance issues, and you should probably follow that guideline too]. That collection came handy and It is clear that SQL statement was completing in 15 minutes in 9i and took nearly 3.75 hours after upgrading the database to version 10g.

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       10  13719.71   13445.94         27    5086407          0       99938
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       12  13719.71   13445.94         27    5086407          0       99938

     24   HASH GROUP BY (cr=4904031 pr=27 pw=0 time=13240600266 us)
     24    NESTED LOOPS OUTER (cr=4904031 pr=27 pw=0 time=136204709 us)
     24     NESTED LOOPS  (cr=4903935 pr=27 pw=0 time=133347961 us)
 489983      NESTED LOOPS  (cr=3432044 pr=27 pw=0 time=104239982 us)
 489983       NESTED LOOPS  (cr=2452078 pr=27 pw=0 time=91156653 us)
 489983        TABLE ACCESS BY INDEX ROWID HR_LOCATIONS_ALL (cr=1472112 pr=27 pw=0 time=70907109 us)
 489983         INDEX RANGE SCAN HR_LOCATIONS_UK2 (cr=981232 pr=0 pw=0 time=54338789 us)(object id 43397)
 489983        INDEX UNIQUE SCAN MTL_PARAMETERS_U1 (cr=979966 pr=0 pw=0 time=17972426 us)(object id 37657)
 489983       INDEX UNIQUE SCAN HR_ORGANIZATION_UNITS_PK (cr=979966 pr=0 pw=0 time=10876601 us)(object id 43498)
     24      INDEX RANGE SCAN UXPP_FA_LOCATIONS_N3 (cr=1471891 pr=0 pw=0 time=27325172 us)(object id 316461)
     24     TABLE ACCESS BY INDEX ROWID PER_ALL_PEOPLE_F (cr=96 pr=0 pw=0 time=2191 us)
     24      INDEX RANGE SCAN PER_PEOPLE_F_PK (cr=72 pr=0 pw=0 time=1543 us)(object id 44403)

pstack, pmap, and truss

Reviewing pstack output generated from the script shows many function calls kghfrempty, kghfrempty_ex, qerghFreeHashTable etc, implying hash table operations. Something to do with hash table consuming time?

 ( Only partial entries shown ) 
 0000000103f41528 kghfrempty 
 0000000103f466ec kghfrempty_ex 
 0000000103191f1c qerghFreeHashTable 
 000000010318e080 qerghFetch 
 00000001030b1b3c qerstFetch 
...
 0000000103f41558 kghfrempty 
 0000000103f466ec kghfrempty_ex 
 0000000103191f1c qerghFreeHashTable 
 000000010318e080 qerghFetch
 00000001030b1b3c qerstFetch

Truss of the process also showed quite a bit of mmap calls. So, the process is allocating more memory to an hash table?

...
mmap(0xFFFFFFFF231C0000, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0) = 0xFFFFFFFF231C0000
...
pollsys(0xFFFFFFFF7FFF7EC8, 1, 0xFFFFFFFF7FFF7E00, 0x00000000) = 0
mmap(0xFFFFFFFF231D0000, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0) = 0xFFFFFFFF231D0000
...

Execution plan again ..

Reviewing the execution plan again showed an interesting issue. I am going to post only two relevant lines from the execution plan below. As you can see that elapsed time at NESTED LOOPS OUTER step is 136 seconds. But the elapsed time at the next HASH GROUP BY step is 13240 seconds, meaning nearly 13,100 seconds spent in the HASH GROUP BY Step alone! Why would the process spend 13,100 seconds in a group by operation? Actual SQL execution took only 136 seconds, but the group by operation took 13,100 seconds. That doesn’t make sense, Does it?

     24   HASH GROUP BY (cr=4904031 pr=27 pw=0 time=13240600266 us)
     24    NESTED LOOPS OUTER (cr=4904031 pr=27 pw=0 time=136204709 us)
...

OFE = 9i

Knowing that time is spent in the Group by operation and that the 10g new feature Hash Grouping method is in use, I decided to test this SQL statement execution with 9i optimizer. The SQL completed in 908 seconds with OFE(optimizer_features_enabled) set to 9.2.0.8 (data is little bit different since production is an active environment). You can also see that SORT technique is used to group the data.

alter session set optimizer_features_enabled=9.2.0.8;

Explain plan :
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch   106985    887.41     908.25     282379    3344916        158     1604754
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   106987    887.41     908.25     282379    3344916        158     1604754

      4   SORT GROUP BY (cr=2863428 pr=0 pw=0 time=37934456 us)
      4    NESTED LOOPS OUTER (cr=2863428 pr=0 pw=0 time=34902519 us)
      4     NESTED LOOPS  (cr=2863412 pr=0 pw=0 time=34198726 us)
 286067      NESTED LOOPS  (cr=2003916 pr=0 pw=0 time=24285794 us)
 286067       NESTED LOOPS  (cr=1431782 pr=0 pw=0 time=19288024 us)
 286067        TABLE ACCESS BY INDEX ROWID HR_LOCATIONS_ALL (cr=859648 pr=0 pw=0 time=13568456 us)
 286067         INDEX RANGE SCAN HR_LOCATIONS_UK2 (cr=572969 pr=0 pw=0 time=9271380 us)(object id 43397)
 286067        INDEX UNIQUE SCAN MTL_PARAMETERS_U1 (cr=572134 pr=0 pw=0 time=4663154 us)(object id 37657)
...

Knowing the problem is in the GROUP BY step, we setup a profile with _gby_hash_aggregation_enabled set to FALSE, disabling the new 10g feature for that SQL statement. With the SQL profile, performance of the SQL statement is comparable to pre-upgrade timing.

This almost sounds like a bug! Bug 8223928 is matching with this stack, but it is the opposite. Well, client will work with the support to get a bug fix for this issue.

Summary

In summary, you can use scientific methods to debug performance issues. Revealing the details underneath, will enable you to come up with a workaround quickly, leading to a faster resolution.
Note that, I am not saying hash group by feature is bad. Rather, we seem to have encountered an unfortunate bug which caused performance issues at this client. I think, Hash Grouping is a good feature as the efficiency of grouping operations can be improved if you have ample amount of memory. That’s the reason why we disabled this feature at the statement level, NOT at the instance level.
This blog in a traditional format hash_group_by_orainternals

Update 1:

I am adding a script to capture pmap and pstack output in a loop for 1000 times, with 10 seconds interval. Tested in Oracle Solaris.

#! /bin/ksh
 pid=$1
 (( cnt=1000 ))
 while  [[ $cnt -gt 0 ]];
  do
        date
        pmap -x $pid
        pstack $pid
        echo $cnt
        (( cnt=cnt-1 ))
        sleep 10
  done

To call the script: assuming 7887 is the UNIX pid of the process.
nohup ./pmap_loop.ksh 7887 >> /tmp/a1.lst 2>>/tmp/a1.lst &

Syntax for the truss command is given below. Please remember, you can’t use pmap, pstack and truss concurrently. These commands stops the process (however short that may be!) and inspects them, so use these commands sparingly. [ I had a client who used to run truss on LGWR process on a continuous(!) basis and database used to crash randomly!]. I realize that pmap/pstack/truss can be scripted to work together, but that would involve submitting a background process for the truss command and killing that process after a small timeout window. That would be a risky approach in a Production environment and So, I prefer to use truss command manually and CTRL+C it after few seconds.

truss -d -E -o /tmp/truss.lst -p 7887

I can not stress enough, not to overuse these commands in a Production environment. Command strace( Linux), tusc (HP) are comparable commands of truss(Solaris).

Posted in CBO, EBS11i, Oracle database internals, Performance tuning | Tagged: , , , , , | 8 Comments »

Log file synch tuning #2

Posted by Riyaj Shamsudeen on November 24, 2008

After reading my earlier blog about log file sync tuning a reader sent an interesting issue, worth blogging about it.

Problem

Excessive wait time for ‘log file sync’ event while wait time for ‘log file parallel write’ is minimal. See statspack top events below

 

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                     % Total
Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time                                                       18,435    31.06
PL/SQL lock timer                                   4,370      12,137    20.45
log file sync                                      57,939       8,565    14.43  <-----
db file sequential read                         1,303,477       7,030    11.85
db file scattered read                            505,147       3,992     6.73

                                                                   Avg
                                                     Total Wait   wait    Waits
Event                               Waits   Timeouts   Time (s)   (ms)     /txn
---------------------------- ------------ ---------- ---------- ------ --------
...
log file sync                      57,939      6,571      8,565    148      0.5
...
log file parallel write           215,112          0      1,138      5      1.9
...

Average wait time for ‘log file sync’ event is 148ms and it is 5ms for ‘log file parallel write’ event. There is an oddity here. Wait time for ‘log file sync’ event seems bit excessive compared to ‘log file parallel write’ event. Ratio between number of waits for ‘log file sync’ and ‘log file parallel write’ is approximately 4. In an average case scenario, if each commit resulted in 4 log writes, then I would expect an Average wait time of 20ms for ‘log file sync’ event. Wait time of 148ms? Needs further analysis.

pstack, pmap and more

Database is running in a Solaris 10 server and database version 9.2.0.8. Since this is a Solaris operating environment, we can use proc utilities to debug this further. We used a script to dump pstack and pmap of LGWR in a loop. This wasn’t much help, but we could guess that LGWR was waiting for I/O since there were many lines in the stack with kaio and aio wait calls. Few lines from pstack output printed below.

6924:	ora_lgwr_ERPPRD2
-----------------  lwp# 1 / thread# 1  --------------------
 ffffffff7ddd4838 semsys   (4, 3, ffffffff7fffd5cc, 1, ffffffff7fffd5b8)
 00000001005b2694 ksliwat (0, 3c0, 77e8a7790, 1037c0b30, 3, 0) + 274
..
-----------------  lwp# 2 / thread# 2  --------------------
 ffffffff7ddd3a34 kaio     (6, 0, ffffffff7def7f80, 10, ffffffff7def9c18, ffffffff7f700a00)
-----------------  lwp# 3 / thread# 3  --------------------
 ffffffff7ddd2f1c lwp_park (0, 0, 0)
 ffffffff7ddcc748 cond_wait_queue (1038e5a70, 1038e5a80, 0, 0, ffffffff7def9d08, 0) + 28
 ffffffff7ddccca8 cond_wait (1038e5a70, 1038e5a80, 0, 0, 1c00, 0) + 10
 ffffffff7db03a60 _aio_idle (1038e5a00, 1, 1038e5a80, 0, ffffffff7def9c18, 0) + 28
 ffffffff7db034fc _aio_send_sigev (1038e5a00, 0, 104b60, ffffffff7ddd2cc8, ffffffff7db03498, ffffffff7dc08000) + 64

dtruss

I/O statistics did not provide much clarity either. We need to find if LGWR is suffering from a performance issue. To see if LGWR is suffering from any OS related issues, we need to trace system calls from LGWR and Truss utility provides such a facility. Suffering from truss related paranoia, we didn’t want to run truss against LGWR since that can affect performance, more importantly database stability.

Fortunately, dtruss came handy. dtruss is based upon dtrace utility and by design, dtrace is safe in Solaris Sparc platform.

Keep Reading

Posted in Oracle database internals, Performance tuning | Tagged: , , , , , | 14 Comments »

 
Follow

Get every new post delivered to your Inbox.

Join 214 other followers