Oracle database internals by Riyaj

August 21, 2009

Exciting seminars in Dallas arena

Filed under: CBO, Oracle database internals, Performance tuning, Presentations — orainternals @ 2:54 pm

If you live in Dallas area, there are two very important seminars coming up:

  1. Tanel Poder is presenting his Advanced Oracle troubleshooting seminar on September 9-11, 2009. .
  2. Jonathan Lewis presenting two day seminar cost based optimization and writing efficient SQL on Oct 28 and Oct 29 2009
  3. .

These two gentlemen needs no introduction. Tanel Poder is an expert in Oracle internals and Jonathan Lewis is a Guru in Cost based optimization and writing efficient SQL.

Tell them I sent ya :-)

Update: I just heard from Tanel and he is unable to make it to Dallas this September. He is rescheduling his seminar and I can’t wait for his seminar. I will update this blog with his rescheduled dates.

August 9, 2009

Shared pool freelists (and durations)

My earlier blog about shared pool duration got an offline response from one of my reader:
” So, you say that durations aka mini-heaps have been introduced from 10g onwards. I have been using Steve Adams’ script shared_pool_free_lists.sql. Is that not accurate anymore?”

Shared pool free lists

I have a great respect for Steve Adams . In many ways, he has been a great virtual mentor and his insights are so remarkable.

Coming back to the question, I have used Steve’s script before and it is applicable prior to Oracle version 9i. In 9i, sub-heaps were introduced. Further, shared pool durations were introduced in Oracle version 10g. So, his script may not be applicable from version 9i onwards. We will probe this further in this blog.

This is the problem with writing anything about internals stuff, they tend to change from version to version and In many cases, our work can become obsolete in future releases(including this blog!).

In version 9i, each sub-heap of the shared_pool has its own free list. In version 10g and 11g, each duration in sub-heap has its own free list. This is visible through x$ksmsp and column x$ksmsp.ksmchdur indicates the duration that chunk belongs to. In 9i, that column always has a value of 1 (at least, that I have experimented so far). In 10g & 11g (up to 11.1.0.7), there are exactly 4 durations in each sub-heap and values range from 1-4 for this column ksmchdur. Each duration has its own free list.

Shared_pool_free_list.sql script

I couldn’t find any script depicting shared pool free list accurately for versions 9i and above. So, I wrote a small script to print shared pool free list information from x$ksmsp. [Warning: Accessing x$ksmsp would add more pressure on shared pool latches.] Complete script can be downloaded from orainternals.com script archive shared_pool_free_lists_9i.sql .

SQL> @"shared_pool_free_lists_9i.sql"
  KSMCHIDX   KSMCHDUR     BUCKET FREE_SPACE FREE_CHUNKS AVERAGE_SIZE    BIGGEST
---------- ---------- ---------- ---------- ----------- ------------ ----------
         1          1          1       1320          33           40         40
                               2       2160          45           48         48
                               3       1400          25           56         56
                               4       2240          35           64         64
                               5       2448          34           72         72
                               6       3360          42           80         80
                               7        792           9           88         88
                               8        672           7           96         96
                               9       1664          16          104        104
                              10       2352          21          112        112
                              11       1680          14          120        120
                              12       1792          14          128        128
                              13       4896          36          136        136
                              14       2016          14          144        144
                              15       3040          20          152        152
                              16       2880          18          160        160
                              88        736           1          736        736
                             252      45824           2        22912      22928
                             253  528212104          41     12883222   15597568
...

Above output shows shared pool free list for sub-heap 1 (ksmchidx=1), duration 1(ksmchdur=1) more technically sga heap (1, 1). In a free list, free Chunks are attached to a bucket classified by chunk size. This facilitates chunk search so that a chunk of a specific size can be easily found. For example, bucket 7 holds chunks bigger than 88 bytes but smaller than 96 bytes. Bucket 8 has chunks bigger than 96 and smaller than 104. Say, for example, for an allocation request of chunk with size of 96 bytes, code can quickly look up bucket 8 and after to find a chunk of size at least 96 bytes. Above script is an excellent indicator to see if the shared pool is fragmented (In this case, shared pool is not fragmented since number of chunks are few).

Problem is that this script also does not depict free list information accurately 10g or 11g. In version 9i, bucket size between two adjacent buckets are 8 until size is less than 1672 bytes and then increases to 48, followed by near 8K difference and so the script works fine in 9i.

...
 case
        when ksmchsiz < 1672 then trunc((ksmchsiz-32)/8)
        when ksmchsiz < 4120 then trunc((ksmchsiz+7928)/48)
        when ksmchsiz < 8216 then 250
        when ksmchsiz < 16408 then 251
        when ksmchsiz < 32792 then 252
        when ksmchsiz = 65560 then 253
   end bucket,...

But, in 10g and 11g, that is not true anymore (at least that I can observe). For example, in the output below, after bucket 239, difference between two adjacent buckets went down to 8. So, the script method is not dependable. Still, I must insist, that script output is a pretty good indicator of shared pool free space fragmentation.

...
 Bucket 239 size=4024 48
 Bucket 240 size=4096 72
 Bucket 241 size=4104 8
 Bucket 242 size=4120 16
 Bucket 243 size=8216 4096
 Bucket 244 size=16408 8192
 Bucket 245 size=16888 480
 Bucket 246 size=16896 8
 Bucket 247 size=16904
...

heapdump_freelist.ksh

Must we need more accurate picture, we can parse heapdump trace file to print it. I wrote another korn shell based awk script heapdump_freelist.sh . This script parses heapdump level 2 trace files and prints the shared pool free list information more accurately. This script works correctly in Oracle versions 9i,10g and 11g. This script accepts an heapdump level 2 trace file as an argument.

./heapdump_freelists.ksh DEV11g_ora_553.trc

-- Heapdump freelist v1.00 by Riyaj Shamsudeen @OraInternals

 This script prints shared pool freelists details
   1. /tmp/shared_pool_freelist.lst - Freelist information for various heaps

Few lines from /tmp/shared_poool_freelist.lst shows that freelist information is printed at more granular level.

------------------------------------------
       sga heap(1,1)
------------------------------------------
Bucket                    1              size=40  Count=        75 Sum=      3000
Bucket                    2              size=48  Count=       176 Sum=      8448
Bucket                    3              size=56  Count=        72 Sum=      4032
Bucket                    4              size=64  Count=        86 Sum=      5504
Bucket                    5              size=72  Count=       173 Sum=     12456
...
Bucket                   42             size=368  Count=         1 Sum=       368
Bucket                   51             size=440  Count=         1 Sum=       440
...
Bucket                  243            size=8216  Count=         3 Sum=     29904
Bucket                  248           size=16912  Count=         1 Sum=     17208
Bucket                  252           size=20552  Count=         2 Sum=     45152
Bucket                  253           size=32792  Count=         1 Sum=     47064
Bucket                  254           size=65560  Count=         6 Sum=  52860352
------------------------------------------
       sga heap(1,2)
------------------------------------------
Bucket                    1              size=40  Count=         1 Sum=        40
Bucket                    3              size=56  Count=         2 Sum=       112
Bucket                    5              size=72  Count=         1 Sum=        72
..

Summary

In summary, shared_pool_free_lists.sql is immensely helpful, but alas internal working details changes from version to version. We have added two scripts to print shared pool free list more accurately:

  1. shared_pool_free_lists.sql to show freespace fragmentation through SQL.
  2. heapdump_freelist.ksh to parse heapdump trace file and print free list information more accurately.

This blog is available in pdf format shared_pool_free_lists_and_duration.pdf .

August 6, 2009

ORA-4031 and Shared Pool Duration

After reading my earlier post on shared pool A stroll through shared pool heap , one of my client contacted me with an interesting ORA-4031 issue. Client was getting ORA-4031 errors and shared pool size was over 4GB ( in a RAC environment). Client DBA queried v$sgastat to show that there is plenty of free memory in the shared pool. We researched the issue and it is worth blogging. Client DBA was confused as to how there can be ORA-4031 errors when the shared pool free memory is few GBs.

Heapdump Analysis

At this point, it is imperative to take heapdump in level 2 and Level 2 is for the shared pool heap dump. [ Please be warned that it is not advisable to take shared pool heap dumps excessively, as that itself can cause performance issue. During an offline conversation, Tanel Poder said that heapdump can freeze instance as his clients have experienced.]. This will create a trace file in user_dump_dest destination and that trace file is quite useful in analyzing the contents of shared pool heap. Tanel Poder has an excellent script heapdump_analyzer . I modified that script adding code for aggregation at heap, extent and type levels to debug this issue further and it is available as heapdump_dissect.ksh . ( with a special permission from Tanel to publish this script.)

Shared pool review

You can read much more about shared pool in my earlier blog entry posted above. Just as a cursory review, shared pool is split in to multiple sub heaps. In 10g, each of those sub heaps are divided in to even smaller sub heaps, let’s call it mini-heaps. For example, in this specific database, there are three sub heaps. Each of those sub heaps are further split in to four mini-heaps (1,0), (1,1), (1,2) and (1,3) each.

	sga heap(1,0)
	sga heap(1,1)
	sga heap(1,2)
	sga heap(1,3)

	sga heap(2,0)
	sga heap(2,1)
	sga heap(2,2)
	sga heap(2,3)

	sga heap(3,0)
	sga heap(3,1)
	sga heap(3,2)
	sga heap(3,3)

One or more extents are allocated to these mini-heaps dynamically as these areas grow over the course of instance life until there is no more free memory to allocate. In this blog entry, let’s focus on the contents of these sub-heaps.
Following shows output of heapdump_dissect script at extent level aggregation. As you can see sga heap (1,0) is allocated with 34 extents of 16M each. [ 16M comes from _ksmg_granule_size.]

Extent level summary
------------------
Sub heap          Extent             Type              Allocation cmt   Size
---------------- -----------------   ---------------   --------------  --------
   sga heap(1,0)          EXTENT 0              free                   1792
   sga heap(1,0)          EXTENT 0              perm   perm            15096064
   sga heap(1,0)          EXTENT 0            R-free                   616
   sga heap(1,0)          EXTENT 0            R-perm   perm            1678560
   sga heap(1,0)          EXTENT 0        R-freeable   reserved stoppe 96
   sga heap(1,0)          EXTENT 1              free                   808
   sga heap(1,0)          EXTENT 1              perm   perm            15097048
   sga heap(1,0)          EXTENT 1            R-free                   1248
   sga heap(1,0)          EXTENT 1            R-perm   perm            1677928
   sga heap(1,0)          EXTENT 1        R-freeable   reserved stoppe 96
   sga heap(1,0)          EXTENT 2              free                   456
   sga heap(1,0)          EXTENT 2              perm   perm            15097400
   sga heap(1,0)          EXTENT 2            R-free                   1679176
   sga heap(1,0)          EXTENT 2        R-freeable   reserved stoppe 96
 ...
   sga heap(1,0)         EXTENT 34              free                   160
   sga heap(1,0)         EXTENT 34              perm   perm            15097696
   sga heap(1,0)         EXTENT 34            R-free                   1679176
   sga heap(1,0)         EXTENT 34        R-freeable   reserved stoppe 96

Duration: A KGH policy

It gets interesting here. Notice that all “sga heap(1,0)” chunks has allocation comment as “perm”. Allocation comment “perm” is passed for permanent chunks. In a nutshell, all permanent chunks are allocated from first mini-heap in each of these sub heaps i.e. sga heap(1,0), sga heap(2,0) and sga heap(3,0) and so on. They are only allocated in the first mini-heap and not in any other mini-heap in these sub-heaps. Another example: PL/SQL DIANA type chunks are allocated only in fourth mini-sub-heap (1,3) (2,3) and (3,3) [in this instance].

   sga heap(1,3)          freeable   PL/SQL DIANA    1318912
...
   sga heap(2,3)          freeable   PL/SQL DIANA    1675296
...
   sga heap(3,3)          freeable   PL/SQL DIANA    1458176

Point is that there is a new KGH policy based upon duration of a chunk. All chunks are classified based upon their type ( or more formally based upon the duration of that chunk) and the chunks with the same duration is allocated from the same shared pool mini-heaps. I think, this is a great idea, especially since only few heaps of shared pool tend to be over worked and that should not unnecessarily flush other parts of the sub-heap. Another important thing to consider is that shared pool extents can be de-allocated and allocated to other parts of SGA. For example, shared pool extents can be deallocated from shared pool and allocated to buffer cache to increase buffer cache size [ These reallocated chunks will be marked with a comment KGH:NO ACCESS ]. By keeping the perm chunks in the first mini-heap, extents can be deallocated quite easily, without the need to move the perm chunks to different areas.

So, What’s the problem?

Drawback is that these chunks are contained in that mini-heap. For example, let’s say that chunks with permanent duration must be allocated and there is no free space in the first mini-heap ( say sga heap(1,0) ) and if a new extent can’t be added to that mini-heap then ORA-4031 error is thrown even if there is plenty of free space in some other mini-heap ( say sga heap(1,1), sga heap(1,2) etc). So permanent chunks will be allocated only from sga heap(N, 0) [ where N is sub-heap id 1, 2,3... ] and if that mini-heap runs out of space, ORA-4031 will be thrown.

This is why client encountered ORA-4031 errors. Even though there was plenty of free memory in other pools, simply permanent chunks can not be allocated in the first mini-heap leading to ORA-4031 errors. Of course, other chunks in that mini-heap can not be deallocated either since those chunks are also permanent chunks. Chances of these errors occurring in other sub-heaps such as sga heap(N,1), sga heap(N,2) etc are less since recreatable/freeable chunks can be flushed to accommodate incoming requests. Quick resolution was to increase shared pool_size temporarily (until we can reduce perm chunk usage due to another issue, which will be resolved soon).

Parameter:_enable_shared_pool_durations

Above undocumented parameter controls this specific KGH policy and default is true. [Note that setting an undocumented parameter needs Oracle support blessing in a production database]. I tested this in my test database setting parameter _enable_shared_pool_durations=false and mini-heaps disappeared. If this parameter is false, then duration of a chunk is not considered. Following example shows that EXTENT 0 in sga heap(1,0) contains non-perm allocations if this parameter is set to FALSE.

EXTENT 0 addr=3CC00000
...
  Chunk 3cfdf23c sz=     4096    freeable  "sql area       "  ds=3CFE4E0C
  Chunk 3cfe023c sz=      540    recreate  "KQR PO         "  latch=41AE8FD4
...

Summary

In essence, duration enabled chunk allocation policy has been introduced in 10g [ I don't know exact version]. This is usually beneficial and ORA-4031 errors are possible even if there is plenty of free space in other mini-heaps.

This can be read in pdf format from Investigations:ORA-4031 and shared pool duration.

Also, there is a patent documents “duration based memory management” available in the web (patent US 6,253,226). Those documents provides in-depth analysis.

Update 1: Looks like these mini-heaps are formally called duration from an Oracle internal resource. There are exactly 4 durations in each sub-heap and each duration has its own free list. x$ksmsp.ksmchdur column indicates the duration of that chunk.

Blog at WordPress.com.