Oracle database internals by Riyaj

September 13, 2009

Is plan_hash_value a final say?

I was reviewing a performance issue with a client recently. Problem is that increased global cache waits causing application slowdown affecting few critical business functions. Using one of my script gc_traffic.sql > and graphing the results with Excel spreadsheet, it is established that there is a marked increase in GC traffic today compared to week earlier. Similar jobs runs every day and so comparing two week days is sufficient to show the GC traffic increase. Graph is between total blocks and AWR snap time in 30 minutes interval. [Click the picture below to review the graph clearly.]

Identifying the object creating this increased GC traffic is essential to identify root cause. We were able to quickly determine that this increase in GC traffic was localized around few SQL statements using ADDM and AWR reports. We decided to focus on one SQL with an obvious increase in elapsed time compared to prior week. So, first question asked, is there a change in the plan? plan_hash_value was reviewed and quickly determined that there is no change in the plan_hash_value.

Little bit of history, there were few changes performed by the vendor over the weekend as a part of few bug fixes. Vendor’s argument was that since there is no change to the plan_hash_value, SQL access plan did not change and so, this can’t be due to vendor changes. My Client’s argument was that there were no changes to the environment and problem started after the application changes.

There are many different things that can go wrong without changes to the execution plan. We can ignore those conditions for now since (a) there has been no changes to the environment (b) no visible changes to the data (c) no error message and average CR recv time is consistent with prior weeks. Well, Let’s cut to the chase. It boiled down to a question “Can SQL plan change without a change in plan_hash_value?”. What do you think? Please answer in the poll below.

plan_hash_value and hash_value

Hash_value of a SQL statement is generated from the text of an SQL statement and plan_hash_value is generated from the execution plan of that SQL statement[ More accurately, from that child cursors' execution plan and exactly what is involved in generating plan_hash_value is not published]. It is a general belief that plan_hash_value will change even if there is a slightest change in the execution plan. But, that is not always the case!

Test case

We will use a small table to explore this issue.

prompt Test case #1: Initial test case with index on columns (n1, n2).
prompt ==========
create table t1 (n1 number, n2 number, v1 varchar2(100));
create index t1_n1 on t1(n1,n2);

explain plan for select * from t1 where n1=:b1 and n2=:b2;
select * from table (dbms_xplan.display);
prompt plan hash value in this case is 626762252

Plan hash value: 626762252
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    53 |  4134 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |    53 |  4134 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1 |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("N1"=TO_NUMBER(:B1) AND "N2"=TO_NUMBER(:B2))

So, we got the plan_hash_value as highlighted above. In the following test case we will recreate the index reordering the columns as (n2, n1).

prompt Test case #2: Index with re-ordered columns
prompt ==========
drop index t1_n1;
create index t1_n1 on t1(n2,n1);

Plan hash value: 626762252

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    53 |  4134 |     9   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |    53 |  4134 |     9   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1 |  5298 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("N2"=TO_NUMBER(:B2) AND "N1"=TO_NUMBER(:B1))

Notice that in the test result above filter predicate changed reflecting index column reordering. But the plan_hash_value did not change. Point is that execution plan can change without a change in plan_hash_value (due to change in the underlying tables).

Let’s modify that index dropping a column from the index in the next test case.

prompt Test case #3: dropping a column from index t1_n1.

drop index t1_n1;
create index t1_n1 on t1(n1);

explain plan for select * from t1 where n1=:b1 and n2=:b2;

Plan hash value: 626762252

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    53 |  4134 |    41   (3)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T1    |    53 |  4134 |    41   (3)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1 |  2119 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("N2"=TO_NUMBER(:B2))
   2 - access("N1"=TO_NUMBER(:B1))

We see that predicates changed but the plan_hash_value did not change. In the test case below, we will modify index to be a function based index and test this SQL statement. There are also few more self-evident test cases below.

prompt Test case #3: Index is a function based index. Still, no change in the plan_hash_value.

create index t1_n1 on t1(to_char(n1));
explain plan for select * from t1 where to_char(n1)=:b1 and n2=:b2;

Plan hash value: 626762252
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    53 |  4134 |   127   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T1    |    53 |  4134 |   127   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1 |  2119 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("N2"=TO_NUMBER(:B2))
   2 - access(TO_CHAR("N1")=:B1)

prompt Test case #4: Different schema and same SQL. plan_hash_value did not change.

Plan hash value: 626762252
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    78 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |    78 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1 |     3 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("N2"=TO_NUMBER(:B2))
   2 - access("N1"=TO_NUMBER(:B1))

plan_hash_value is also case sensitive for index names. In the test cases below, we will create case sensitive indices.

prompt Test case #5: Lower case index_name.. plan_hash_value changed.
create index "t1_n1" on t1(n1,n2);
explain plan for select * from t1 where n1=:b1 and n2=:b2;

Plan hash value: 2252949961
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    53 |  4134 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |    53 |  4134 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | t1_n1 |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

prompt Test case #6: Upper case index_name.. plan_hash_value did not change.
create index "T1_N1" on t1(n1,n2);
explain plan for select * from t1 where n1=:b1 and n2=:b2;
Plan hash value: 626762252

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    53 |  4134 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |    53 |  4134 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1 |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

prompt Test case #7: Space in the index name changes plan_hash_value though.
create index "T1_N1 " on t1(n1,n2);
explain plan for select * from t1 where n1=:b1 and n2=:b2;
Plan hash value: 1377541522
--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |    53 |  4134 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1     |    53 |  4134 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1  |     1 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Summary

In summary, plan_hash_value is a very good indicator to see if the plan changed or not. Don’t get me wrong, I also use plan_hash_value, but in addition to comparing CPU time and elapsed time. Execution plan can change even when there is no change to the plan_hash_value. Salient points from these test cases are:

  1. Plan_hash_value is dependent upon partial execution plan not on complete execution plan.
  2. If a predicate is moved from filter_predicate to access_predicate or vice-versa, it doesn’t affect plan_hash_value.
  3. Changes in the parallelism of the queries does not affect plan_hash_value. For example, if a query used 4 parallel slaves today and 16 parallel slaves yesterday, that change is not visible through plan_hash_value. This behavior is expected as parallel_adaptive_multiuser can allocate slaves depending upon the state of that instance.
  4. Plan_hash_value is case-sensitive to index/table names. Also sensitive to white-space characters.
  5. Plan_hash_value is not sensitive to index types. For example, if the index type is a function based index as in our test case #4, as long as, index name did not change plan_hash_value will remain the same.
  6. Plan_hash_value is not sensitive to schema either. SQL statement accessing different schemas can have same plan_hash_value too.

Back to our problem. One of the index was recreated removing a column and caused optimizer to apply filter predicates at table level increasing number of accesses to table block tremendously, leading to more logical reads, more Global cache waits etc. This problem was amplified since this SQL was executed very frequently and concurrently from all RAC instances.

In my client’s defense, this application change was tested thoroughly. But, alas, test data chosen for this performance test was not probing this specific issue. This performance issue did not show up in development. Essentially, chosen data in the performance benchmark suite was not an excellent choice.

As they say in Gaelic language “Go raimh maith agat” to my client for allowing me to post this blog.
This can be read in traditional format in
plan_hash_value_and_gc
Update 1: Added the document and also corrected a typo.

August 21, 2009

Exciting seminars in Dallas arena

Filed under: CBO, Oracle database internals, Performance tuning, Presentations — orainternals @ 2:54 pm

If you live in Dallas area, there are two very important seminars coming up:

  1. Tanel Poder is presenting his Advanced Oracle troubleshooting seminar on September 9-11, 2009. .
  2. Jonathan Lewis presenting two day seminar cost based optimization and writing efficient SQL on Oct 28 and Oct 29 2009
  3. .

These two gentlemen needs no introduction. Tanel Poder is an expert in Oracle internals and Jonathan Lewis is a Guru in Cost based optimization and writing efficient SQL.

Tell them I sent ya :-)

Update: I just heard from Tanel and he is unable to make it to Dallas this September. He is rescheduling his seminar and I can’t wait for his seminar. I will update this blog with his rescheduled dates.

May 6, 2009

COLLABORATE 2009 presentation: 11g performance new feature.

I just presented about Oracle 11g new features specific to performance in COLLABORATE 2009, Orlando Florida. You can download presentation and paper from here:

11g performance specific new features – presentation
11g performance specific new features – paper

I met couple of new friends and many familiar faces.

Catherin Devlin introduced sqlpython tool to me. This tool is an user friendly replacement for sqlplus with many UNIX like goodies. This tool will be an useful addition to command line tools. You can see installation instructions for sqlpython here.

I also attended an interesting application performance panel discussion with fellow Oakie Mark W Farnheim, Mike Brown and Sandra Vucinic. We discussed few customer issues.

An interesting problem discussion is worth mentioning. A client clones production database to development database using full rapidclone methodology. But, access plans for a SQL statement is different between production and development, even though everything was exactly the same. I am hoping, that client will send 10053 trace files from both databases and I will blog about it if I get anything from that client. I think, bind peeking is causing the plan difference, but need to analyze 10053 trace file to confirm that. Another possibility is that, may be , he is not cloning Oracle Software and some software patches are not existing in cloned environment. Need trace files to proceed further here.

Of course, I met few other friends Jeremy Schneider, Dan Norris to mention few.

April 18, 2009

DOUG presentation: Why does optimizer hate my SQL?

I presented about Cost based optimizer explaining why some times CBO chose inefficient access plan, even though, there is an efficient plan in the search space. This entry is to post presentation slides and they can be downloaded from Why_optimizer_hates_my_sql

.

Update: Updated presentation after Greg’s comments.
Update2: Further bug fixes in presentation and script.
Update3: Updates after Randolf’s comments.

March 13, 2009

Dynamic_plan_table, x$kqlfxpl and extreme library cache latch contention

We had an interesting latch contention issue with a client database worth sharing. Client complained that they are seeing increased library cache latch contention in production with no logical explanation (yet). So, client wanted us to find the root cause. This performance issue seems to have started recently and occurs intermittently and also resolves by itself. Client is planning to upgrade to 10g in 4 weeks time frame and this issue need to be resolved to avoid any delays in 10g upgrade.

From statspack analysis and various statistics analysis, it is visible that there is an increase in library cache latch contention, but nothing jumps up as a problem. Of course, this is a very active application [ 5 node RAC application] with very high number of literal SQLs but that is nothing new, it is always been this way. No new dynamic SQLs, no DDLs invalidating cursors or anything of that sort. I started watching production instance, in real time, waiting for the re-occurrence of this issue. It occurred!

latchprofx for rescue

At this very juncture, instance was suffering from latch contention and It was visible that sessions were starting to queue up for latches.

I was ready with necessary tools. Tanel Poder has written a great tool latchprofx to sample v$latch_holder with high frequency [ Actually, it uses x$ tables but you get the idea ]. In this case, high frequency sampling is a must since this performance problem disappears in few minutes or so. Executed latcprofx script for library cache latches and 10000 samples as printed below.

SQL>  @/tmp/latchprofx  sid,name % library 10000

-- LatchProfX 1.06 by Tanel Poder ( http://www.tanelpoder.com )

  SID NAME                                      Held  Held %     Held ms
------ ----------------------------------- ---------- ------- -----------
   140 library cache                             9555   95.55   25167.870
   185 library cache                              198    1.98     521.532
   245 library cache                               97     .97     255.498
   185 library cache pin                           69     .69     181.746
   245 library cache pin                           39     .39     102.726
   240 library cache                               33     .33      86.922
   240 library cache pin                            6     .06      15.804
   227 library cache                                2     .02       5.268
   227 library cache pin                            1     .01       2.634

Yikes. We can see that one session holding library cache latch in 95.5% of samples. That seems bit odd.

Of course, there are many latch children for library cache latches. Is that session repeatedly acquiring just one latch children? Or is it acquiring many different latch children? More importantly what is that session executing ?

Latchprofx to the rescue again. Latchprofx can also print latch children address. Re-executing latcprofx script with modified second argument to the call. Only difference between these two calls is that first argument to latchprofx includes laddr now.


@/tmp/latchprofx  sid,name,laddr % library 10000

   SID NAME                                LADDR                  Held  Held %     Held ms
------ ----------------------------------- ---------------- ---------- ------- -----------
   140 library cache                       0000000990FEFA18        298    2.98     783.442
   140 library cache                       0000000990FED2B8        270    2.70     709.830
   140 library cache                       0000000990FEF298        254    2.54     667.766
   140 library cache                       0000000990FEF658        254    2.54     667.766
   140 library cache                       0000000990FEECF8        240    2.40     630.960
   140 library cache                       0000000990FF0198        236    2.36     620.444
   140 library cache                       0000000990FEE488        225    2.25     591.525
   140 library cache                       0000000990FEFCE8        223    2.23     586.267
   140 library cache                       0000000990FEEED8        222    2.22     583.638
   140 library cache                       0000000990FEF388        222    2.22     583.638
   140 library cache                       0000000990FEDEE8        220    2.20     578.380
   140 library cache                       0000000990FF00A8        216    2.16     567.864
   140 library cache                       0000000990FEE758        215    2.15     565.235
   140 library cache                       0000000990FEEFC8        212    2.12     557.348
   140 library cache                       0000000990FEE938        211    2.11     554.719
   140 library cache                       0000000990FEE668        205    2.05     538.945
   140 library cache                       0000000990FEDB28        201    2.01     528.429
....

Whoa! This session was acquring nearly all latch children. Notice that latch addresses are different in every row output. Meaning, various samples of latchprofx saw that this session is holding a different latch children. In a nutshell, this session was acquiring and releasing ALL library cache latches ( aka children) in quasi sequential fashion and causing massive performance issue. There are 53 latch children for library cache latch (in this database). Column held% is approximately 2%, indicating all library cache latch children were acquired and released in a quasi-uniform distributed(2% times 50), quasi-cyclical fashion! Same pattern repeates if I reduce or increase samples.

Latch greedy session

Initially, I thought, this is a bug. I have not seen a case where all library cache latch children were needed to execute a SQL or a process causing massive latch contention. [ Okay, I confess, I have simulated that kind of issue in my test database holding library cache latch children using oradebug kslgetl stuff, but that is a different topic altogether]. Session 140 was executing a SQL printed below. This SQL is querying execution plan from shared_pool for a specific hash_value in this 9i database( Equivalent to dbms_xplan.display_cursor stuff in 10g.]

select plan_table_output from TABLE( dbms_xplan.display ( 'backup.dynamic_plan_table',
                (select rawtohex(address)||'_'||child_number x from v$sql where hash_value=&hv ), 'serial' ) )
/

Turned on sqltrace on that session and tkprof showed that recursive SQL generated from dbms_xplan package as a costly SQL.

REM Removing few lines to improve readability.
SELECT /* EXEC_FROM_DBMS_XPLAN */ id, position, level , operation, options, object_name , cardinality, bytes, temp_space,
        cost, io_cost, cpu_cost ,decode(partition_start, 'ROW LOCATION',
....
    		from BACKUP.DYNAMIC_PLAN_TABLE start  with id = 0
            and  timestamp >=
		(select max(timestamp) from BACKUP.DYNAMIC_PLAN_TABLE where id=0  and
			statement_id = '00000009BB9B6D50_0' and nvl(statement_id, ' ') not like 'SYS_LE%')
            and  nvl(statement_id, ' ') not like 'SYS_LE%' and statement_id = '00000009BB9B6D50_0' connect by (prior id = parent_id
                      and prior nvl(statement_id, ' ') =
                                nvl(statement_id, ' ')
                      and prior timestamp <= timestamp)
                  or (prior nvl(object_name, ' ') like 'SYS_LE%'
                      and  prior nvl(object_name, ' ') =
                                 nvl(statement_id, ' ')
                      and id = 0 and prior timestamp <= timestamp)
          order siblings by id
/

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.03       0.02          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       30     53.79      55.90        423     375837          0          29
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       32     53.82      55.93        423     375837          0          29

X$KQLFXPL

55 seconds to execute this SQL? That seems bit excessive. This SQL is the only time-consuming recursive SQL generated from the select statement. So, somehow, this must be causing latch contention. I will print only relevent lines from the plan output.

....
|   4 |     NESTED LOOPS OUTER           |             |     1 |  4479 |    21   (5)|
|   5 |      NESTED LOOPS OUTER          |             |     1 |  4475 |    20   (5)|
|*  6 |       FIXED TABLE FULL           | X$KQLFXPL   |     1 |  4444 |            | <---- FTS
|   7 |       TABLE ACCESS BY INDEX ROWID| OBJ$        |     1 |    31 |     3  (34)|
|*  8 |        INDEX UNIQUE SCAN         | I_OBJ1      |     1 |       |     2  (50)|
|   9 |      TABLE ACCESS CLUSTER        | USER$       |     1 |     4 |     2  (50)|
|* 10 |       INDEX UNIQUE SCAN          | I_USER#     |     1 |       |            |
....

   6 - filter("P"."INST_ID"=:B1 AND "P"."KQLFXPL_HADD""P"."KQLFXPL_PHAD")

Line #6 is the problem. This table x$kqlfxpl is accessed using Full table scan. Looks like, x$kqlfxpl can NOT be accessed without holding library cache latches. Since there is a FTS on this fixed table, almost every library cache object need to be touched which potentially means all library cache objects must be inspected. Further, accessing this x$ table means library cache buckets (_kgl_bucket_count controlled area) need to be walked, latches acquired and released before that memory area is inspected. Makes sense!

Repeated calls to kqlfxp, kglic, kglic0 also visible in the stack output. These calls suggests that library cache latches were acquired and released. Systemstate dumps confirms that too.

I was also able to reproduce this issue querying x$kqlfxpl in a cloned database.

SQL>  select /*+ full(a) */  count(*) from x$kqlfxpl a;

  COUNT(*)
----------
     48926

-------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |     1 |       |    18   (6)|
|   1 |  SORT AGGREGATE      |             |     1 |       |            |
|   2 |   FIXED TABLE FULL   | X$KQLFXPL   |   100 |       |            |
-------------------------------------------------------------------------

Running latchprofx script from another session concurrently shows that accessing this x$ table will result in excessive library cache latching activity. Row estimates column for this table shows 100 which is a CBO default cardinality estimate if there are no statistics on a table.


@/tmp/latchprofx  sid,name,laddr % library 10000

   SID NAME                                LADDR                  Held  Held %     Held ms
------ ----------------------------------- ---------------- ---------- ------- -----------
   127 library cache                       0000000990FED2B8         26     .26      68.120
   127 library cache                       0000000990FEFEC8         24     .24      62.880
   127 library cache                       0000000990FEF658         23     .23      60.260
   127 library cache                       0000000990FEE668         22     .22      57.640
   127 library cache                       0000000990FEFA18         22     .22      57.640
...

Background

Now, a little bit background is necessary. What started all this? As a part of 10g upgrade planning, Client DBAs had setup a job to capture costly SQLs and store execution plans from shared_pool using dynamic_plan_table script, borrowed from Tom kyte dynamic_plan_table discussion. Excellent idea and wonderfully written sql, Tom.

If you don’t know already, Tom’s method creates following view. Client script passes the hash_value of costly SQLs from v$sql to this view.

create or replace view dynamic_plan_table
    as
    select
     rawtohex(address) || '_' || child_number statement_id,
     sysdate timestamp, operation, options, object_node,
     object_owner, object_name, 0 object_instance,
     optimizer,  search_columns, id, parent_id, position,
     cost, cardinality, bytes, other_tag, partition_start,
     partition_stop, partition_id, other, distribution,
   cpu_cost, io_cost, temp_space, access_predicates,
   filter_predicates
    from v$sql_plan
;

REM there is a wrapper script calling following SQL, in a loop, for top 20 SQL statements at a stretch.
REM Of course, there is more logic in that script.

select plan_table_output from TABLE( dbms_xplan.display ( 'backup.dynamic_plan_table',
                (select rawtohex(address)||'_'||child_number x from v$sql where hash_value=&hv ), 'serial' ) )
/

Unfortunately, such an awesome method has side effects in version 9.2.0.8, as explained above.

At this point, I am almost sure that we need to collect statistics on fixed tables to resolve this issue. There is a fixed index on x$kqlfxpl (KQLFXPL_HADD, KQLFXPL_PHAD,KQLFXPL_HASH) and if that index is used, effect of this problem should disappear. Since there is a plan to upgrade to 10g soon, client didn’t want to collect fixed statistics.

So, I worked around the issue. Not an elegant solution, but it avoids problem by recreating the view with hardcoded hash value for every loop.

create or replace view backup.dynamic_plan_table
    as
    select
     rawtohex(address) || '_' || child_number statement_id,
     sysdate timestamp, operation, options, object_node,
     object_owner, object_name, 0 object_instance,
     optimizer,  search_columns, id, parent_id, position,
     cost, cardinality, bytes, other_tag, partition_start,
     partition_stop, partition_id, other, distribution,
   cpu_cost, io_cost, temp_space, access_predicates,
   filter_predicates
    from v$sql_plan
   where hash_value=&&hv    -- Added this line..
;

select plan_table_output from TABLE( dbms_xplan.display ( 'backup.dynamic_plan_table',
                (select rawtohex(address)||'_'||child_number x from v$sql where hash_value=&&hv ), 'serial' ) )
/

Now, it is evident that query runs in 0.03 seconds and also used fixed index.

     51     NESTED LOOPS OUTER (cr=261 r=0 w=0 time=1327 us)
     51      NESTED LOOPS OUTER (cr=131 r=0 w=0 time=892 us)
     51       FIXED TABLE FIXED INDEX X$KQLFXPL (ind:3) (cr=0 r=0 w=0 time=496 us)
     27       TABLE ACCESS BY INDEX ROWID OBJ$ (cr=131 r=0 w=0 time=327 us)
     27        INDEX UNIQUE SCAN I_OBJ1 (cr=104 r=0 w=0 time=212 us)(object id 33)
     27      TABLE ACCESS CLUSTER USER$ (cr=130 r=0 w=0 time=359 us)
     27       INDEX UNIQUE SCAN I_USER# (cr=28 r=0 w=0 time=78 us)(object id 11)

Finally

10g upgrade guide recommends collecting fixed table statistics. So, this issue should completely disappear in 10g.

January 2, 2009

Cardinality feedback to resolve a Cache buffers chains latch contention issue

Earlier, I blogged about resolving cache buffers chains latch contention in my earlier entry , in which, root cause was excessive index access due to Nested Loops join. Recently, we resolved another similar issue.

Problem

CPU usage was very high in production database server in user% mode. Dynamic performance view v$session_wait indicated excessive waits for latch contention. Output from a script wait_details.sql shows that many sessions were waiting for ‘latch free’ event. Also, address for these latch children are the same, meaning all these sessions are trying to access one latch children.


SQL> @wait_details

   SID PID     EVENT         USERNAME  STATE               WAIT_TIME   WIS P1_P2_P3_TEXT
------ ------- ------------- --------- ---------- ------------------- --------- ----- -----------------------------------
    91  24242  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 0
   101   4884  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 0
   116  23899  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 0
   187  19499  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 0
   108  23498  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 3
   194  23701  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 0
   202  26254  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 4
   220  23274  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 0
   227  23643  latch free    CSTMOP    WAITED KNOWN TIME           2     0 address 69476807024-number 98-tries 0
   331  26519  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 0
   297  23934  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 3
....

We can identify SQL causing latch contention querying v$session_wait. From the output below, SQL with hash_value 1509082258 is suspicious since there are many sessions executing that SQL and waiting / waited recently for ‘latch free’ event.


select substr(w.event, 1, 28 ) event, sql_hash_value, count(*)
from v$session_wait w, v$session s, v$process p
where s.sid=w.sid
and p.addr = s.paddr
and s.username is not null
and event not like '%pipe%'
and event not like 'SQL*%'
group by substr(w.event, 1, 28), sql_hash_value
;

EVENT                          SQL_HASH_VALUE   COUNT(*)
------------------------------ -------------- ----------
enqueue                               3740270          1
enqueue                             747790152          1
enqueue                            1192921796          1
latch free                          622474477          3
latch free                         1509082258         58 <---
latch free                         1807800540          1
global cache null to x                3740270          1
global cache null to x             1473456670          1
global cache null to x             3094935671          1
db file sequential read             109444956          1

Mapping to object_name
Keep Reading

December 19, 2008

Correlation, nocorrelation and extended stats

Filed under: 11g, CBO, Performance tuning — orainternals @ 12:05 am

I blogged about extended stats in my earlier blog, extended stats, and also documented that as an investigation in Investigations: extended stats and multi-column correlation. I was testing extended stats further and ran in to some interesting situations.

Extended stats can be used to store correlation between columns. Correlation between two columns needs to detect at least, two properties of the column values:

  1. Correlated column values
  2. Uncorrelated column values

Let’s explore this further.
Test case
Keep Reading

July 30, 2008

Tuning latch contention: Cache buffers chain latches

Recently, I had an opportunity to tune latch contention for cache buffers chain (CBC) latches. Problem statement is that high CPU usage combined with poor application performance. Quick review of statspack report of 15 minutes showed a latch free wait as top event and consuming 3600 seconds approximately, in a 8 CPU server. Further CPU usage was quite high, which is a typical symptom of latch contention, due to spinning involved. v$session_wait showed that hundreds of sessions were waiting for latch free event.

SQL> @waits10g

   SID PID     EVENT         P1_P2_P3_TEXT
------ ------- ------------  --------------------------------------
   294  17189  latch free    address 15873156640-number 127-tries 0
   628  17187  latch free    address 15873156640-number 127-tries 0
....
   343  17191  latch free    address 15873156640-number 127-tries 0
   599  17199  latch: cache  address 17748373096-number 122-tries 0
               buffers chains
   337  17214  latch: cache  address 17748373096-number 122-tries 0
               buffers chains
.....
   695  17228  latch: cache  address 17748373096-number 122-tries 0
               buffers chains
....
   276  15153  latch: cache  address 19878655176-number 122-tries 1
               buffers chains

We will use two pronged approach to find root cause scientifically. First, we will find SQL suffering from latch contention and objects associated with access plan for that SQL. Next, we will find buffers involved in latch contention, map that back to objects. Finally, we will match these two techniques to pinpoint root cause.

Before we go any further, let’s do a quick summary of internals of latch operations.

Brief Introduction to CBC latches and not-so-brief reason why this is a complicated topic to discuss briefly

Latches are internal memory structures to coordinate access to shared resources. Locks aka enqueues are different from latches. Key difference is that enqueues, as name suggests, provides a FIFO queueing mechanisms and latches do not provide a queueing mechanism. On the other hand, latches are held very briefly and locks are usually held longer.

In Oracle SGA, buffer cache is the memory area data blocks are read in to, aka buffer cache. [If ASMM - Automatic Shared Memory Management is in use, then part of Shared pool can be tagged as KGH:NO ALLOC and remapped to buffer cache area too].

Each buffer in the buffer cache has an associated element the buffer header array, externalized as x$bh. Buffer headers keeps track of various attributes and state of buffers in the buffer cache. This Buffer header array is allocated in shared pool. These buffer headers are chained together in a doubly linked list and linked to a hash bucket. There are many hash buckets (# of buckets are derived and governed by _db_block_hash_buckets parameter). Access (both inspect and change) to these hash chains are protected by cache buffers chains latches.

Further, buffer headers can be linked and delinked from hash buckets dynamically.

Simple algorithm to access a buffer is: (I had to deliberately cut out so as not to deviate too much from our primary discussion.)

  1. Hash data block address (DBA: Combination of tablespace, file_id and block_id) to find hash bucket.
  2. Get latch protecting hash bucket.
  3. If (success) then Walk the hash chain reading buffer headers to see if a specific version of the block is already in the chain.
  4. If found, access the buffer in buffer cache, with protection of buffer pin/unpin actions.
    If not found, then find a free buffer in buffer cache, unlink the buffer header for that buffer from its current chain, link that buffer header with this hash chain, release the latch and read block in to that free buffer in buffer cache with buffer header pinned.

  5. If (not success) spin for spin_count times and go to step 2.
  6. If this latch was not got with spinning, then sleep, with increasing exponential back-off sleep time and go to step 2.

Obviously, latches are playing crucial role controlling access to critical resources such as hash chain. My point is that repeated access to few buffers can increase latch activity.

There are many CBC latch children (derived by size of buffer cache). Parameter _db_block_hash_latches control # of latches and derived based upon buffer cache size. Further, In Oracle 10g, sharable latches are used and inspecting an hash chain needs to acquire latches in share mode, which is compatible with other shared mode operations. Note that these undocumented parameters are usually sufficient and changes to these parameters must get approval from Oracle support.

Back to our problem…

Let’s revisit our problem at hand. Wait graph printed above shows that this latch contention is caused by two types of latches. Latch # 127 is simulator lru latch and #122 is cache buffers chains latch.
Keep Reading

April 15, 2008

how to tune SQL statements scientifically

Filed under: CBO — orainternals @ 9:51 pm

I wrote a blog for pythian group blog and this is a place holder entry for the pdf file.

You can find original blog entry at
how to tune sql statements scientifically

how to tune sql statements scientifically – pdf

March 21, 2008

Correlation between column predicates

Filed under: CBO — orainternals @ 3:00 pm

Multi column correlation

We all have encountered this situation many times before: Cost based optimizer assumes no correlation between two columns (until 11g) and this has the effect of reducing cardinality of a row source erroneously. Incorrect cardinality estimates
are one of the many root causes for SQL performance issues.

Consider following example.

Following code creates a table and populates data with four columns. First two columns n1 and n2
has

create table t_vc as
select mod(n, 100) n1, mod(n, 100) n2 ,
mod(n, 50) n3 , mod(n, 20) n4
from (select level n from dual connect by level <= 10001);

There is a strong correlation between n1 and n2 above. N1 is always equals to N2.
Let’s collect statistics with histograms on all columns.
begin
dbms_stats.gather_Table_stats( user, ‘T_VC’, estimate_percent => null, method_opt => ‘for all columns size 254′);
end;
/

Let’s explain plan for query specifying one predicate. Exactly 100 rows with a value of 10.

explain plan for select count(*) from t_vc where n1=10;


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     9   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T_VC |   100 |   300 |     9   (0)| 00:00:01 |
---------------------------------------------------------------------------

Yes, optimizer is able to estimate cardinality to be 100 from row source T_VC at step 2.

Now, let’s add another predicate n2=10. Since there is a strong correlation between n1 and n2,
adding a predicate such as n2=10 should not alter row source cardinality. Right?

explain plan for select count(*) from t_vc where n1=10 and n2=10;

select * from table(dbms_xplan.display);


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     6 |     9   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     6 |            |          |
|*  2 |   TABLE ACCESS FULL| T_VC |     1 |     6 |     9   (0)| 00:00:01 |
--------------------------------------------------------------------------

Nope, Optimizer cardinality is far off and 100 times off the correct cardinality.

Notice rows column at step 2 above and it is just 1. Essentially, CBO estimate boils down to
simplified formula:
# of rows~= total # of rows * (1/NDV for n1) * (1/NDV for n2)
= 10000 * (1/100) * (1/100) =1 row.
where selectivity is (1/NDV) in simple case, without considering histograms.

Far from the truth! There are at least 100 rows returned for this SQL.

Extended stats

Oracle 11g introduces extended stats to relieve some pain. In 11g, an extended stats can be added between columns, enabling CBO to consider correlation between these column values.


SELECT dbms_stats.create_extended_stats(
ownname=>user, tabname => ‘T_VC’,
extension => ‘(n1, n2)’ ) AS n1_n2_correlation
FROM dual;
N1_n2_correlation
—————————————————
SYS_STUBZH0IHA7K$KEBJVXO5LOHAS

Let’s collect stats again on this table and check the SQL plan.
exec dbms_stats.gather_Table_stats( user, ‘T_VC’, estimate_percent => null, method_opt => ‘for all columns size 254′);

explain plan for select count(*) from t_vc where n1=10 and n2=10;


--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time               |
--------------------------------------------------------------------------
| 0  | SELECT STATEMENT |        | 100 | 1200 | 9 (0)| 00:00:01          |
|* 1 | TABLE ACCESS FULL| T_VC   | 100 | 1200 | 9 (0)| 00:00:01
--------------------------------------------------------------------------

At last, optimizer truly understood the correlation in 11g.

Under the wrap

1. Adding an extended stats adds a new virtual column to the table. Here is the line from sqltrace.

alter table "CBQT"."T_VC" add (SYS_STUBZH0IHA7K$KEBJVXO5LOHAS
       as (sys_op_combined_hash(n1, n2)) virtual BY USER for statistics);

Virtual column name is cryptic and seems to have been derived from table_name, column name combinations.

2. A new deterministic hash function sys_op_combined_hash called by optimizer to populate
this virtual column values. This function returns unique value for unique combination of values passed.

col h1 format 99999999999999999999999999
select sys_op_combined_hash (1,1) h1 from dual;
H1
—————————-
7026129190895635777

select sys_op_combined_hash (1,2) h1 from dual;
H1
—————————-
298332787864732998

Collecting histograms on all columns collects histograms on this virtual column also.

Trace lines

Using the above histogram, CBO is able to find that there is a strong correlation between these two columns.

This is visible in the 10053 output.


SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T_VC[T_VC]
  Column (#1):
    NewDensity:0.005000, OldDensity:0.000050 BktCnt:10001, PopBktCnt:10001, PopValCnt:100, NDV:100
  Column (#2):
    NewDensity:0.005000, OldDensity:0.000050 BktCnt:10001, PopBktCnt:10001, PopValCnt:100, NDV:100
  Column (#5):
    NewDensity:0.005000, OldDensity:0.000050 BktCnt:10001, PopBktCnt:10001, PopValCnt:100, NDV:100
  ColGroup (#1, VC) SYS_STUBZH0IHA7K$KEBJVXO5LOHAS
    Col#: 1 2     CorStregth: 100.00
  ColGroup Usage:: PredCnt: 2  Matches Full: #0  Partial:  Sel: 0.0100
  Table: T_VC  Alias: T_VC
    Card: Original: 10001.000000  Rounded: 100  Computed: 100.00  Non Adjusted: 100.00
  Access Path: TableScan
    Cost:  9.11  Resp: 9.11  Degree: 0
      Cost_io: 9.00  Cost_cpu: 2404620
      Resp_io: 9.00  Resp_cpu: 2404620
  Best:: AccessPath: TableScan
         Cost: 9.11  Degree: 1  Resp: 9.11  Card: 100.00  Bytes: 0

Notice the colgroup line and corStrength field above. It is set to 100. corStrength is calcualted
using histograms for virtual column and Final cardinality estimates are multipled by corStrength.

# of rows~= total # of rows * (1/NDV for n1) * (1/NDV for n2)*corStrength
= 10000 * (1/100) * (1/100)*100 =100 rows.

Cardinality estimates are matching with reality, at last.

In the next section, we will discuss this further ;-)

Well, okay, it is time to introduce myself. I specialize in Oracle performance tuning,
oracle internals and E-business suite. I have over 15 years of experience as an Oracle DBA, currently,
I am employed with pythian [ Thanks Paul]. I am regular presentor in major conferences such as
hotsos, ukoug, rmoug etc. I am also an Oaktable member.

Some of my papers can be found in my personal blog
my papers and presentations

Blog at WordPress.com.