Oracle database internals by Riyaj

September 13, 2009

Is plan_hash_value a final say?

I was reviewing a performance issue with a client recently. Problem is that increased global cache waits causing application slowdown affecting few critical business functions. Using one of my script gc_traffic.sql > and graphing the results with Excel spreadsheet, it is established that there is a marked increase in GC traffic today compared to week earlier. Similar jobs runs every day and so comparing two week days is sufficient to show the GC traffic increase. Graph is between total blocks and AWR snap time in 30 minutes interval. [Click the picture below to review the graph clearly.]

Identifying the object creating this increased GC traffic is essential to identify root cause. We were able to quickly determine that this increase in GC traffic was localized around few SQL statements using ADDM and AWR reports. We decided to focus on one SQL with an obvious increase in elapsed time compared to prior week. So, first question asked, is there a change in the plan? plan_hash_value was reviewed and quickly determined that there is no change in the plan_hash_value.

Little bit of history, there were few changes performed by the vendor over the weekend as a part of few bug fixes. Vendor’s argument was that since there is no change to the plan_hash_value, SQL access plan did not change and so, this can’t be due to vendor changes. My Client’s argument was that there were no changes to the environment and problem started after the application changes.

There are many different things that can go wrong without changes to the execution plan. We can ignore those conditions for now since (a) there has been no changes to the environment (b) no visible changes to the data (c) no error message and average CR recv time is consistent with prior weeks. Well, Let’s cut to the chase. It boiled down to a question “Can SQL plan change without a change in plan_hash_value?”. What do you think? Please answer in the poll below.

plan_hash_value and hash_value

Hash_value of a SQL statement is generated from the text of an SQL statement and plan_hash_value is generated from the execution plan of that SQL statement[ More accurately, from that child cursors' execution plan and exactly what is involved in generating plan_hash_value is not published]. It is a general belief that plan_hash_value will change even if there is a slightest change in the execution plan. But, that is not always the case!

Test case

We will use a small table to explore this issue.

prompt Test case #1: Initial test case with index on columns (n1, n2).
prompt ==========
create table t1 (n1 number, n2 number, v1 varchar2(100));
create index t1_n1 on t1(n1,n2);

explain plan for select * from t1 where n1=:b1 and n2=:b2;
select * from table (dbms_xplan.display);
prompt plan hash value in this case is 626762252

Plan hash value: 626762252
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    53 |  4134 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |    53 |  4134 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1 |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("N1"=TO_NUMBER(:B1) AND "N2"=TO_NUMBER(:B2))

So, we got the plan_hash_value as highlighted above. In the following test case we will recreate the index reordering the columns as (n2, n1).

prompt Test case #2: Index with re-ordered columns
prompt ==========
drop index t1_n1;
create index t1_n1 on t1(n2,n1);

Plan hash value: 626762252

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    53 |  4134 |     9   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |    53 |  4134 |     9   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1 |  5298 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("N2"=TO_NUMBER(:B2) AND "N1"=TO_NUMBER(:B1))

Notice that in the test result above filter predicate changed reflecting index column reordering. But the plan_hash_value did not change. Point is that execution plan can change without a change in plan_hash_value (due to change in the underlying tables).

Let’s modify that index dropping a column from the index in the next test case.

prompt Test case #3: dropping a column from index t1_n1.

drop index t1_n1;
create index t1_n1 on t1(n1);

explain plan for select * from t1 where n1=:b1 and n2=:b2;

Plan hash value: 626762252

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    53 |  4134 |    41   (3)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T1    |    53 |  4134 |    41   (3)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1 |  2119 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("N2"=TO_NUMBER(:B2))
   2 - access("N1"=TO_NUMBER(:B1))

We see that predicates changed but the plan_hash_value did not change. In the test case below, we will modify index to be a function based index and test this SQL statement. There are also few more self-evident test cases below.

prompt Test case #3: Index is a function based index. Still, no change in the plan_hash_value.

create index t1_n1 on t1(to_char(n1));
explain plan for select * from t1 where to_char(n1)=:b1 and n2=:b2;

Plan hash value: 626762252
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    53 |  4134 |   127   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T1    |    53 |  4134 |   127   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1 |  2119 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("N2"=TO_NUMBER(:B2))
   2 - access(TO_CHAR("N1")=:B1)

prompt Test case #4: Different schema and same SQL. plan_hash_value did not change.

Plan hash value: 626762252
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    78 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |    78 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1 |     3 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("N2"=TO_NUMBER(:B2))
   2 - access("N1"=TO_NUMBER(:B1))

plan_hash_value is also case sensitive for index names. In the test cases below, we will create case sensitive indices.

prompt Test case #5: Lower case index_name.. plan_hash_value changed.
create index "t1_n1" on t1(n1,n2);
explain plan for select * from t1 where n1=:b1 and n2=:b2;

Plan hash value: 2252949961
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    53 |  4134 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |    53 |  4134 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | t1_n1 |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

prompt Test case #6: Upper case index_name.. plan_hash_value did not change.
create index "T1_N1" on t1(n1,n2);
explain plan for select * from t1 where n1=:b1 and n2=:b2;
Plan hash value: 626762252

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    53 |  4134 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |    53 |  4134 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1 |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

prompt Test case #7: Space in the index name changes plan_hash_value though.
create index "T1_N1 " on t1(n1,n2);
explain plan for select * from t1 where n1=:b1 and n2=:b2;
Plan hash value: 1377541522
--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |    53 |  4134 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1     |    53 |  4134 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1  |     1 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Summary

In summary, plan_hash_value is a very good indicator to see if the plan changed or not. Don’t get me wrong, I also use plan_hash_value, but in addition to comparing CPU time and elapsed time. Execution plan can change even when there is no change to the plan_hash_value. Salient points from these test cases are:

  1. Plan_hash_value is dependent upon partial execution plan not on complete execution plan.
  2. If a predicate is moved from filter_predicate to access_predicate or vice-versa, it doesn’t affect plan_hash_value.
  3. Changes in the parallelism of the queries does not affect plan_hash_value. For example, if a query used 4 parallel slaves today and 16 parallel slaves yesterday, that change is not visible through plan_hash_value. This behavior is expected as parallel_adaptive_multiuser can allocate slaves depending upon the state of that instance.
  4. Plan_hash_value is case-sensitive to index/table names. Also sensitive to white-space characters.
  5. Plan_hash_value is not sensitive to index types. For example, if the index type is a function based index as in our test case #4, as long as, index name did not change plan_hash_value will remain the same.
  6. Plan_hash_value is not sensitive to schema either. SQL statement accessing different schemas can have same plan_hash_value too.

Back to our problem. One of the index was recreated removing a column and caused optimizer to apply filter predicates at table level increasing number of accesses to table block tremendously, leading to more logical reads, more Global cache waits etc. This problem was amplified since this SQL was executed very frequently and concurrently from all RAC instances.

In my client’s defense, this application change was tested thoroughly. But, alas, test data chosen for this performance test was not probing this specific issue. This performance issue did not show up in development. Essentially, chosen data in the performance benchmark suite was not an excellent choice.

As they say in Gaelic language “Go raimh maith agat” to my client for allowing me to post this blog.
This can be read in traditional format in
plan_hash_value_and_gc
Update 1: Added the document and also corrected a typo.

February 13, 2009

RMOUG Training Days 2009

Filed under: Uncategorized — orainternals @ 8:13 pm

I had a privilege of presenting two papers in RMOUG Training days 2009 for the past couple of days. It is always been a valuable conference to attend considering quality of presentations. This proved to be true and was an excellent conference this year too.

During my presentation, I promised that I will upload recent copy of my papers and presentations in my blog and here they are:

  1. Performance features 11g ppt
  2. Performance features 11g paper
  3. Battle of the nodes RAC performance myths presentation
  4. Battle of the nodes RAC performance myths paper

I also met many awesome presenters such as mogens nørgaard , Stephen Haisley, Jeff Needham, Gaja , Cary Millsapp , Daniel Fink , Danial Morgan , Joze Senegacnik , Carol Dacko, Dan Norris, Tanel , Jeremiah Wilton, Kevin Closson to name a few. Apologies, if I missed anybody, many valuable presentations and highly knowledgeable presenters attended this conference.

I attended few presentations:
A valuable presentation by Jeremiah Wilton, on breaking Oracle. Especially, when he created ORA-600 errors with “Ouch” and other funny remarks as arguments, whole room went in to laughter. This presentation can be downloaded from his blog .

Tanel had an interesting presentation about ‘How execution plan works?”. He explained, internally, how various row sources are simply calls to various function calls etc. Very valuable presentation indeed. Tanel said that he will upload his presentations to his blog soon.

Joze, from wonderful Slovenia, gave a presentation about SQL Plan Management. This is a very important new feature and will be extremely useful for plan stability.

Of course, I had few interesting offline and thought provoking discussions with many others. Stephen Haisley shared some new features about streams and it will be great to see that one day. I had a length conversation with Kevin Closson about NUMA, CPU cache line and his days with NUMA engineering and development (in sequent).

September 11, 2008

How to find objects creating nologging changes?

In an Oracle-l thread, a question was raised: How to find objects creating nologging changes?

What is a ‘nologging’ change?

Redo logging mechanism plays critical role in media recovery. Media recovery relies on archivelog files generated to roll forward the database. Standby database or dataguard recovery also relies on archivelog files. It is possible to do DML changes with minimal logging, aka nologging changes or direct mode inserts. For example, insert /*+ append */ can be used to populate the rows in to a table without generating much redo. This can invalidate standby database and might trigger rebuilding some or all parts of the standby database.

Nologging changes generates minimal redo, since the blocks are pre-formatted and written to disk directly. A redo record is generated invalidating a range of affected blocks. This invalidation redo record size is far smaller, for e.g. hundreds of blocks can be invalidated using just a single redo record. Of course, recovery is severely affected as the changes performed with nologging operations can NOT be reapplied / recovered.

Internals of nologging changes

Since nologging is all about redo records, dumping redo log file or archivelog file is a concrete way to see what happens under the hood. Let’s consider an example to explain internals of nologging changes.

We will create a table, insert rows, and review redo records closely.

  create table t4 ( a number) nologging tablespace users;
  REM switch log file so that we wil have smaller log files to dump.
  alter system switch logfile;
  REM Direct mode or nologging inserts
  insert /*+ append */ into t4 select object_id from dba_objects;
  commit;
  alter system switch logfile;

Prior online redo log file contains nologging changes from table t4 above. We could dump that log file in its entirety, but we will dump just a layer specific to direct mode changes alone to limit trace file size. Layer 19 is for nologging changes. Following script will dump last online redo log file for layer 19.

 set serveroutput on size 1000000
 declare
   v_sqltext varchar2(255);
 begin
   -- Find name of latest but one log file.
   select 'alter system dump logfile '||chr(39)||
                member||chr(39) || ' layer 19 '
   into v_sqltext
   from
   v$log  lg, v$logfile lgfile
   where lg.group# = lgfile.group# and
   lg.sequence# =
       (select sequence#-1 from v$log where status='CURRENT' )
   and rownum <2;
   dbms_output.put_line ('Executing :'||v_sqltext);
   --Execute above SQL to dump log file and print it to see.
   execute immediate v_sqltext;
  end;
/

alter system dump logfile 'D:\ORACLE\ORADATA\ORCL11G\REDO02.LOG' layer 19

Above command generated a trace file in user_dump_dest directory and few redo records from that trace file printed below:

...
REDO RECORD - Thread:1 RBA: 0x0000b0.0000000f.00d4 LEN: 0x0034 VLD: 0x01
SCN: 0x0000.00486397 SUBSCN:  1 09/03/2008 10:54:38
CHANGE #1 INVLD AFN:4 DBA:0x01038824 BLKS:0x0001 OBJ:72852 SCN:0x0000.00486397 SEQ:  1 OP:19.2
Direct Loader invalidate block range redo entry

REDO RECORD - Thread:1 RBA: 0x0000b0.00000014.0118 LEN: 0x0034 VLD: 0x01
SCN: 0x0000.0048639a SUBSCN:  1 09/03/2008 10:54:38
CHANGE #1 INVLD AFN:4 DBA:0x01038832 BLKS:0x0001 OBJ:72852 SCN:0x0000.0048639a SEQ:  1 OP:19.2
Direct Loader invalidate block range redo entry
...

Explanation of redo records

Let’s review following two lines.

CHANGE #1 INVLD AFN:4 DBA:0x01038832 BLKS:0x0001 OBJ:72852 SCN:0x0000.0048639a SEQ:  1 OP:19.2
Direct Loader invalidate block range redo entry

OP:19.2 indicates that layer is 19 for this change vector and 2 is the opcode. This vector also specifies that BLKS=1 and DBA as 1038832. Essentially this change vector says “Invalidate range of blocks starting at block DBA 0×01038832 for 1 block”. This could be for a range of blocks too.

For nologging changes a block range invalidation redo generated, block formatted, populated with rows and written directly to disk. But, standby database and media recovery rely on redo records to replay these changes. When recovery code encounters above direct loader invalidate redo record, it throws a warning to alert log and simply mark that range of blocks as invalid. Any attempt to access those blocks will throw ORA-1578 block corruption error.

Just to reiterate, standby and media recovery are affected by nologging changes. But, Current database is still fine and there is no corruption in primary database. If we backup the primary database again or rebuild that tablespace in standby database, we avoid corruption. Point is that, if there is a business need to have nologging changes, consider taking a backup immediately after nologging changes. Of course, if there is a standby database involved, then that tablespace need to be re-copied from production.

Back to our problem

We need to find the object causing nologging changes. Field OBJ:72852 shows object_id in decimal. We can query dba_objects to find object_name.

select owner, object_name from dba_objects where
object_id=72852 or data_object_id=72852
SQL> /
OWNER                          OBJECT_NAM
------------------------------ ----------
CBQT                           T4

In essence, procedure to find object_name causing nologging generation is:

    1. Find archivelog containing nologging changes. If timestamp is known, then v$archived_log can be queried to find archivelog file name.
    2. Dump archivelog file for layer 19: alter system dump logfile ‘filename’ layer 19;
    3. Locate redo record with opcode 19.2.
    4. Query dba_objects for that object_id.More information about redo internals can be found in redo internals doc

March 8, 2008

Hotsos 2008 is over

Filed under: Uncategorized — orainternals @ 6:26 am

I have been a regular presenter in hotsos performance intensive conference, three years in a row. This year my presentation was about cost based subquery transformation. Optimizer has become far smarter in 11g, and subqueries are transformed and cost calculated using physical optimizer. Enough about that.

As usual, I met many of my friends, Mark Bobak, Alex Gorbachev, Robyn Sands, Joze Sengachnik, Kyle Hailey, Jared Still, Wolfgang B, Carol Decko, Stephen Knecht, Christian Antognini and so on. We had a wonderful time on Monday in Bob’s steak house, good time on Wednesday in downtown dallas.

Now, I can’t wait for next HOTSOS!

January 9, 2007

At last, my blog is up.

Filed under: Uncategorized — orainternals @ 9:32 pm

This has been one of my long pending wishes to have a blog and update it regularly.

I am an Oracle Database Administrator and I specialize in Oracle database internals and performance tuning. My emphasis currently in Real Application Clusters and E-Business Suite.

Hopefully, I will keep this blog updated and love to hear your comments.

I have also written couple of articles, presented few and will upload them here soon.

Blog at WordPress.com.