Oracle database internals by Riyaj

May 6, 2009

COLLABORATE 2009 presentation: 11g performance new feature.

I just presented about Oracle 11g new features specific to performance in COLLABORATE 2009, Orlando Florida. You can download presentation and paper from here:

11g performance specific new features – presentation
11g performance specific new features – paper

I met couple of new friends and many familiar faces.

Catherin Devlin introduced sqlpython tool to me. This tool is an user friendly replacement for sqlplus with many UNIX like goodies. This tool will be an useful addition to command line tools. You can see installation instructions for sqlpython here.

I also attended an interesting application performance panel discussion with fellow Oakie Mark W Farnheim, Mike Brown and Sandra Vucinic. We discussed few customer issues.

An interesting problem discussion is worth mentioning. A client clones production database to development database using full rapidclone methodology. But, access plans for a SQL statement is different between production and development, even though everything was exactly the same. I am hoping, that client will send 10053 trace files from both databases and I will blog about it if I get anything from that client. I think, bind peeking is causing the plan difference, but need to analyze 10053 trace file to confirm that. Another possibility is that, may be , he is not cloning Oracle Software and some software patches are not existing in cloned environment. Need trace files to proceed further here.

Of course, I met few other friends Jeremy Schneider, Dan Norris to mention few.

February 25, 2009

High global cache waits on tab$ ?

Yes, you read it correct. That shocked me too.

I was trying to understand global cache waits for a client. Naturally, I queried statspack tables and analyzed the data for just one day using a script. Surprisingly, tab$ came as top consumer of global cache waits. I was shocked by the revelations and couldn’t believe it! If something doesn’t make sense, look more closely, right?

Global cache waits

Database version is 9i. Statspack table stats$seg_stat is the source for this script and that table is populated from v$segment_stats. So, these column values (global_cache_cu_blocks_served and global_cache_cr_blocks_served) are cumulative. To find statistics for a specific period, we need to subtract column value of prior row from current row. Analytic function lag can be useful for this.

In this analytic function printed below, partitioning clause uses instance_number, startup_time and dataobj#. All rows with same value for these three columns will be considered in one data partition and then rows ordered by snap_id within that data partition. Lag will pull the row from prior snap_id in that partition. Then, we subtract from current value to get the difference. Please refer to this paper: Performance tuning with SQL new features – paper for more information about analytic functions.

  ...
   global_cache_cu_blocks_served -
   lag(global_cache_cu_blocks_served,1,0) over (partition by instance_number,startup_time, dataobj#, obj#
              order by  snap_id ) global_cache_cu_blocks_served,
...

Script and output

Complete script printed below and running that against a database.

Continue Reading

January 2, 2009

Cardinality feedback to resolve a Cache buffers chains latch contention issue

Earlier, I blogged about resolving cache buffers chains latch contention in my earlier entry , in which, root cause was excessive index access due to Nested Loops join. Recently, we resolved another similar issue.

Problem

CPU usage was very high in production database server in user% mode. Dynamic performance view v$session_wait indicated excessive waits for latch contention. Output from a script wait_details.sql shows that many sessions were waiting for ‘latch free’ event. Also, address for these latch children are the same, meaning all these sessions are trying to access one latch children.


SQL> @wait_details

   SID PID     EVENT         USERNAME  STATE               WAIT_TIME   WIS P1_P2_P3_TEXT
------ ------- ------------- --------- ---------- ------------------- --------- ----- -----------------------------------
    91  24242  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 0
   101   4884  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 0
   116  23899  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 0
   187  19499  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 0
   108  23498  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 3
   194  23701  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 0
   202  26254  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 4
   220  23274  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 0
   227  23643  latch free    CSTMOP    WAITED KNOWN TIME           2     0 address 69476807024-number 98-tries 0
   331  26519  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 0
   297  23934  latch free    CSTMOP    WAITING                     0     0 address 69476807024-number 98-tries 3
....

We can identify SQL causing latch contention querying v$session_wait. From the output below, SQL with hash_value 1509082258 is suspicious since there are many sessions executing that SQL and waiting / waited recently for ‘latch free’ event.


select substr(w.event, 1, 28 ) event, sql_hash_value, count(*)
from v$session_wait w, v$session s, v$process p
where s.sid=w.sid
and p.addr = s.paddr
and s.username is not null
and event not like '%pipe%'
and event not like 'SQL*%'
group by substr(w.event, 1, 28), sql_hash_value
;

EVENT                          SQL_HASH_VALUE   COUNT(*)
------------------------------ -------------- ----------
enqueue                               3740270          1
enqueue                             747790152          1
enqueue                            1192921796          1
latch free                          622474477          3
latch free                         1509082258         58 <---
latch free                         1807800540          1
global cache null to x                3740270          1
global cache null to x             1473456670          1
global cache null to x             3094935671          1
db file sequential read             109444956          1

Mapping to object_name
Keep Reading

Blog at WordPress.com.