Oracle database internals by Riyaj

Discussions about Oracle performance tuning, RAC, Oracle internal & E-business suite.

Accessing HCC compressed objects using an index

Posted by Riyaj Shamsudeen on June 7, 2016

Problem

I came across another strange SQL performance issue: Problem was that a SQL statement was running for about 3+ hours in an User Acceptance (UA) database, compared to 1 hour in a development database. I ruled out usual culprits such as statistics, degree of parallelism etc. Reviewing the SQL Monitor output posted below, you can see that the SQL statement has already done 6 Billion buffer gets and steps 21 through 27 were executed 3 Billion times so far.

Statistics and execution plan

Please refer to the execution plan. Due to the functionality of the SQL statement, the execution plan seems acceptable. 3 Billion executions in to the view (at step 21) is also expected, however, why would the SQL statement runs much slower in the UA database ? [ Table names are changed for security reasons. ].

Also, out of 9950 seconds of elapsed time, 9482 seconds spent on CPU. Obviously, most of the time is spent on CPU and not waiting for any other wait events.

Global Stats
============================================================================================
| Elapsed |   Cpu   |    IO    | Concurrency | Cluster  |  Other   | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) | Waits(s) |  Gets  | Reqs | Bytes |
============================================================================================
|   9950 |   9482 |     0.02 |        0.00 |     0.30 |      467 |     6G |   55 |   3MB |
============================================================================================

SQL Plan Monitoring Details (Plan Hash Value=2383560540)
=====================================================================================================================================================================================================
| Id    |                  Operation                  |              Name              |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read | Read  | Mem  | Activity | Activity Detail |
|       |                                             |                                | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs | Bytes |      |   (%)    |   (# samples)   |
=====================================================================================================================================================================================================
|     0 | INSERT STATEMENT                            |                                |         |       |           |        |     1 |          |      |       |      |          |                 |
|     1 |   LOAD TABLE CONVENTIONAL                   | TABLE_D                        |         |       |           |        |     1 |          |      |       |      |          |                 |
|     2 |    NESTED LOOPS                             |                                |         |       |           |        |     1 |          |      |       |      |          |                 |
|     3 |     NESTED LOOPS                            |                                |       1 |  179G |           |        |     1 |          |      |       |      |          |                 |
|     4 |      NESTED LOOPS                           |                                |       1 |  179G |           |        |     1 |          |      |       |      |          |                 |
|     5 |       VIEW                                  |                                |       1 |  179G |           |        |     1 |          |      |       |      |          |                 |
|     6 |        SORT GROUP BY                        |                                |       1 |  179G |           |        |     1 |          |      |       |      |          |                 |
|     7 |         HASH JOIN                           |                                |       1 |  179G |           |        |     1 |          |      |       | 213K |          |                 |
|     8 |          FILTER                             |                                |         |       |           |        |     1 |          |      |       |      |          |                 |
|  -> 9 |           HASH JOIN OUTER                   |                                |      18 |  179G |      9945 |     +6 |     1 |        0 |      |       |  10M |          |                 |
| -> 10 |            VIEW                             |                                |    1781 |  179G |      9945 |     +6 |     1 |    61848 |      |       |      |          |                 |
| -> 11 |             NESTED LOOPS                    |                                |    1781 |  179G |      9945 |     +6 |     1 |    61848 |      |       |      |     0.45 | Cpu (45)        |
| -> 12 |              MERGE JOIN                     |                                |     22G | 98170 |      9945 |     +6 |     1 |       3G |      |       |      |          |                 |
| -> 13 |               SORT JOIN                     |                                |    822K |  4810 |      9945 |     +6 |     1 |     228K |      |       |  36M |          |                 |
|    14 |                INDEX STORAGE FAST FULL SCAN | TABLE_A_PK                     |    822K |   876 |         1 |     +6 |     1 |     822K |   21 |   3MB |      |          |                 |
| -> 15 |               FILTER                        |                                |         |       |      9945 |     +6 |  228K |       3G |      |       |      |     0.64 | Cpu (64)        |
|    16 |                SORT JOIN                    |                                |   61637 |  3180 |      9942 |     +6 |  228K |       4G |      |       |   4M |    19.75 | Cpu (1962)      |
|    17 |                 VIEW                        |                                |   61637 |  1950 |         1 |     +6 |     1 |    40467 |      |       |      |          |                 |
|    18 |                  UNION-ALL                  |                                |         |       |         1 |     +6 |     1 |    40467 |      |       |      |          |                 |
|    19 |                   TABLE ACCESS STORAGE FULL | TABLE_B                        |   48439 |  1432 |         6 |     +1 |     1 |    16063 |    1 |  8192 |      |     0.02 | Cpu (2)         |
|    20 |                   TABLE ACCESS STORAGE FULL | TABLE_C                        |   13198 |   519 |         4 |     +3 |     1 |    24404 |   17 | 136KB |      |     0.01 | Cpu (1)         |
| -> 21 |              VIEW                           |                                |       1 |     8 |      9945 |     +6 |    3G |    61848 |      |       |      |     1.03 | Cpu (102)       |
| -> 22 |               UNION ALL PUSHED PREDICATE    |                                |         |       |      9945 |     +6 |    3G |    71691 |      |       |      |    12.25 | Cpu (1217)      |
| -> 23 |                FILTER                       |                                |         |       |      9945 |     +6 |    3G |    38261 |      |       |      |     1.49 | Cpu (148)       |
| -> 24 |                 TABLE ACCESS BY INDEX ROWID | TABLE_D                        |       1 |     4 |      9945 |     +6 |    3G |    38261 |      |       |      |     6.53 | Cpu (649)       |
| -> 25 |                  INDEX RANGE SCAN           | TABLE_D_PK                     |       1 |     3 |      9946 |     +5 |    3G |    38261 |    1 |  8192 |      |    27.01 | Cpu (2683)      |
| -> 26 |                TABLE ACCESS BY INDEX ROWID  | TABLE_E                        |       1 |     4 |      9945 |     +6 |    3G |    33430 |      |       |      |     5.40 | Cpu (536)       |
| -> 27 |                 INDEX RANGE SCAN            | TABLE_E_PK                     |       1 |     3 |      9947 |     +4 |    3G |    33430 |   10 | 81920 |      |    25.43 | Cpu (2526)      |
|    28 |            TABLE ACCESS STORAGE FULL        | TABLE_C                        |    737K |  8000 |           |        |       |          |      |       |      |          |                 |
|    29 |          TABLE ACCESS STORAGE FULL          | TABLE_F                        |    406K |  1011 |           |        |       |          |      |       |      |          |                 |
|    30 |       TABLE ACCESS BY INDEX ROWID           | TABLE_F                        |       1 |     3 |           |        |       |          |      |       |      |          |                 |
|    31 |        INDEX RANGE SCAN                     | TABLE_F_PERF                   |       3 |     2 |           |        |       |          |      |       |      |          |                 |
|    32 |      INDEX UNIQUE SCAN                      | TABLE_F_UK                     |       1 |       |           |        |       |          |      |       |      |          |                 |
|    33 |     TABLE ACCESS BY INDEX ROWID             | TABLE_F                        |       1 |     1 |           |        |       |          |      |       |      |          |                 |
=====================================================================================================================================================================================================

Time to use Tanel’s snapper utility to review the session level statistics.

-- Session Snapper v2.01 by Tanel Poder ( http://www.tanelpoder.com )

----------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                               ,         DELTA, HDELTA/SEC,    %TIME, GRAPH
----------------------------------------------------------------------------------------------------------------------
    875, APPUSER   , STAT, session logical reads                   ,      20864847,    695.49k,
    875, APPUSER   , STAT, consistent gets                         ,      20864658,    695.49k,
    875, APPUSER   , STAT, consistent gets from cache              ,      20864658,    695.49k,
    875, APPUSER   , STAT, consistent gets pin                     ,      20641092,    688.04k,
    875, APPUSER   , STAT, consistent gets pin (fastpath)          ,      20641091,    688.04k,
    875, APPUSER   , STAT, consistent gets examination             ,        223563,      7.45k,
    875, APPUSER   , STAT, consistent gets examination (fastpath)  ,        223563,      7.45k,
    875, APPUSER   , STAT, logical read bytes from cache           ,  170923122688,       5.7G, <-
    875, APPUSER   , STAT, no work - consistent read gets          ,      20597772,    686.59k,
    875, APPUSER   , STAT, Cached Commit SCN referenced            ,          3181,     106.03,
    875, APPUSER   , STAT, table fetch by rowid                    ,           132,        4.4,
    875, APPUSER   , STAT, table fetch continued row               ,          3639,      121.3,
    875, APPUSER   , STAT, index scans kdiixs1                     ,      20317952,    677.27k,
    875, APPUSER   , STAT, EHCC CUs Decompressed                   ,           132,        4.4,<-
    875, APPUSER   , STAT, EHCC Archive CUs Decompressed           ,           132,        4.4,
    875, APPUSER   , STAT, EHCC Compressed Length Decompressed     ,      31102232,      1.04M,
    875, APPUSER   , STAT, EHCC Decompressed Length Decompressed   ,     275203587,      9.17M,
    875, APPUSER   , STAT, EHCC Columns Decompressed               ,           660,         22,
    875, APPUSER   , STAT, EHCC Total Columns for Decompression    ,          1508,      50.27,
    875, APPUSER   , STAT, EHCC Total Rows for Decompression       ,       3016607,    100.55k,
    875, APPUSER   , STAT, EHCC Pieces Buffered for Decompression  ,          3771,      125.7,
    875, APPUSER   , STAT, EHCC Total Pieces for Decompression     ,          3956,     131.87,
    875, APPUSER   , STAT, EHCC Rowid CUs Decompressed             ,           132,        4.4,<-
    875, APPUSER   , STAT, buffer is pinned count                  ,             7,        .23,
    875, APPUSER   , STAT, buffer is not pinned count              ,        227503,      7.58k,
    875, APPUSER   , TIME, DB CPU                                  ,      31988138,      1.07s,   106.6%, |@@@@@@@@@@|
    875, APPUSER   , TIME, sql execute elapsed time                ,      32019381,      1.07s,   106.7%, |@@@@@@@@@@|
    875, APPUSER   , TIME, DB time                                 ,      32019381,      1.07s,   106.7%, |@@@@@@@@@@|

Analysis

WTF(Weird Thing Found)! 5.7GB of logical read bytes from cache per second! I didn’t expect to see EHCC statistics, associated with HCC compression/decompression operations, either. Interestingly, 4.4 EHCC Compression Units (CU) are decompressed per second, resulting in ~100K rows decompressed per second.

Querying the compression attributes of those two objects TABLE_D and TABLE_E, I realized that the segments are compressed for archive high compression. Reviewing the execution plan again with the realization that the segments are compressed, the performance issue suddenly makes sense ( I should have done CPU profiling to verify the logic, but, I was under time pressure to resolve the issue).
Here is the rewritten execution logic of step 24 and 25:

       For each ROWID from step 25 in the execution plan:
            Access TABLE_D ( Step 24). 
            Since the segment TABLE_D is compressed, 
               for each of those ROWID based accesses, decompress HCC CUs of that segment and then search the ROWID. 

This loop is repeated at least 3 Billion times leading to prolonged longer run time. It seems to me that decompressed rows are not remembered and the decompression is repeated for each ROWID (requires further confirmation though). Similar decompression cycle is used on TABLE_E also, leading to many hours of run time.

Disabling compression at the table level and reorganizing the table, resolved the performance issue. A script bug created the tables with HCC compression when the changes were propagated from DEV to UA.

Further thoughts

I have encountered this root cause (excessive decompression of HCC segments) numerous times so far, and my hope is that future database version will resolve this issue.

While I agree that in this specific case, the HCC compression is not suitable for the table, I would expect that the Oracle optimizer to do one or more of the following things:
(a) Collect all ROWIDs and then access the table segment once to access the row pieces. This will decompress the compression units just once.
(b) Or remember the decompressed rows in some sort of cache (PGA) and access that cache or in a session-specic temporary table. I realize that this plan may accidentally increase the PGA usage, however, I guess, the memory usage can be limited by other means.
(c) Or Use Hash join if the estimated rows are higher, in this case, the prior row source is estimated to return 22 Billion rows. I doubt hash join is viable for this specific execution plan though (meaning, requires further rewrite).

May be, this is a new feature in the future releases (I am not a Beta participant for 12.2 ). Problem reproduced in 12.1.0.2 database on Exadata X5-2.

PS: Better yet, I rewrote the SQL statement little bit and tuned it further after understanding the functionality of the SQL statement completely. Rewritten statement completes within a few minutes now. Some times, application knowledge comes handy to tune SQL statements🙂

3 Responses to “Accessing HCC compressed objects using an index”

  1. Anand said

    Hi Riyaj,

    Thank you once again for sharing this!!

    It might sound silly, but can you elaborate on CPU profiling. How it would be done. Any blog post to read.

    Regards,
    Anand

  2. Hi Riyaj,

    interesting – it’s probably related to the fact that the critical table lookups are part of a Nested Loop join and “Pushed Predicate” transformation. In principle for single table index range scans Oracle has introduced means to handle the HCC decompression problem more efficiently, see Jonathan’s posts about the issue:

    https://jonathanlewis.wordpress.com/2012/08/19/compression-units-5/
    https://jonathanlewis.wordpress.com/2014/02/12/caution-hints/

    So I wonder if the CLUSTER_BY_ROWID functionality only applies to the single table index access path or whether it could have somehow be enabled in your case here and helped – of course given your re-write of the query it’s no longer really relevant🙂

    Randolf

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: