<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Oracle database internals by Riyaj</title>
	<atom:link href="http://orainternals.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://orainternals.wordpress.com</link>
	<description>Discussions about Oracle performance tuning, RAC, Oracle internal &#38; E-business suite.</description>
	<lastBuildDate>Mon, 17 Jun 2013 22:00:47 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='orainternals.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Oracle database internals by Riyaj</title>
		<link>http://orainternals.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://orainternals.wordpress.com/osd.xml" title="Oracle database internals by Riyaj" />
	<atom:link rel='hub' href='http://orainternals.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Dude, where is my redo?</title>
		<link>http://orainternals.wordpress.com/2013/06/12/dude-where-is-my-redo/</link>
		<comments>http://orainternals.wordpress.com/2013/06/12/dude-where-is-my-redo/#comments</comments>
		<pubDate>Wed, 12 Jun 2013 19:12:38 +0000</pubDate>
		<dc:creator>Riyaj Shamsudeen</dc:creator>
				<category><![CDATA[11g]]></category>
		<category><![CDATA[Oracle database internals]]></category>
		<category><![CDATA[Performance tuning]]></category>
		<category><![CDATA[RAC]]></category>
		<category><![CDATA[identify objects redo]]></category>
		<category><![CDATA[redo internals]]></category>
		<category><![CDATA[segment_stats.sql]]></category>
		<category><![CDATA[v$logmnr_contents]]></category>
		<category><![CDATA[v$segment_stats]]></category>

		<guid isPermaLink="false">http://orainternals.wordpress.com/?p=1333</guid>
		<description><![CDATA[This blog entry discusses a method to identify the objects generating the most redo. First, we will establish that redo size increased sharply, and then identify the objects generating more redo. Unfortunately, redo size is not tracked at the segment level. However, you can make an educated guess using the ‘db block changes’ statistic. [&#8230;]]]></description>
				<content:encoded><![CDATA[<p> This blog entry discusses a method to identify the objects generating the most redo. First, we will establish that redo size increased sharply, and then identify the objects generating more redo. Unfortunately, redo size is not tracked at the segment level. You can make an educated guess using the ‘db block changes’ statistic, but to identify the objects generating more redo reliably, you must use the LogMiner utility. </p>
<p><b> Detecting redo size increase </b></p>
<p> AWR tables (which require the Diagnostics Pack license) can be queried to identify the increase in redo size. The following query spools the daily redo size per instance for the last 90 days. You can open the output file redosize.lst in an Excel spreadsheet and graph the data to visualize the change in redo size. Use the pipe symbol as the delimiter when opening the file in Excel. </p>
<pre>
spool redosize.lst
REM  You need Diagnostic Pack licence to execute this query!
REM  Author: Riyaj Shamsudeen
col begin_interval_time format a30
set lines 160 pages 1000
col end_interval_time format a30
set colsep '|'
alter session set nls_date_format='DD-MON-YYYY';
with redo_sz as (
SELECT  sysst.snap_id, sysst.instance_number, begin_interval_time ,end_interval_time ,  startup_time,
VALUE - lag (VALUE) OVER ( PARTITION BY  startup_time, sysst.instance_number
                ORDER BY begin_interval_time, startup_time, sysst.instance_number) stat_value,
EXTRACT (DAY    FROM (end_interval_time-begin_interval_time))*24*60*60+
            EXTRACT (HOUR   FROM (end_interval_time-begin_interval_time))*60*60+
            EXTRACT (MINUTE FROM (end_interval_time-begin_interval_time))*60+
            EXTRACT (SECOND FROM (end_interval_time-begin_interval_time)) DELTA
  FROM sys.wrh$_sysstat sysst , DBA_HIST_SNAPSHOT snaps
WHERE (sysst.dbid, sysst.stat_id) IN ( SELECT dbid, stat_id FROM sys.wrh$_stat_name WHERE  stat_name='redo size' )
AND snaps.snap_id = sysst.snap_id
AND snaps.dbid =sysst.dbid
AND sysst.instance_number=snaps.instance_number
and begin_interval_time &gt; sysdate-90
)
select instance_number, 
  to_date(to_char(begin_interval_time,'DD-MON-YYYY'),'DD-MON-YYYY') dt 
, sum(stat_value) redo1
from redo_sz
group by  instance_number,
  to_date(to_char(begin_interval_time,'DD-MON-YYYY'),'DD-MON-YYYY') 
order by instance_number, 2
/
spool off
</pre>
<p>
Visualizing the data helps you to quickly identify anomalies in the redo generation pattern. Here is an example graph created from the Excel spreadsheet; notice that redo size increased recently.
</p>
<p><a href="http://orainternals.files.wordpress.com/2013/06/screenshot_redo.jpg"><img src="http://orainternals.files.wordpress.com/2013/06/screenshot_redo.jpg?w=300&#038;h=218" alt="screenshot_redo" width="300" height="218" class="aligncenter size-medium wp-image-1335" /></a> </p>
<p><b> Guess the object using ‘db block changes’ statistics </b></p>
<p> A quick method to guess the objects generating more redo is to use the ‘db block changes’ statistic. The reasoning behind this technique is that if an object is modified heavily, then that object will probably generate more redo. This is not entirely accurate, as less frequently modified objects can generate more redo and vice versa. If you are lucky, one or two objects will stand out as a problem and you can review those segments further to reduce redo size. </p>
<pre>
@segment_stats.sql
To show all segment level statistics in one screen

Enter value for statistic_name: db block changes
old   6:           where value &gt;0  and statistic_name like '%'||'&amp;&amp;statistic_name' ||'%'
new   6:           where value &gt;0  and statistic_name like '%'||'db block changes' ||'%'
   INST_ID STATISTIC_NAME                 OWNER        OBJECT_NAME        OBJECT_TYP        VALUE   PERC
---------- ------------------------------ ------------ ----------------- ---------- ------------ ------
         1 db block changes               INV          SALES_TEMP_N1     INDEX        3831599856  48.66
         3                                INV          MTL_RESERV        TABLE        3794818912  23.78
         3                                ZX           DET_FACTORS_      INDEX        2468120576  15.47
         2                                APPLSYS      FND_TAB           TABLE        2346839248  16.33
….
  
</pre>
<p> The segment_stats.sql script can be found in <a href="http://orainternals.files.wordpress.com/2013/06/segment_stats.doc">segment_stats</a>. </p>
<p><b> Identify objects using logminer </b></p>
<p> A more reliable method to identify the objects generating more redo uses the log mining package (dbms_logmnr). Objects can be identified with the following steps: </p>
<p>
Step 1: Start LogMiner as the sys or system user in SQL*Plus. The example given here is for an archive log file of the finprod2 instance.
</p>
<pre>
begin
  sys.dbms_logmnr.ADD_LOGFILE ('/opt/app/prod/finprod2/arch/finprod_212_2_1212121221.arch');
end;
/
begin
  sys.dbms_logmnr.START_LOGMNR;
end;
/
</pre>
<p>Step 2: Create a table by querying the v$logmnr_contents dynamic performance view. I tend to create a separate table for each archive log file for two reasons: (a) to improve query performance, and (b) I haven’t tested this thoroughly with multiple archive log files. The following SQL statement finds the length of a redo record by subtracting the RBA (Redo Byte Address) of the current record from the RBA of the next record. The redo byte address gives the physical location of a redo record in a redo log file, so the distance between the current redo record and the next one is the length of the current record.
</p>
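<p>The position arithmetic can be sketched in a few lines of shell. This is only an illustration: the block numbers and byte offsets are made-up values, the 512-byte block size is an assumption that holds on some platforms only, and rba_to_pos is a hypothetical helper name.</p>
<pre>
# Hypothetical values for two consecutive redo records (not taken from a real log).
bsz=512                       # assumed redo block size in bytes

# Byte offset of a redo record in the log file:
# block number * block size + byte offset within the block.
rba_to_pos() {
  echo $(( $1 * bsz + $2 ))
}

curpos=$(rba_to_pos 100 16)   # record at rbablk=100, rbabyte=16
nextpos=$(rba_to_pos 101 40)  # next record at rbablk=101, rbabyte=40
echo "curpos=$curpos nextpos=$nextpos length=$(( nextpos - curpos ))"
</pre>
<p>With these values the first record starts at byte 51216, the next one at byte 51752, and the first record is therefore 536 bytes long.</p>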
<p>
  Update 1: As Greg pointed out in the comments section, my script used a hard-coded redo block size of 512 bytes, which is correct on Solaris and Linux platforms. On the HP platform, however, the redo block size is 1024 bytes. You can use the following SQL statement to identify the redo block size. I have modified the create table script to query the redo block size dynamically. </p>
<pre>
SQL&gt;select max(lebsz) from x$kccle;
MAX(LEBSZ)
----------
       512
</pre>
<pre>
drop table redo_analysis_212_2;
CREATE TABLE redo_analysis_212_2 nologging AS
SELECT data_obj#, oper,
  rbablk * le.bsz + rbabyte curpos,
  lead(rbablk*le.bsz+rbabyte,1,0) over (order by rbasqn, rbablk, rbabyte) nextpos
FROM
  ( SELECT DISTINCT data_obj#, operation oper, rbasqn, rbablk, rbabyte
  FROM v$logmnr_contents
  ORDER BY rbasqn, rbablk, rbabyte
  ) ,
  (SELECT MAX(lebsz) bsz FROM x$kccle ) le 
/
</pre>
<p>Step 3: Query the table to identify the object names. In this step, we join the table just created to the obj$ table to identify the objects generating the redo. An outer join is needed as the object may have been dropped recently. START indicates the redo record for the start of a transaction and COMMIT indicates the redo record for the end of a transaction.</p>
<pre>
set lines 120 pages 40
column data_obj# format  9999999999
column oper format A15
column obj_name format A60
column total_redo format 99999999999999
compute sum label 'Total Redo size' of total_Redo on report
break on report
spool /tmp/redo_212_2.lst
select data_obj#, oper, obj_name, sum(redosize) total_redo
from
(
select data_obj#, oper, obj.name obj_name , nextpos-curpos-1 redosize
from redo_analysis_212_2 redo1, sys.obj$ obj
where (redo1.data_obj# = obj.obj# (+) )
and  nextpos !=0 -- For the boundary condition
and redo1.data_obj#!=0
union all
select data_obj#, oper, 'internal ' , nextpos-curpos  redosize
from redo_analysis_212_2 redo1
where  redo1.data_obj#=0
and nextpos!=0
)
group by data_obj#, oper, obj_name
order by 4
/
...
      46346 INSERT          WSH_EXCEPTIONS                        87006083
   12466144 INTERNAL        MSII_N9                               95800577
   12427363 INTERNAL        MSII_N1                               96445137
          0 START           internal                             125165844
          0 COMMIT          internal                             205600756
   12960642 UPDATE          XLA_GLT_1234567890                   243625297
                                                           ---------------
Total Redo                                                      3681252096
spool off

</pre>
<p> Notice that the objects identified using LogMiner do not match the objects from the db block changes statistics. In this example, the discrepancy is probably because the segment statistics are cumulative since instance startup, so they may not reflect recent activity.
</p>
<p>In summary, the LogMiner utility can be used to identify the objects generating more redo. This will help you understand why redo generation is higher, and may give you a way to reduce it.</p>
]]></content:encoded>
			<wfw:commentRss>http://orainternals.wordpress.com/2013/06/12/dude-where-is-my-redo/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/de30d27adb6aee87e455780e8cb19e7b?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">orainternals</media:title>
		</media:content>

		<media:content url="http://orainternals.files.wordpress.com/2013/06/screenshot_redo.jpg?w=300" medium="image">
			<media:title type="html">screenshot_redo</media:title>
		</media:content>
	</item>
		<item>
		<title>Clusterware Startup</title>
		<link>http://orainternals.wordpress.com/2013/06/05/clusterware-startup/</link>
		<comments>http://orainternals.wordpress.com/2013/06/05/clusterware-startup/#comments</comments>
		<pubDate>Wed, 05 Jun 2013 18:08:49 +0000</pubDate>
		<dc:creator>Riyaj Shamsudeen</dc:creator>
				<category><![CDATA[11g]]></category>
		<category><![CDATA[Oracle database internals]]></category>
		<category><![CDATA[RAC]]></category>
		<category><![CDATA[clusterware startup]]></category>
		<category><![CDATA[ohasd startup]]></category>
		<category><![CDATA[ohasdrun]]></category>
		<category><![CDATA[ohasdstr]]></category>
		<category><![CDATA[RC scripts clusterware]]></category>

		<guid isPermaLink="false">http://orainternals.wordpress.com/?p=1327</guid>
		<description><![CDATA[The restart of a UNIX server calls initialization scripts to start processes and daemons. Every platform has a unique directory structure and follows its own method to implement the server startup sequence. On the Linux platform (prior to Linux 6), initialization scripts are started by calling scripts in the /etc/rcX.d directories, where X denotes the run level of [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>
The restart of a UNIX server calls initialization scripts to start processes and daemons.  Every platform has a unique directory structure and follows its own method to implement the server startup sequence. On the Linux platform (prior to Linux 6), initialization scripts are started by calling scripts in the /etc/rcX.d directories, where X denotes the run level of the UNIX server. Typically, Clusterware is started at run level 3. For example, the ohasd daemon is started by the /etc/rc3.d/S96ohasd file, with start supplied as an argument. The file S96ohasd is a link to /etc/init.d/ohasd.
</p>
<pre>
S96ohasd -&gt; /etc/init.d/ohasd

/etc/rc3.d/S96ohasd start  # init daemon starting ohasd.

</pre>
<p>Similarly, a server shutdown calls scripts in the rcX.d directories. For example, ohasd is shut down by calling the K15ohasd script:</p>
<pre>
K15ohasd -&gt; /etc/init.d/ohasd
/etc/rc3.d/K15ohasd stop  #UNIX daemons stopping ohasd

</pre>
<p>
In summary, server startup calls files matching the pattern S* in the /etc/rcX.d directories. The scripts are called in the lexical order of their names. For example, S10cscape will be called before S96ohasd, as S10cscape occurs earlier in the lexical sequence.</p>
<p>Search the web if you want to learn more about the RC startup sequence. Of course, Linux 6 introduces the Upstart feature and the mechanism is a little different: <a href="http://en.wikipedia.org/wiki/Upstart" rel="nofollow">http://en.wikipedia.org/wiki/Upstart</a>
</p>
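<p>The lexical ordering is easy to see with a throwaway directory. The following is a minimal sketch, not the real rc mechanism: it creates empty files in a temporary directory instead of touching /etc/rc3.d, S55sshd is a made-up name added for illustration, and start_order is a hypothetical helper.</p>
<pre>
# Build a temporary directory that mimics /etc/rc3.d.
rcdir=$(mktemp -d)
touch "$rcdir/S96ohasd" "$rcdir/S10cscape" "$rcdir/S55sshd" "$rcdir/K15ohasd"

# Print the start scripts in the order in which they would be called.
# The shell expands S* in sorted order, which is the lexical order of the names.
start_order() {
  for script in "$1"/S*; do
    basename "$script"
  done
}

start_order "$rcdir"
rm -r "$rcdir"
</pre>
<p>The K15ohasd file is skipped because it does not match S*, and S10cscape is listed ahead of S96ohasd.</p>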
<p><b> That’s not the whole story! </b></p>
<p>
Have you ever wondered why ‘crsctl start crs’ returns immediately? You can guess that Clusterware is started in the background, as the command returns to the UNIX prompt almost immediately. Executing the crsctl command just modifies the content of the ohasdrun file to ‘restart’. It doesn’t actually perform the task of starting the Clusterware. The init.ohasd daemon reads the ohasdrun file every few seconds and starts the Clusterware if the file content has changed to ‘restart’.
</p>
<pre>
# cat  /etc/oracle/scls_scr/oel6rac1/root/ohasdrun
restart
</pre>
<p>If you stop HAS using ‘crsctl stop has’, then the ohasdrun file content is modified to ‘stop’, and so the init.ohasd daemon will not restart Clusterware. However, the stop command is synchronous and also performs the shutdown of the Clusterware itself.
</p>
<pre>
# crsctl stop has
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'oel6rac1'
CRS-2673: Attempting to stop 'ora.crsd' on 'oel6rac1'
..
</pre>
<p>The content of ohasdrun is modified to stop:</p>
<pre>
# cat  /etc/oracle/scls_scr/oel6rac1/root/ohasdrun
stop # 
</pre>
<p>In a nutshell, init.ohasd daemon is monitoring the ohasdrun file and starts the Clusterware stack if the value in the file is modified to restart. </p>
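<p>The polling idea can be sketched as follows. This is a hypothetical illustration and not Oracle's init.ohasd code: CONTROL_FILE, start_stack and check_once are invented names, and the default path points at a demo file in /tmp rather than the real ohasdrun file.</p>
<pre>
# Hypothetical control file; the real one lives under /etc/oracle/scls_scr.
CONTROL_FILE=${CONTROL_FILE:-/tmp/ohasdrun.demo}

# Placeholder for whatever actually starts the stack.
start_stack() {
  echo "starting the stack"
}

# Read the control file once and act on its content.
check_once() {
  state=""
  if [ -r "$CONTROL_FILE" ]; then
    state=$(cat "$CONTROL_FILE")
  fi
  case "$state" in
    restart) start_stack ;;
    stop)    echo "stop requested, doing nothing" ;;
    *)       echo "no action" ;;
  esac
}

# A daemon would call check_once in a loop, for example:
#   while true; do check_once; sleep 5; done
</pre>
<p>Because the decision is taken from the file content alone, any process that changes the file changes what the monitoring loop does on its next pass.</p>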
<p><b>Inittab </b></p>
<p>
The init.ohasd daemon is essential for Clusterware startup. Even if the Clusterware is not running on a node, you can start the Clusterware on that node from a different node. How does that work? Init.ohasd is the reason.
</p>
<p>
The init.ohasd daemon is started from /etc/inittab. Entries in the inittab file are monitored by the init daemon (pid=1), and the init daemon reacts if the inittab file is modified. The init daemon monitors all processes listed in the inittab file and reacts according to the configuration in that file. For example, if init.ohasd fails for some reason, it is immediately restarted by the init daemon.
</p>
<p>
Following is an example entry in the inittab file. Fields are separated by a colon: the second field indicates that init.ohasd will be started in run level 3, and the third field is the action field. The respawn action means that if the target process exists, init just continues scanning the inittab file; if the target process does not exist, init restarts the process.
</p>
<pre>
#cat /etc/inittab
…
h1:3:respawn:/etc/init.d/init.ohasd run &gt;/dev/null 2&gt;&amp;1 &lt;/dev/null
</pre>
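<p>To make the field layout concrete, here is a small sketch that splits an inittab entry on the colons. The helper name parse_inittab is hypothetical, and the redirections at the end of the real entry are left out to keep the example short.</p>
<pre>
# The init.ohasd entry, without the trailing redirections.
entry='h1:3:respawn:/etc/init.d/init.ohasd run'

# Split one inittab line (read from standard input) into its four fields.
# The last variable receives the rest of the line, so the process field
# keeps its arguments.
parse_inittab() {
  IFS=: read -r id runlevels action process
  echo "id=$id runlevels=$runlevels action=$action process=$process"
}

echo "$entry" | parse_inittab
</pre>
<p>For the entry above this prints id=h1 runlevels=3 action=respawn process=/etc/init.d/init.ohasd run.</p>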
<p>
If you issue a Clusterware startup command from a remote node, a message is sent to the init.ohasd daemon on the target node, and the daemon initiates the Clusterware startup. So, init.ohasd will always be running, irrespective of whether the Clusterware is running or not.
</p>
<p>
You can use strace on init.ohasd to verify this behavior. Following are a few relevant lines from the output of strace command of init.ohasd process:
</p>
<pre>
…
5641  1369083862.828494 open(&quot;/etc/oracle/scls_scr/oel6rac1/root/ohasdrun&quot;, O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
5641  1369083862.828581 dup2(3, 1)      = 1
5641  1369083862.828606 close(3)        = 0
5641  1369083862.828631 execve(&quot;/bin/echo&quot;, [&quot;/bin/echo&quot;, &quot;restart&quot;], [/* 12 vars */]) = 0
…
</pre>
<p><b>Just for fun!<br />
</b></p>
<p>
So, what happens if I manually modify that ohasdrun to restart? I copied the ohasdrun to a temporary file (/tmp/a1.lst) and stopped the clusterware.
</p>
<pre>
# cp /etc/oracle/scls_scr/oel6rac1/root/ohasdrun /tmp/a1.lst 
# crsctl stop has
</pre>
<p>
I verified that Clusterware is completely stopped. Now, I will copy the file again overlaying ohasdrun:
</p>
<pre>
# cat /tmp/a1.lst
restart
# cp /tmp/a1.lst  /etc/oracle/scls_scr/oel6rac1/root/ohasdrun
</pre>
<p>
After a minute or so, I see that the Clusterware processes are started. Not that you would use this type of hack in a production cluster, but this test proves my point.</p>
<p>It’s also important not to remove the files in the scls_scr directories. Any removal of the files underneath the scls_scr directory structure can lead to an invalid configuration.</p>
<p>There are also two more files in the scls_scr directory structure. The ohasdstr file decides whether the HAS daemon should be started automatically. For example, if you execute ‘crsctl disable has’, that command modifies the ohasdstr file content to ‘disable’. Similarly, the crsstart file controls CRS daemon startup. Again, you should use the recommended commands to control the startup, rather than modifying any of these files directly.
</p>
<p><b> 11.2.0.1 and HugePages </b></p>
<p>
If you tried to configure HugePages in 11.2.0.1 Clusterware by increasing the memlock limit for the GRID and database owners, you would have realized that the database doesn’t use HugePages if started by the Clusterware. Database startup using sqlplus will use HugePages, but database startup using srvctl may not.</p>
<p>New processes are cloned from the init.ohasd daemon, so user-level memlock limit changes are not reflected in the processes it starts until init.ohasd itself is restarted. The only recommended way to resolve the problem is to restart the node completely (not just the Clusterware), as the init.ohasd daemon must be restarted to pick up the new user-level limits. </p>
<p>Version 11.2.0.2 fixes this issue by explicitly calling the ulimit command from the /etc/init.d/ohasd file.
</p>
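<p>One way to see which memlock limit a running process actually inherited is to read its limits from /proc. This is a minimal sketch that assumes a Linux /proc filesystem; memlock_of is a hypothetical helper name, and you would pass it the pid of init.ohasd or of a database process.</p>
<pre>
# Print the locked-memory limit of the process with the given pid.
memlock_of() {
  grep "Max locked memory" "/proc/$1/limits"
}

# Harmless demonstration: show the limit of the current shell.
memlock_of $$
</pre>
<p>Comparing the value reported for a database process started by srvctl with the one reported for a process started from sqlplus shows whether the new limit has reached the Clusterware-started processes.</p>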
<p><b><br />
Summary </b><br />
In summary, init.ohasd is an important process, and the files underneath the scls_scr directory control the startup behavior. This also means that if a server is restarted, you don’t need to explicitly stop the Clusterware. You can let the server startup restart the Clusterware.</p>
<p>PS: Some of you have asked me why my blogging frequency has decreased. I have been extremely busy co-authoring a book on RAC titled &#8220;Expert RAC Practices 12c&#8221;. We are covering lots of interesting stuff in the book. As soon as the 12c release is in production, we can release the book too.</p>
]]></content:encoded>
			<wfw:commentRss>http://orainternals.wordpress.com/2013/06/05/clusterware-startup/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/de30d27adb6aee87e455780e8cb19e7b?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">orainternals</media:title>
		</media:content>
	</item>
		<item>
		<title>DOUG presentation on dbms_xplan</title>
		<link>http://orainternals.wordpress.com/2012/10/22/doug-presentation-on-dbms_xplan/</link>
		<comments>http://orainternals.wordpress.com/2012/10/22/doug-presentation-on-dbms_xplan/#comments</comments>
		<pubDate>Mon, 22 Oct 2012 14:54:56 +0000</pubDate>
		<dc:creator>Riyaj Shamsudeen</dc:creator>
				<category><![CDATA[Performance tuning]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[dbms_xplan]]></category>
		<category><![CDATA[dbms_xplan advanced]]></category>
		<category><![CDATA[dbms_xplan allstats last]]></category>
		<category><![CDATA[display_awr]]></category>
		<category><![CDATA[display_cursor]]></category>

		<guid isPermaLink="false">http://orainternals.wordpress.com/?p=1305</guid>
		<description><![CDATA[Please join us at the DOUG (DALLAS ORACLE USERS GROUP) Oracle Database Forum meeting on Thursday, October 25, 2012 from 5 pm – 7 pm. Presented by Riyaj Shamsudeen, OraInternals, &#38; Sahil Thapar: &#8220;Out with the old way, Enter dbms_xplan: A Swiss army knife for performance engineers&#8221; Rough outline: (i) Ability to query access path [&#8230;]]]></description>
				<content:encoded><![CDATA[<p> Please join us at the DOUG (DALLAS ORACLE USERS GROUP) Oracle Database Forum meeting on Thursday, October 25, 2012 from 5 pm – 7 pm.<br />
  Presented by Riyaj Shamsudeen, OraInternals, &amp; Sahil Thapar:<br />
<b><br />
  &#8220;Out with the old way, Enter dbms_xplan: A Swiss army knife for performance engineers&#8221;<br />
</b></p>
<p>  Rough outline:<br />
        (i) Ability to query access path from memory, AWR repository<br />
        (ii) Ability to use the cardinality feedback method to understand access plan issues. A few tips from real-world experience will be provided too.<br />
        (iii) Ability to understand issues with database links etc.<br />
        (iv) Options such as ADVANCED, ALLSTATS etc<br />
        (v)  Why should you choose dbms_xplan over the tkprof+sql_trace combination?<br />
        (vi) Disadvantages of dbms_xplan and a quick introduction to dbms_monitor.</p>
<p>Refreshments sponsored by me <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  </p>
<p>Update: Uploading the presentation pdf files. Enjoy <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><a href='http://orainternals.files.wordpress.com/2012/10/dbms_xplan.pdf'>dbms_xplan</a></p>
]]></content:encoded>
			<wfw:commentRss>http://orainternals.wordpress.com/2012/10/22/doug-presentation-on-dbms_xplan/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/de30d27adb6aee87e455780e8cb19e7b?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">orainternals</media:title>
		</media:content>
	</item>
		<item>
		<title>Do you need asmlib?</title>
		<link>http://orainternals.wordpress.com/2012/08/29/do-you-need-asmlib/</link>
		<comments>http://orainternals.wordpress.com/2012/08/29/do-you-need-asmlib/#comments</comments>
		<pubDate>Wed, 29 Aug 2012 20:45:02 +0000</pubDate>
		<dc:creator>Riyaj Shamsudeen</dc:creator>
				<category><![CDATA[11g]]></category>
		<category><![CDATA[RAC]]></category>
		<category><![CDATA[asmlib]]></category>
		<category><![CDATA[device mapper]]></category>
		<category><![CDATA[multipath]]></category>
		<category><![CDATA[multipath.conf]]></category>
		<category><![CDATA[oracle performance]]></category>
		<category><![CDATA[oracle RAC asmlib]]></category>
		<category><![CDATA[udev]]></category>
		<category><![CDATA[udev rules]]></category>

		<guid isPermaLink="false">http://orainternals.wordpress.com/?p=1265</guid>
		<description><![CDATA[There are many questions from a few of my clients about asmlib support in RHEL6, as they are gearing up to upgrade their database servers to RHEL6. There is a controversy about asmlib support in RHEL6.  As usual, I will only discuss technical details in this blog entry. ASMLIB is applicable only to Linux platform and [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>
There are many questions from a few of my clients about asmlib support in RHEL6, as they are gearing up to upgrade their database servers to RHEL6. There is a controversy about asmlib support in RHEL6.  As usual, I will only discuss technical details in this blog entry.
</p>
<p>
ASMLIB is applicable only to Linux platform and does not apply to any other platform.
</p>
<p> Now, you might ask: why bother, and why not just use OEL and UEK? Well, not every Linux server is used as a database server. In a typical company, there are hundreds of Linux servers and just a few percent of those are used as database servers. Linux system administrators prefer to keep one flavor of Linux distribution for ease of management, and so asking clients to change the distribution from RHEL to OEL or OEL to RHEL is not always a viable option.
</p>
<p><strong> Do you need to use ASMLIB in Linux? </strong></p>
<p>
The short answer is no. The long answer is possibly no. ASMLIB is an optional support library that eases the administration of ASM devices. It is especially helpful while adding new devices to the nodes in a cluster. ASMLIB essentially stamps the devices, and so they are easily visible on the other nodes of a cluster at the next ASM disk scan. ASMLIB also provides device persistence, which is its most important benefit (see the discussion below for more details about device persistence).
</p>
<p><span id="more-1265"></span></p>
<p>
But how many times do you add disks to the servers? How many times do we change the server or disk architecture in a given year? In my opinion, ASMLIB is an additional software layer. It is possible to set up RAC without ASMLIB, and that&#8217;s the subject of this blog entry.
</p>
<p><b> Problem definition </b></p>
<p>
The problem with devices in Linux is that the device name can change after a server reboot. That means that ASM might not come up, since the device names may no longer match the asm_diskstring parameter after the reboot. Since OCR and voting disks can be stored in ASM devices from 11.2 onwards, debugging GI startup problems is especially painful if device persistence is not set up properly.
</p>
<p>
 How do you resolve that?
</p>
<p><strong> Option #1: UDEV only </strong></p>
<p>
UDEV eliminates the device persistence problem, and provides the ability to create user-defined aliases and set device permissions. While it might seem like another new concept to learn, UDEV is quite easy to use. I will explain a basic UDEV setup. Of course, I am not covering the numerous options available under UDEV, just the necessary items.
</p>
<p>
During server startup, when the kernel detects a device (or when a new device is added), the kernel sends an event to the udevd daemon. The udevd daemon uses rules to match the incoming event and takes action depending upon the rule (such as removing or adding a device node). udevd rules have many attributes, and one such attribute allows an arbitrary program to be called in the rule to process incoming events; that can be used to return human-friendly device names. Essentially, to set up udev we need to write udev rules.
</p>
<p> ( Since you are probably a DBA if you are reading this blog, it is easier to imagine the UDEV rule as a function. That function accepts the scsi_id and sets up human friendly aliases.)
</p>
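<p>Taking the function analogy literally, the lookup could be pictured like this. It is a toy model for illustration only and is not how udev is implemented: alias_for_scsi_id is a hypothetical name, and the id and alias are the example values used later in this entry.</p>
<pre>
# Toy model of a udev rule: map a SCSI id to a human-friendly alias.
# An id that no rule matches yields an empty result.
alias_for_scsi_id() {
  case "$1" in
    3600254567259abde00006000004c0000) echo "asmcrs01" ;;
    *) echo "" ;;
  esac
}

alias_for_scsi_id 3600254567259abde00006000004c0000
</pre>
<p>Each additional device becomes one more branch in the case statement, just as each device gets its own rule line in the rules file.</p>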
<p><b> Method to setup udev rule </b></p>
<ol>
<li> Modify /etc/scsi_id.config and add the following line at the end. Essentially, UDEV will assume that all SCSI devices will provide unique UUIDs.  </li>
<pre>
options=-g
</pre>
<li> Identify the unique SCSI id using the scsi_id command. The scsi_id command returns a unique value for a given SCSI device. (This unique value is also called the UUID (or WWID) if you use SAN arrays such as EMC or Hitachi.) The size of the device can be identified using the blockdev command. </li>
<p>( For RHEL6, use -gud as the options for the scsi_id command. It looks like option -s has been replaced by -d.)</p>
<p>For example, for the device /dev/sdd:<br />
# /sbin/scsi_id  -gus /block/sdd<br />
3600254567259abde00006000004c0000</p>
<p>To get device size, use blockdev (output in bytes):<br />
# /sbin/blockdev --getsize64 /dev/sdd</p>
<li> Now, setup rules: vi /etc/udev/rules.d/99-asmdevices.rules </li>
<p> Add a rule for the above device. A rule is essentially if-then logic: if the condition specified in the rule is satisfied for an event, actions are taken to set up the device. Refer to the rule printed below. If the event is for a SCSI device (KERNEL, BUS attributes), then call the /sbin/scsi_id -g -u -s program, passing the block device as the first argument (PROGRAM attribute and %p in the rule definition). If the RESULT of the program call matches the value 3600254567259abde00006000004c0000 (RESULT attribute in the rule), then create a device entry named &quot;asmcrs01&quot;, with owner grid, group oinstall, and permissions 0660.</p>
<p>( one rule must be in a single line, but output below is wrapped. Make sure that rule doesn&#8217;t wrap aound in the .rules file).</p>
<p>KERNEL==&#8221;sd*&#8221;, BUS==&#8221;scsi&#8221;, PROGRAM==&#8221;/sbin/scsi_id -g -u -s %p&#8221;, RESULT==&#8221;3600254567259abde00006000004c0000&#8243;, NAME=&#8221;asmcrs01&#8243;,<br />
 OWNER=&#8221;grid&#8221;, GROUP=&#8221;oinstall&#8221;, MODE=&#8221;0660&#8243;</p>
<li> Test the rules using udevtest </li>
<p># udevtest /block/sdd</p>
<p>
  This would show that udev <em>might</em> create three symlinks: one named /dev/asmcrs01, one in /dev/disk/by-id/, and a third in /dev/disk/by-path/. We will use the /dev/asmcrs01 symlink for ASM setup.
</p>
<li> Reload rules and start udev </li>
<p> This should create the symlink in /dev/asmcrs01.</p>
<p># /sbin/udevcontrol reload_rules<br />
# /sbin/start_udev</p>
<li> Now, set the asm_diskstring parameter to &#8216;/dev/asmcrs*&#8217; so that ASM will identify these devices. Repeat the above steps for all devices that you are planning to add to ASM. You could also run start_udev once, after all the rules have been set up. </li>
<li> Once you are happy with the setup on one node, copy the file /etc/udev/rules.d/99-asmdevices.rules to all nodes of the RAC cluster and restart udev. </li>
</ol>
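<p>Since the steps above have to be repeated for every device, writing the rules by hand gets tedious. A minimal Python sketch that generates one rule per device, in the same RHEL5-style syntax as the rule shown above, could look like this. The helper names udev_rule and rules_file are hypothetical, the two WWIDs are just the example values used in this post, and you should review the generated text before placing it in /etc/udev/rules.d/.</p>

```python
# Illustrative WWID-to-alias pairs; replace them with your own scsi_id output.
DEVICES = [
    ("3600254567259abde00006000004c0000", "asmcrs01"),
    ("3601213101259abde0000600000500000", "asmcrs02"),
]


def udev_rule(wwid, alias, owner="grid", group="oinstall", mode="0660"):
    """Build one rule as a single line (a rule must not wrap in the .rules file)."""
    parts = [
        'KERNEL=="sd*"',
        'BUS=="scsi"',
        'PROGRAM=="/sbin/scsi_id -g -u -s %p"',
        'RESULT=="{}"'.format(wwid),
        'NAME="{}"'.format(alias),
        'OWNER="{}"'.format(owner),
        'GROUP="{}"'.format(group),
        'MODE="{}"'.format(mode),
    ]
    return ", ".join(parts)


def rules_file(devices):
    """Return the text of a rules file, one rule per line."""
    return "\n".join(udev_rule(wwid, alias) for wwid, alias in devices) + "\n"


if __name__ == "__main__":
    # Print to stdout so the result can be reviewed before it is installed.
    print(rules_file(DEVICES), end="")
```

<p>Keeping the device list in one place also makes it easier to keep the rules file identical across all nodes of the cluster.</p>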
<p><b> Option #2: Multipathing feature  </b></p>
<p> The multipathing feature provides fault tolerance for paths to storage devices and uses the device mapper framework to map block devices to aliases. Even if you have just one path to the device, you can still set up this feature. I prefer this method at this time as it provides easier migration to multipathed devices in the future.
</p>
<p> Setup is very similar to UDEV. Here are the step-by-step instructions. </p>
<ol>
<li> Verify that device mapper rpm version is compatible.</li>
<p>$ rpm -qa | grep device-mapper<br />
device-mapper-multipath-0.4.7-46.el5</p>
<li> Verify and configure devices
<li>
Verify that all SCSI devices are seen in all nodes.  Note that some devices will be seen multiple times through different HBAs. Identify the SCSI devices for the database.</p>
<p># lsscsi<br />
# fdisk -l</p>
<li> Modify /etc/scsi_id.config and add the following line at the end. This makes scsi_id assume that all SCSI devices will provide a unique SCSI ID. </li>
<p>options=-g</p>
<li> Identify the unique SCSI ID using the scsi_id command. The scsi_id command returns a unique value for a given SCSI device. (This ID is also called the UUID or WWID if you use SAN arrays such as EMC or Hitachi.) The size of the device can be identified using the blockdev command. </li>
<p>(For RHEL6, use -gud as the options for the scsi_id command. It looks like option -s is replaced by -d.)</p>
<p>For example, for the device /dev/sdd:<br />
# /sbin/scsi_id  -gus /block/sdd<br />
3600254567259abde00006000004c0000</p>
<p>To get device size, use blockdev (output in bytes):<br />
# /sbin/blockdev --getsize64 /dev/sdd</p>
<li> Edit /etc/multipath.conf (of course take a backup of the file) </li>
<p>a.	Comment out this stanza.</p>
<pre>
# Blacklist all devices by default. Remove this to enable multipathing
# on the default devices. 
#blacklist {
#        devnode "*"
#}
</pre>
<p>b. Blacklist all local devices. Devices such as raw, loop, and floppy disk devices do not need multipathing configured. (Remember that if you use raw devices, you need to modify this procedure a little, as we are blacklisting raw devices here.)</p>
<p># Blacklist all local devices</p>
<pre>
blacklist {
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^hd[a-z][[0-9]*]"
        devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
        devnode "dasd[a-z]+[0-9]*"
}
</pre>
<p>c. Add this stanza, which is specific to the SAN array.</p>
<pre>
##
## Essentially, you can set up attributes specific to the disk array.
##   This would require you to check the vendor documentation.
##    In this case, we are setting up for an HP HSV200/300 array.
## This stanza defines how multipathing should behave. It is specific to a disk array,
## but default values can be used too.
devices {
        device {
                vendor "HP"
                product "HSV2[01]0|HSV300|HSV4[05]0"
                getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
                prio_callout "/sbin/mpath_prio_alua /dev/%n"
                hardware_handler "0"
                path_selector "round-robin 0"
                path_grouping_policy group_by_prio
                failback immediate
                rr_weight uniform
                no_path_retry 18
                rr_min_io 100
                path_checker tur
        }
}
</pre>
<p>d. Add the following stanza. Within the <em>multipaths</em> block, each device is specified in its own <em>multipath</em> block. Notice that the UUID we got from the scsi_id command earlier is used in the first <em>multipath</em> block, for the device asmcrs01: we want the SCSI device with UUID 3600254567259abde00006000004c0000 to have the name asmcrs01. This stanza lets device mapper create the symlink /dev/mapper/asmcrs01 for that device. Device mapper also sets up ownership using the uid/gid combination. (Use the uid/gid combination that matches your environment.) In this example, uid=1100 is grid and gid=1000 is oinstall. So, the first <em>multipath</em> block will create a device named /dev/mapper/asmcrs01 owned by grid:oinstall with 0660 permissions (i.e. read and write for owner and group, no permissions for others).
</p>
<p>I am also setting up multiple devices below to provide an example.</p>
<pre>

##
## Multipathing for SCSI devices from storage array
##
multipaths {
  multipath {
    wwid 3600254567259abde00006000004c0000
    alias asmcrs01
    uid 1100
    gid 1000
    mode 660
   }
  multipath {
    wwid 3601213101259abde0000600000500000
    alias asmcrs02
    uid 1100
    gid 1000
    mode 660
   }
…
  multipath {
    wwid 3601212111259abde0000600000c00000
    alias asmdev15
    uid 1100
    gid 1000
    mode 660
   }
}
</pre>
<p>e. Enable multipath daemons and make sure that they are enabled at startup. </p>
<p># modprobe dm-multipath<br />
# service multipathd start<br />
# multipath -d<br />
# multipath -v2<br />
create: asmcrs01 (3600254567259abde00006000004c0000)  HP,HSV300<br />
[size=2.0G][features=0][hwhandler=0][n/a]<br />
\_ round-robin 0 [prio=100][undef]<br />
 \_ 1:0:0:2  sdap 66:144 [undef][ready]<br />
 \_ 0:0:1:2  sdv  65:80  [undef][ready]<br />
\_ round-robin 0 [prio=10][undef]<br />
 \_ 1:0:1:2  sdbj 67:208 [undef][ready]<br />
..<br />
# chkconfig multipathd on<br />
# chkconfig --list multipathd<br />
multipathd      0:off   1:off   2:on    3:on    4:on    5:on    6:off</p>
<p>f. Copy /etc/multipath.conf to all remaining cluster nodes in the DB cluster. Repeat step 6 in all nodes.</p>
<p>g. At this point, we have set up the /dev/mapper/asm* entries. asm_diskstring should be set to match /dev/mapper/asm*.</p>
</ol>
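<p>With many devices, the <em>multipaths</em> section becomes long and repetitive. As a rough sketch (not part of device-mapper-multipath itself), the section can be generated from a list of WWID and alias pairs. The function names multipath_stanza and multipaths_block are hypothetical, and the uid, gid, and mode defaults are the example values used above.</p>

```python
# Illustrative WWID-to-alias pairs; replace them with your own scsi_id output.
DEVICES = [
    ("3600254567259abde00006000004c0000", "asmcrs01"),
    ("3601213101259abde0000600000500000", "asmcrs02"),
]


def multipath_stanza(wwid, alias, uid=1100, gid=1000, mode="660"):
    """Return one multipath block for a single device."""
    lines = [
        "  multipath {",
        "    wwid {}".format(wwid),
        "    alias {}".format(alias),
        "    uid {}".format(uid),
        "    gid {}".format(gid),
        "    mode {}".format(mode),
        "  }",
    ]
    return "\n".join(lines)


def multipaths_block(devices):
    """Wrap one stanza per device in the enclosing multipaths block."""
    body = "\n".join(multipath_stanza(wwid, alias) for wwid, alias in devices)
    return "multipaths {\n" + body + "\n}\n"


if __name__ == "__main__":
    # Review the output, then paste it into /etc/multipath.conf.
    print(multipaths_block(DEVICES), end="")
```

<p>The output is meant to be reviewed and pasted into /etc/multipath.conf by hand, so the vendor-specific <em>devices</em> stanza and the blacklist stay under your control.</p>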
<p>
  In essence, we can use either UDEV or the multipathing facility to implement device persistence and permissions without requiring ASMLIB to be set up.
</p>
<p>
 Update 1:<br />
   In RHEL6/OEL6, setting uid/gid permissions through multipath.conf does not work (even though the documentation supports these attributes), but you can overcome the issue with a udev rule.<br />
For example:</p>
<p># dmsetup ls|grep p1<br />
asmcrs11p1	(253, 23)<br />
asmcrs01p1	(253, 15)<br />
asmcrs02p1	(253, 14)</p>
<p># cat /etc/udev/rules.d/12-dm-permissions.rules<br />
ENV{DM_NAME}==&quot;asmcrs01p1&quot;, OWNER:=&quot;oracle&quot;, GROUP:=&quot;oinstall&quot;, MODE:=&quot;660&quot;<br />
ENV{DM_NAME}==&quot;asmcrs02p1&quot;, OWNER:=&quot;oracle&quot;, GROUP:=&quot;oinstall&quot;, MODE:=&quot;660&quot;<br />
ENV{DM_NAME}==&quot;asmcrs03p1&quot;, OWNER:=&quot;oracle&quot;, GROUP:=&quot;oinstall&quot;, MODE:=&quot;660&quot;<br />
ENV{DM_NAME}==&quot;asmcrs04p1&quot;, OWNER:=&quot;oracle&quot;, GROUP:=&quot;oinstall&quot;, MODE:=&quot;660&quot;<br />
ENV{DM_NAME}==&quot;asmcrs05p1&quot;, OWNER:=&quot;oracle&quot;, GROUP:=&quot;oinstall&quot;, MODE:=&quot;660&quot;<br />
ENV{DM_NAME}==&quot;asmcrs06p1&quot;, OWNER:=&quot;oracle&quot;, GROUP:=&quot;oinstall&quot;, MODE:=&quot;660&quot;</p>
]]></content:encoded>
			<wfw:commentRss>http://orainternals.wordpress.com/2012/08/29/do-you-need-asmlib/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/de30d27adb6aee87e455780e8cb19e7b?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">orainternals</media:title>
		</media:content>
	</item>
		<item>
		<title>Open World 2012 &#8211; My Sunday presentation on truss, pstack etc.</title>
		<link>http://orainternals.wordpress.com/2012/08/18/open-world-2012-my-sunday-presentation-on-truss-pstack-etc/</link>
		<comments>http://orainternals.wordpress.com/2012/08/18/open-world-2012-my-sunday-presentation-on-truss-pstack-etc/#comments</comments>
		<pubDate>Sat, 18 Aug 2012 16:18:34 +0000</pubDate>
		<dc:creator>Riyaj Shamsudeen</dc:creator>
				<category><![CDATA[Oracle database internals]]></category>
		<category><![CDATA[Performance tuning]]></category>
		<category><![CDATA[oracle performance]]></category>
		<category><![CDATA[pmap]]></category>
		<category><![CDATA[pstack]]></category>
		<category><![CDATA[truss]]></category>

		<guid isPermaLink="false">http://orainternals.wordpress.com/?p=1270</guid>
		<description><![CDATA[Just a quick note, I will be presenting on &#8220;Truss, pstack, pmap, and more&#8221; talking about advanced UNIX utilities and how it can be utilized to understand inner working of an application or even Oracle Database Engine. My timeslot is between 2:15 and 3:15 in Room 2016. http://blogs.ioug.org/2012/08/15/ioug-at-oracle-openworld-2012-the-sunday-technical-sessions-9302012/ Uploading presentation files. Thanks for attending at [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=orainternals.wordpress.com&#038;blog=670821&#038;post=1270&#038;subd=orainternals&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Just a quick note: I will be presenting on &#8220;Truss, pstack, pmap, and more&#8221;, talking about advanced UNIX utilities and how they can be used to understand the inner workings of an application or even the Oracle Database engine.</p>
<p>My timeslot is between 2:15 and 3:15 in Room 2016.</p>
<p><a href="http://blogs.ioug.org/2012/08/15/ioug-at-oracle-openworld-2012-the-sunday-technical-sessions-9302012/" rel="nofollow">http://blogs.ioug.org/2012/08/15/ioug-at-oracle-openworld-2012-the-sunday-technical-sessions-9302012/</a></p>
<p>Uploading presentation files. Thanks for attending at OOW12.<br />
<a href='http://orainternals.files.wordpress.com/2012/08/pstack_truss_etc.pdf'>pstack_truss_etc</a></p>
]]></content:encoded>
			<wfw:commentRss>http://orainternals.wordpress.com/2012/08/18/open-world-2012-my-sunday-presentation-on-truss-pstack-etc/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/de30d27adb6aee87e455780e8cb19e7b?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">orainternals</media:title>
		</media:content>
	</item>
		<item>
		<title>June 2012: Jonathan Lewis is coming to Dallas</title>
		<link>http://orainternals.wordpress.com/2012/06/15/june-2012-jonathan-lewis-is-coming-to-dallas/</link>
		<comments>http://orainternals.wordpress.com/2012/06/15/june-2012-jonathan-lewis-is-coming-to-dallas/#comments</comments>
		<pubDate>Fri, 15 Jun 2012 19:52:32 +0000</pubDate>
		<dc:creator>Riyaj Shamsudeen</dc:creator>
				<category><![CDATA[Performance tuning]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[cost based optimizer presentations]]></category>
		<category><![CDATA[oracle performance]]></category>

		<guid isPermaLink="false">http://orainternals.wordpress.com/?p=1258</guid>
		<description><![CDATA[Quick note about Jonathan Lewis trip to Dallas: Jonathan Lewis will be presenting two day seminar on two topics, &#8220;Beating the Oracle Optimizer&#8221; (June 28) and &#8220;Troubleshooting and tuning&#8221; (June 29th). The event will be held June 28-29, 2012 at SMU-in-Legacy in Plano, TX. This is a must-attend event for experienced DBAs and Developers. Especially, [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=orainternals.wordpress.com&#038;blog=670821&#038;post=1258&#038;subd=orainternals&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Quick note about Jonathan Lewis&#8217;s trip to Dallas: <a href="http://jonathanlewis.wordpress.com/"> Jonathan Lewis </a> will be presenting a two-day seminar on two topics, &#8220;Beating the Oracle Optimizer&#8221; (June 28) and &#8220;Troubleshooting and tuning&#8221; (June 29).</p>
<p>The event will be held June 28-29, 2012 at SMU-in-Legacy in Plano, TX. </p>
<p>This is a must-attend event for experienced DBAs and developers. In particular, if you are planning to upgrade your database or application in the near future, or if you are in the middle of an upgrade, you must attend these two seminars. This seminar series provides enormous value in resolving complex production performance issues.</p>
<p>Click <a href="http://www.eventbrite.com/event/3082448687"> Here </a> for details.</p>
]]></content:encoded>
			<wfw:commentRss>http://orainternals.wordpress.com/2012/06/15/june-2012-jonathan-lewis-is-coming-to-dallas/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/de30d27adb6aee87e455780e8cb19e7b?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">orainternals</media:title>
		</media:content>
	</item>
		<item>
		<title>Reverse Path Filtering and RAC</title>
		<link>http://orainternals.wordpress.com/2012/06/01/reverse-path-filtering-and-rac/</link>
		<comments>http://orainternals.wordpress.com/2012/06/01/reverse-path-filtering-and-rac/#comments</comments>
		<pubDate>Fri, 01 Jun 2012 20:18:25 +0000</pubDate>
		<dc:creator>Riyaj Shamsudeen</dc:creator>
				<category><![CDATA[Oracle database internals]]></category>
		<category><![CDATA[Performance tuning]]></category>
		<category><![CDATA[RAC]]></category>
		<category><![CDATA["has Disk HB]]></category>
		<category><![CDATA[advanced RAC training]]></category>
		<category><![CDATA[but no Network HB"]]></category>
		<category><![CDATA[cssd not joining cluster]]></category>
		<category><![CDATA[RAC performance]]></category>
		<category><![CDATA[reverse path filtering]]></category>
		<category><![CDATA[rp_filter]]></category>

		<guid isPermaLink="false">http://orainternals.wordpress.com/?p=1231</guid>
		<description><![CDATA[This is a quick note about reverse path filtering and impact of that feature to RAC. I encountered an interesting problem recently with a client and it is worth blogging about it, with a strong hope that it might help one of you in the future. Problem Environment is 11.2.0.2 GI, Linux 5.6. In a [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=orainternals.wordpress.com&#038;blog=670821&#038;post=1231&#038;subd=orainternals&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>
  This is a quick note about reverse path filtering and the impact of that feature on RAC. I encountered an interesting problem recently with a client, and it is worth blogging about in the hope that it might help one of you in the future.
</p>
<p><b> Problem </b></p>
<p>
  The environment is 11.2.0.2 GI on Linux 5.6. In a 3-node cluster, Grid Infrastructure (GI) comes up cleanly on just one node, but never comes up on the other nodes. If we shut down GI on the first node, we can start GI on the second node with no issues. In other words, GI can be up on just one node at any time.
</p>
<p> System admins indicated that there were <em>no</em> major changes, only a few bug fixes. Seemingly, the problem started after those bug fixes. But there were a few other changes to the environment too, such as an init.ora parameter change. So, the problem was not immediately attributable to just the OS changes.
</p>
<p><span id="more-1231"></span><br />
<b> Analysis </b></p>
<p>
 Reviewing the GI alert log file, it was evident that the CSSD daemon was not joining the cluster. CSSD log files showed the error message &#8220;Other_node has Disk HB, but no Network HB&#8221;, implying that the problem is in the network layer. Normal checks such as ping and traceroute to all other nodes were successful (and the network admin/sysadmin simply said that this is an Oracle issue, as ping/traceroute is working fine).
</p>
<p>
  <b> Update 1: An important note: </b> After reading Brian&#8217;s comment below, I decided to clarify my blog entry. The &#8220;Other_node has Disk HB, but no Network HB&#8221; error can happen for many reasons, but almost all of those reasons distill down to some type of network configuration issue. Essentially, this error means that network packets or multicast packets are not flowing properly between the nodes. In this entry, I am discussing JUST ONE of those reasons. If you encounter the &#8220;Other_node has Disk HB, but no Network HB&#8221; error, you should review your network configuration carefully and review note 1054902.1, &#8220;How to Validate Network and Name Resolution Setup for the Clusterware and RAC&#8221;. [Multicast issues are less prevalent (almost non-existent) in version 11.2.0.3 though, as the software now handles multicast issues beautifully (essentially, it tries the 230.x.x.x IP range and then the 224.x.x.x IP range automatically).]
</p>
<p>
  Time for advanced tools! With tcpdump and wireshark, I was able to see that packets were leaving the surviving node, but were not received on the other node (and vice versa). I also checked the packets in the switch (port mirroring) and could see that packets were flowing through the switch with no issues.
</p>
<p>
  Why would packets received on the interface not show up in the wireshark output? The kernel must somehow be filtering the packets.<br />
  At this point, we needed to prove that packets were being thrown away by the kernel. The interestingly named log_martians kernel parameter came in handy. After changing the parameters net.ipv4.conf.eth3.log_martians and net.ipv4.conf.eth4.log_martians to 1, the system admins confirmed that packets were being discarded by the kernel.
</p>
<p>
I started reviewing sysctl.conf and comparing it with an old copy: there were no notable differences between the files for the past few weeks. So, no kernel parameter change either.
</p>
<p><b> A puzzle!</b></p>
<p> I was expecting to see some kernel parameter change that would tell the kernel to filter the packets, such as a firewall setting. Not seeing any change, I was baffled by the mystery.
</p>
<p> Finally, I decided to review all OS changes. A notable change from the OS point of view stood out: the kernel was upgraded from 2.6.18 to 2.6.32. While that doesn&#8217;t look like a major change, it is relevant since we know that the kernel is throwing away packets for some reason.
</p>
<p>
 Then, I recollected seeing a note about 2.6.32 in MOS and searched for the string 2.6.32. Note 1286796.1 was exactly what I was remembering: &#8220;rp_filter for multiple private interconnects and Linux Kernel 2.6.32&#8221;.
</p>
<p><b> Reverse Path Filtering </b></p>
<p> Reverse Path Filtering (RPF) is a security feature: if the reply to a packet would not go out through the interface the packet was received on, the kernel can throw the packet away. Ironically, this is not a new feature; it is just that the 2.6.32 kernel fixed a bug, and so RPF started to work. This bug fix in the 2.6.32 kernel affects private interconnect traffic.
</p>
<p>
The solution was simple: relax RPF for the private interfaces. Modify /etc/sysctl.conf, add the following two kernel parameters, and then run sysctl -p (read MOS note 1286796.1 for the complete description). A value of 2 selects loose mode; 0 disables the filter entirely.<br />
net.ipv4.conf.eth3.rp_filter=2<br />
net.ipv4.conf.eth4.rp_filter=2
</p>
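<p>When several nodes are involved, it helps to check what each sysctl.conf actually sets. Below is a minimal Python sketch that pulls the per-interface rp_filter values out of the text of a sysctl.conf file. The function name rp_filter_settings and the MODES table are hypothetical; the sketch assumes interface names without dots and only reports what the file says, not the value currently active in the running kernel.</p>

```python
# Meaning of the rp_filter values on Linux.
MODES = {0: "off (no source validation)", 1: "strict", 2: "loose"}


def rp_filter_settings(sysctl_text):
    """Map interface name to rp_filter value for keys like net.ipv4.conf.IFACE.rp_filter."""
    settings = {}
    for raw in sysctl_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", ";")):
            continue
        if "=" not in line:
            continue
        key, value = [part.strip() for part in line.split("=", 1)]
        fields = key.split(".")
        if len(fields) == 5 and fields[:3] == ["net", "ipv4", "conf"] and fields[4] == "rp_filter":
            settings[fields[3]] = int(value)
    return settings


if __name__ == "__main__":
    sample = "net.ipv4.conf.eth3.rp_filter=2\nnet.ipv4.conf.eth4.rp_filter=2\n"
    for iface, value in sorted(rp_filter_settings(sample).items()):
        print(iface, value, MODES.get(value, "unknown"))
```

<p>Running the same check against the sysctl.conf of every node makes it easy to spot a private interface that was left in strict mode.</p>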
<p>
 I wish this were documented better so that this weird problem could be avoided. I also want to make it clear that not all CSSD heartbeat issues can be attributed to RPF. It is just that this client was unfortunate enough to encounter this issue.
</p>
<p>  As a side note, I have also scheduled the next RAC training class for Aug/Sept 2012:<br />
<a href="http://www.orainternals.com/services/training/advanced-rac-training/">Advanced RAC Training</a> </p>
<p>Update 1: Fixing the link for training.</p>
]]></content:encoded>
			<wfw:commentRss>http://orainternals.wordpress.com/2012/06/01/reverse-path-filtering-and-rac/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/de30d27adb6aee87e455780e8cb19e7b?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">orainternals</media:title>
		</media:content>
	</item>
		<item>
		<title>All about RAC and MTU with a video</title>
		<link>http://orainternals.wordpress.com/2012/05/22/all-about-rac-and-mtu-with-a-video/</link>
		<comments>http://orainternals.wordpress.com/2012/05/22/all-about-rac-and-mtu-with-a-video/#comments</comments>
		<pubDate>Tue, 22 May 2012 14:54:46 +0000</pubDate>
		<dc:creator>Riyaj Shamsudeen</dc:creator>
				<category><![CDATA[11g]]></category>
		<category><![CDATA[Oracle database internals]]></category>
		<category><![CDATA[Performance tuning]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[RAC]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[cache fusion mtu]]></category>
		<category><![CDATA[fragmentation and reassembly]]></category>
		<category><![CDATA[gc lost packets]]></category>
		<category><![CDATA[ipfrag_high_thres]]></category>
		<category><![CDATA[ipfrag_low_thres]]></category>
		<category><![CDATA[ipfrag_time]]></category>
		<category><![CDATA[Jumbo frames]]></category>
		<category><![CDATA[MTU]]></category>
		<category><![CDATA[MTU=9000]]></category>
		<category><![CDATA[oracle performance]]></category>
		<category><![CDATA[RAC internals]]></category>
		<category><![CDATA[RAC performance]]></category>
		<category><![CDATA[RAC presentations]]></category>
		<category><![CDATA[RAC training]]></category>
		<category><![CDATA[RAC video]]></category>
		<category><![CDATA[RAC videos]]></category>
		<category><![CDATA[RDS]]></category>
		<category><![CDATA[UDP vs tcp]]></category>
		<category><![CDATA[wireshark]]></category>

		<guid isPermaLink="false">http://orainternals.wordpress.com/?p=1202</guid>
		<description><![CDATA[Let&#8217;s first discuss how RAC traffic works before continuing. Environment for the discussion is: 2 node cluster with 8K database block size, UDP protocol is used for cache fusion. (BTW, UDP and RDS protocols are supported in UNIX platform; whereas Windows uses TCP protocol). UDP protocol, fragmentation, and assembly UDP Protocol is an higher level [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=orainternals.wordpress.com&#038;blog=670821&#038;post=1202&#038;subd=orainternals&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Let&#8217;s first discuss how RAC traffic works before continuing. The environment for the discussion is a 2-node cluster with an 8K database block size, with the UDP protocol used for cache fusion. (BTW, the UDP and RDS protocols are supported on UNIX platforms, whereas Windows uses the TCP protocol.)</p>
<p><strong> UDP protocol, fragmentation, and assembly </strong></p>
<p>UDP is a higher-level protocol implemented over the IP protocol (UDP/IP). Cache fusion uses UDP to send packets over the wire (Exadata uses the RDS protocol though).</p>
<p>MTU is the Maximum Transmission Unit of an IP packet. Let us consider an example with the MTU set to 1500 on a network interface. One 8K block transfer cannot be performed with just one IP packet, as the IP packet size (1500 bytes) is less than 8K. So, one UDP packet of 8K size is fragmented into 6 IP packets and sent over the wire. On the receiving side, those 6 packets are reassembled to create one UDP buffer of size 8K. After the reassembly, that UDP buffer is delivered to a UDP port of a UNIX process. Usually, a foreground process will listen on that port to receive the UDP buffer.</p>
<p><span id="more-1202"></span> </p>
<p>Consider what happens if the MTU is set to 9000 on the network interface: then an 8K buffer can be transmitted over the wire with just one IP packet. There is no need for fragmentation or reassembly with MTU=9000 as long as the block size is 8K or less. MTU=9000 is also known as a jumbo frame configuration. (But if the database block size is greater than the jumbo frame size, then fragmentation and reassembly are still required. For example, for a 32KB block size with MTU=9000, there will be three 9K IP packets and one 5K IP packet to be transmitted.)</p>
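<p>The fragment counts above can be checked with a little arithmetic. The sketch below is a simplified model rather than kernel code: it assumes a 20-byte IPv4 header without options and an 8-byte UDP header, ignores the extra header bytes Oracle adds to each block, and uses the rule that every fragment except the last carries a multiple of 8 bytes. The function name fragment_sizes is hypothetical.</p>

```python
IP_HEADER = 20   # bytes, IPv4 header without options (assumption)
UDP_HEADER = 8   # bytes


def fragment_sizes(udp_payload, mtu):
    """Return the IP payload size of each fragment for one UDP datagram."""
    total = udp_payload + UDP_HEADER      # bytes that IP has to carry
    max_payload = mtu - IP_HEADER         # most data a single IP packet can hold
    aligned = max_payload // 8 * 8        # non-final fragments must be a multiple of 8
    sizes = []
    while total > max_payload:
        sizes.append(aligned)
        total -= aligned
    sizes.append(total)                   # the last (or only) packet takes the rest
    return sizes


if __name__ == "__main__":
    for block, mtu in [(8192, 1500), (8192, 9000), (32768, 9000)]:
        sizes = fragment_sizes(block, mtu)
        print(block, mtu, len(sizes), sizes)
```

<p>With these assumptions, an 8K block needs 6 IP packets at MTU=1500 and a single packet at MTU=9000, while a 32K block at MTU=9000 needs three packets of about 9K and one of about 5.7K, in line with the numbers quoted above.</p>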
<p>
Fragmentation and reassembly are performed at the OS kernel layer, and hence it is the responsibility of the kernel and the stack below it to complete the fragmentation and reassembly. Oracle code simply calls the send and receive system calls and passes the buffers to populate.
</p>
<p> A few LMS system calls on the Solaris platform: </p>
<pre>
0.6178  0.0001 sendmsg(30, 0xFFFFFFFF7FFF7060, 32768)          = 8328
0.6183  0.0004 sendmsg(30, 0xFFFFFFFF7FFFABE0, 32768)          = 8328
0.6187  0.0001 sendmsg(36, 0xFFFFFFFF7FFFBA10, 32768)          = 144
...
0.7241  0.0001 recvmsg(27, 0xFFFFFFFF7FFF9A10, 32768)          = 192
0.7243  0.0001 recvmsg(27, 0xFFFFFFFF7FFF9A10, 32768)          = 192

</pre>
<p><strong>UDP vs TCP </strong></p>
<p>If you talk to a network admin about the use of UDP for cache fusion, usually a few eyebrows will be raised. From the RAC point of view, UDP is the right choice over TCP for cache fusion traffic. With TCP/IP, every packet transfer has overhead: a connection needs to be set up, the packet sent, and the process must wait for the TCP acknowledgement before considering the packet send complete. In busy RAC systems, we are talking about 2-3 milliseconds for a packet transfer, and with TCP/IP we probably may not be able to achieve that level of performance. With UDP, a packet transfer is considered complete as soon as the packet is sent, and error handling is done by Oracle code itself. As you know, a reliable network is key to RAC stability; if nearly all packets (close to 100%) are sent without any packet drops, UDP is a good choice over TCP/IP for performance reasons. </p>
<p> If there are reassembly failures, they are a function of an unreliable network or kernel or something else, and have nothing to do with the choice of the UDP protocol itself. Of course, RDS is better than UDP as the error handling is offloaded to the fabric, but it usually requires an InfiniBand fabric for a proper RDS setup. For that matter, VPN connections use the UDP protocol too.
</p>
<p><strong>IP identification</strong><strong>?</strong></p>
<p>In a busy system, there will be thousands of IP packets traveling through the interface in a given second. So, obviously, there will be many IP packets from different UDP buffers received by the interface. Also, because these Ethernet frames can be delivered in any order, how does the kernel know how to assemble them properly? More critically, how does the kernel know which 6 IP packets from one UDP buffer belong together, and the order of those IP packets?</p>
<p>Each of these IP packets has an IP identification and a fragment offset. Review the wireshark files uploaded in this blog entry, and you will see that all 6 IP packets have the same IP identification. That ID and the fragment offset are used by the kernel to assemble the IP packets together to create the UDP buffer.</p>
<pre>
Identification: 0x533e (21310)
..
Fragment offset: 0
</pre>
<p><strong>Reassembly failures</strong></p>
<p>What happens if an IP packet is lost, assuming MTU=1500 bytes?</p>
<p>
From the wireshark files with mtu1500, you will see that each of the packets has a fragment offset. That fragment offset and the IP identification are used to reassemble the IP packets to create the 8K UDP buffer. Think of 6 puzzle pieces, each with markings: the kernel uses those markings (offset and IP ID) to reassemble the packets. Let&#8217;s consider the case where one of the 6 packets never arrives: the kernel threads will keep the other 5 IP packets in memory for 64 seconds (the Linux kernel parameter ipfrag_time controls that time) before declaring a reassembly failure. Without receiving the missing IP packet, the kernel cannot reassemble the UDP buffer, and so a reassembly failure is declared.
</p>
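<p>The puzzle-piece idea can be modeled in a few lines of Python. This is a toy model and not the kernel&#8217;s implementation: the Reassembler class and its method names are hypothetical, fragments are keyed only by IP identification (a real stack also keys on source, destination, and protocol), and offsets are plain byte counts rather than the 8-byte units carried in the IP header. The 64-second default mirrors the timeout mentioned above.</p>

```python
class Reassembler:
    """Toy model of IP reassembly keyed by IP identification."""

    def __init__(self, timeout=64.0):
        self.timeout = timeout   # seconds; stands in for ipfrag_time
        self.pending = {}        # ip_id -> {"first_seen": time, "frags": {offset: (data, more)}}
        self.failures = 0        # count of reassembly failures

    def add(self, ip_id, offset, data, more_fragments, now):
        """Store one fragment; return the full buffer once every piece is present."""
        entry = self.pending.setdefault(ip_id, {"first_seen": now, "frags": {}})
        entry["frags"][offset] = (data, more_fragments)
        return self._try_assemble(ip_id)

    def _try_assemble(self, ip_id):
        frags = self.pending[ip_id]["frags"]
        buf = b""
        expected = 0
        for offset in sorted(frags):
            data, more = frags[offset]
            if offset != expected:
                return None      # a hole: some fragment has not arrived yet
            buf += data
            expected += len(data)
            if not more:
                del self.pending[ip_id]
                return buf       # last fragment reached with no holes
        return None              # the final fragment is still missing

    def expire(self, now):
        """Drop incomplete buffers older than the timeout and count them as failures."""
        for ip_id in list(self.pending):
            if now - self.pending[ip_id]["first_seen"] >= self.timeout:
                del self.pending[ip_id]
                self.failures += 1
```

<p>Fragments can be added in any order; the buffer is returned only when there are no holes, and a buffer that stays incomplete past the timeout is dropped and counted as a reassembly failure.</p>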
<p>The Oracle foreground process will wait for 30 seconds (it used to be 300 seconds or so in older versions of RAC), and if the packet has not arrived within that timeout period, the FG process will declare a &#8216;GC lost packet&#8217; and re-request the block. Of course, kernel memory allocated for IP fragmentation and reassembly is constrained by the kernel parameters ipfrag_high_thres and ipfrag_low_thres, and lower values for these kernel parameters can lead to reassembly failures too (and that&#8217;s why it is important to follow all best practices from the RAC installation guides).</p>
<p> BTW, there are a few other reasons for &#8216;gc lost packets&#8217; too. High CPU usage can also lead to &#8216;gc lost packets&#8217; failures: the process may not get enough CPU time to drain the buffers, the network buffers allocated for that process become full, and so the kernel drops incoming packets.
</p>
<p>
It is probably better to explain these concepts visually. So, I created a video. When you watch this video, notice that there is an HD button at the top of the video. Play it in HD mode so that you will have a better learning experience.
</p>
<p>You can get the presentation file from the video here: <a href="http://orainternals.files.wordpress.com/2012/05/mtu.pdf">MTU</a></p>
<p>Wireshark files explained in the video can be reviewed here:<br />
<a href="http://orainternals.files.wordpress.com/2012/05/wireshark_1500mtu.pdf">wireshark_1500mtu</a><br />
<a href="http://orainternals.files.wordpress.com/2012/05/wireshark_9000mtu.pdf">wireshark_9000mtu</a></p>
<p>BTW, when you watch the video, you will see that I had a little trouble identifying the packet in the Wireshark output initially. I later understood why I was not seeing the packets filled with DEADBEEF characters. Why do you think I didn&#8217;t see the packets initially?</p>
<p>Also, it looks like the video quality is not that great when embedded. If you want the actual mp4 files, email me; maybe I can upload them to Dropbox and let you download them.</p>
<div id="v-HHaR813y-1" class="video-player" style="width:400px;height:250px">
<embed id="v-HHaR813y-1-video" src="http://s0.videopress.com/player.swf?v=1.03&amp;guid=HHaR813y&amp;isDynamicSeeking=true" type="application/x-shockwave-flash" width="400" height="250" title="MTU Video" wmode="direct" seamlesstabbing="true" allowfullscreen="true" allowscriptaccess="always" overstretch="true"></embed></div>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/orainternals.wordpress.com/1202/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/orainternals.wordpress.com/1202/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=orainternals.wordpress.com&#038;blog=670821&#038;post=1202&#038;subd=orainternals&#038;ref=&#038;feed=1" width="1" height="1" /><div><a href="http://orainternals.wordpress.com/2012/05/22/all-about-rac-and-mtu-with-a-video/"><img alt="MTU Vidoe" src="http://videos.videopress.com/HHaR813y/mtu-vidoe_std.original.jpg" width="160" height="120" /></a></div>]]></content:encoded>
			<wfw:commentRss>http://orainternals.wordpress.com/2012/05/22/all-about-rac-and-mtu-with-a-video/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
	<enclosure url="http://videos.videopress.com/HHaR813y/mtu-vidoe_dvd.mp4" length="469988352" type="video/mp4" />

		<media:content url="http://1.gravatar.com/avatar/de30d27adb6aee87e455780e8cb19e7b?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">orainternals</media:title>
		</media:content>

		<media:group>
			<media:content url="http://videos.videopress.com/HHaR813y/mtu-vidoe_dvd.mp4" fileSize="469988352" type="video/mp4" medium="video" bitrate="1528" isDefault="true" duration="2403" width="640" height="400" />

			<media:content url="http://videos.videopress.com/HHaR813y/mtu-vidoe_std.mp4" fileSize="244836864" type="video/mp4" medium="video" bitrate="796" isDefault="false" duration="2403" width="400" height="250" />

			<media:content url="http://videos.videopress.com/HHaR813y/mtu-vidoe_fmt1.ogv" fileSize="244836864" type="video/ogg" medium="video" bitrate="796" isDefault="false" duration="2403" width="400" height="250" />

			<media:rating scheme="urn:mpaa">g</media:rating>
			<media:title type="plain">MTU Video</media:title>
			<media:thumbnail url="http://videos.videopress.com/HHaR813y/mtu-vidoe_std.original.jpg" width="256" height="160" />
			<media:player url="http://s0.videopress.com/player.swf?v=1.03&#38;guid=HHaR813y&#38;isDynamicSeeking=true" width="400" height="250" />
		</media:group>
	</item>
		<item>
		<title>_gc_fusion_compression</title>
		<link>http://orainternals.wordpress.com/2012/04/29/_gc_fusion_compression/</link>
		<comments>http://orainternals.wordpress.com/2012/04/29/_gc_fusion_compression/#comments</comments>
		<pubDate>Sun, 29 Apr 2012 02:58:07 +0000</pubDate>
		<dc:creator>Riyaj Shamsudeen</dc:creator>
				<category><![CDATA[11g]]></category>
		<category><![CDATA[Oracle database internals]]></category>
		<category><![CDATA[Performance tuning]]></category>
		<category><![CDATA[RAC]]></category>
		<category><![CDATA[RAC internals]]></category>
		<category><![CDATA[RAC performance]]></category>
		<category><![CDATA[RAC performance myths]]></category>
		<category><![CDATA[_gc_fusion_compression]]></category>

		<guid isPermaLink="false">http://orainternals.wordpress.com/?p=1178</guid>
		<description><![CDATA[We know that database blocks are transferred between the nodes through the interconnect, aka cache fusion traffic. Common misconception is that packet transfer size is always database block size for block transfer (Of course, messages are smaller in size). That&#8217;s not entirely true. There is an optimization in the cache fusion code to reduce the [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=orainternals.wordpress.com&#038;blog=670821&#038;post=1178&#038;subd=orainternals&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>
 We know that database blocks are transferred between the nodes through the interconnect, aka cache fusion traffic. A common misconception is that the packet transfer size is <em>always</em> the database block size for a block transfer (of course, messages are smaller). That&#8217;s not entirely true. There is an optimization in the cache fusion code that reduces the packet size (and so reduces the bits transferred over the private network). Don&#8217;t confuse this note with jumbo frames and MTU size; this optimization is independent of the MTU setting.
</p>
<p>
In a nutshell, if the free space in a block exceeds a threshold (_gc_fusion_compression), then instead of sending the whole block, LMS sends a smaller packet, reducing private network traffic. Let me give an example to illustrate the point. Let&#8217;s say that the database block size is 8192 and the block to be transferred is a recently NEWed block with, say, 4000 bytes of free space. Transferring this block over the interconnect from one node to another node in the cluster results in a packet size of ~4200 bytes. The bytes representing free space need not be sent at all: a symbolic notation of the free space begin offset and the free space end offset is enough to reconstruct the block on the receiving side without any loss of data. This optimization makes sense, as there is no need to clog the network unnecessarily.
</p>
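<p>To make the idea concrete, here is a toy sketch in Python. It is <i>not</i> Oracle&#8217;s wire format: the 5-byte header, the function names, and the use of absolute byte offsets for the free space range are all my assumptions for illustration, and the free space is assumed to be zero-filled so that the round trip is exact.</p>

```python
import struct

HEADER = ">BHH"   # hypothetical header: flag, begin offset, end offset

def pack_block(block, fsbo, fseo, threshold=1024):
    """Drop the free-space bytes when they exceed the threshold."""
    if fseo - fsbo > threshold:
        return struct.pack(HEADER, 1, fsbo, fseo) + block[:fsbo] + block[fseo:]
    return struct.pack(HEADER, 0, 0, 0) + block          # send it whole

def unpack_block(payload):
    """Rebuild the full block on the receiving side."""
    size = struct.calcsize(HEADER)
    flag, fsbo, fseo = struct.unpack(HEADER, payload[:size])
    body = payload[size:]
    if flag == 0:
        return body
    # Re-create the free space as zero bytes between the two offsets.
    return body[:fsbo] + bytes(fseo - fsbo) + body[fsbo:]

# 8192-byte block: 100 bytes of header data, 4000 free bytes, 4092 bytes of rows.
block = b"H" * 100 + bytes(4000) + b"R" * 4092
payload = pack_block(block, 100, 4100)
print(len(block), len(payload))
print(unpack_block(payload) == block)
```

<p>In this toy version the 8192-byte block travels as roughly 4200 bytes, and the receiver can still rebuild it from the two offsets.</p>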
<p><span id="more-1178"></span></p>
<p>
Remember that this is not compression in the traditional sense; rather, it avoids sending unnecessary bytes.
</p>
<p>
 The parameter _gc_fusion_compression determines the threshold and defaults to 1024 in 11.2.0.3. So, if the free space in the block is over 1024 bytes, the block is a candidate for the reduction in packet size.
</p>
<p><b> Test cases and dumps </b></p>
<p>
From the test cases, I see that three fields in the block can be used to determine the free space available in the block. If you dump a block using &#8216;alter system dump datafile..&#8217; syntax, you would see the following three fields:
</p>
<pre>
fsbo=0x26 
fseo=0x1b6a
avsp=0x1b44
</pre>
<p>
fsbo stands for Free Space Begin Offset, fseo stands for Free Space End Offset, and avsp stands for AVailable free SPace.
</p>
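<p>The values in the dump are hexadecimal. A small Python helper (the function name is mine) converts them to decimal. In this particular dump the gap between the two offsets happens to equal avsp; I would not assume that holds for every block.</p>

```python
def parse_free_space_fields(dump_text):
    """Pick fsbo, fseo and avsp out of block dump text as integers."""
    fields = {}
    for token in dump_text.split():
        name, _, value = token.partition("=")
        if name in ("fsbo", "fseo", "avsp"):
            fields[name] = int(value, 16)     # int() accepts the 0x prefix
    return fields

dump = """
fsbo=0x26
fseo=0x1b6a
avsp=0x1b44
"""

fields = parse_free_space_fields(dump)
print(fields)
print(fields["fseo"] - fields["fsbo"] == fields["avsp"])
```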
<p>
It <i> seems </i> to me from the test cases that the LMS process looks up these fields and constructs the buffer depending on the value of the avsp field. If avsp exceeds 1024, then the buffer is smaller than 8K (smaller than 7K, for that matter). The following few lines explain my test results.
</p>
<p>
Initially, I had just one row (row length = 105 bytes), and the Wireshark packet analysis shows that one 8K block transfer resulted in a 690-byte packet transfer. Meaning, the size of the network packet was just 690 bytes for one 8192-byte block transfer. A massive reduction in GC traffic.
</p>
<p>
In test case #2, with 10 rows in the block, the size of the packet transfer was 1680 bytes. The block dump shows avsp=0x1b44 (6980 bytes), which leaves just 1212 bytes of useful information. The cache fusion code avoided sending 6980 bytes and reduced the transferred packet size to just 1680 bytes.
</p>
<p>
In test case #3, with 50 rows in the block, the size of the transferred packet was 5776 bytes. Free space in the block was 2620 bytes.
</p>
<p>
This behavior continued as long as the free space was above 1024 bytes. When the free space was below 1024 (I accidentally added more rows, so free space dropped to ~900 bytes), the whole block was transferred and the size of the packet was 8336 bytes.
</p>
<pre>
fsbo=0x96
fseo=0x402
avsp=0x36c
</pre>
<p>
  These test cases show that the cache fusion code optimizes the packet transfer by eliminating the bytes representing free space.
</p>
<p><b> More test cases </b></p>
<p> So, what happens if you delete rows in the block? Remember that rows are not physically deleted; they are just tagged with a D flag in the row header, so the free space information remains the same. Even if you delete 90% of the rows in the block, the avsp field is not updated until block defragmentation happens. This means that deleting rows alone will still result in a whole block transfer, until the block is defragmented.
</p>
<pre>
# After deletion of nearly all rows in the block.
fsbo=0x96
fseo=0x402
avsp=0x36c
</pre>
<p> I increased the value of the _gc_fusion_compression parameter to 4096, then to 8192, and repeated the tests. The behavior is confirmed: when I set this parameter to 8192, transferring a block with just one row resulted in a packet size of 8336, meaning this optimization simply did not kick in (as the free space in the block will never be greater than 8192).
</p>
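<p>The rule that the tests point to fits in one line. The sketch below checks it against the avsp values and packet sizes quoted above; the function name is hypothetical, and the strict &#8220;greater than&#8221; comparison is my reading of the test results, not something taken from Oracle documentation.</p>

```python
BLOCK_SIZE = 8192

def is_reduction_candidate(avsp, threshold=1024):
    """Assumed rule: the packet shrinks only when free space exceeds the threshold."""
    return avsp > threshold

# (avsp in bytes, observed packet size in bytes) from the tests above.
observed = [(6980, 1680), (2620, 5776), (876, 8336)]

for avsp, packet in observed:
    predicted = is_reduction_candidate(avsp)
    reduced = BLOCK_SIZE > packet             # was the packet smaller than a block?
    print(avsp, packet, predicted, reduced)

# With the parameter raised to 8192, no 8K block can qualify any more.
print(any(is_reduction_candidate(avsp, threshold=8192) for avsp, _ in observed))
```

<p>For the three measured blocks, the prediction and the observation agree.</p>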
<p><b> !!!Warning!!! </b></p>
<p> Yes, with 0x6 exclamation symbols! This note is meant to improve the understanding of cache fusion traffic; it is not a recommendation to change the parameter. This parameter is better left untouched.
</p>
<p> This is a very cool optimization feature, useful in data warehouse databases with a 32K block size. I am not sure in which version this optimization was introduced, though. </p>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/orainternals.wordpress.com/1178/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/orainternals.wordpress.com/1178/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=orainternals.wordpress.com&#038;blog=670821&#038;post=1178&#038;subd=orainternals&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://orainternals.wordpress.com/2012/04/29/_gc_fusion_compression/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/de30d27adb6aee87e455780e8cb19e7b?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">orainternals</media:title>
		</media:content>
	</item>
		<item>
		<title>My COLLABORATE 12-IOUG sessions</title>
		<link>http://orainternals.wordpress.com/2012/04/19/my-collaborate-12-ioug-sessions/</link>
		<comments>http://orainternals.wordpress.com/2012/04/19/my-collaborate-12-ioug-sessions/#comments</comments>
		<pubDate>Thu, 19 Apr 2012 20:06:30 +0000</pubDate>
		<dc:creator>Riyaj Shamsudeen</dc:creator>
				<category><![CDATA[Oracle database internals]]></category>
		<category><![CDATA[Performance tuning]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[RAC]]></category>
		<category><![CDATA[collaborate 2012 presentations]]></category>
		<category><![CDATA[haip]]></category>
		<category><![CDATA[pfiles]]></category>
		<category><![CDATA[pmap]]></category>
		<category><![CDATA[pstack]]></category>
		<category><![CDATA[RAC performance]]></category>
		<category><![CDATA[RAC presentations]]></category>
		<category><![CDATA[scan]]></category>
		<category><![CDATA[semtimedop]]></category>
		<category><![CDATA[strace]]></category>
		<category><![CDATA[truss]]></category>
		<category><![CDATA[vip]]></category>

		<guid isPermaLink="false">http://orainternals.wordpress.com/?p=1165</guid>
		<description><![CDATA[If you are attending Collaborate 2012, you might be interested in my content-rich sessions below : Session Number: 326 Session Title: SCAN, VIP, HAIP, and other RAC acronyms Session Date/Time/Room: Tue, Apr 24, 2012 (10:45 AM &#8211; 11:45 AM) : Surf C Session Number: 327 Session Title: Internals and Performance Boot Camp: Truss, pstack, pmap, [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=orainternals.wordpress.com&#038;blog=670821&#038;post=1165&#038;subd=orainternals&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>If you are attending Collaborate 2012, you might be interested in my content-rich sessions below:</p>
<p>Session Number: 326<br />
Session Title: SCAN, VIP, HAIP, and other RAC acronyms<br />
Session Date/Time/Room: Tue, Apr 24, 2012 (10:45 AM &#8211; 11:45 AM) : Surf C</p>
<p>Session Number: 327<br />
Session Title: Internals and Performance Boot Camp: Truss, pstack, pmap, and more<br />
Session Date/Time/Room: Wed, Apr 25, 2012 (03:00 PM &#8211; 04:00 PM) : Palm A</p>
<p>Hope to see you there!</p>
<p><strong>Update</strong>: I am uploading presentation files. Presentations are much more recent than the document <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  </p>
<p><a href='http://orainternals.files.wordpress.com/2012/04/pstack_truss_etc.pdf'>pstack_truss_etc</a><br />
<a href='http://orainternals.files.wordpress.com/2012/04/2012_327_riyaj_pstack_truss_doc.pdf'>2012_327_Riyaj_pstack_truss_doc</a><br />
<a href='http://orainternals.files.wordpress.com/2012/04/scan_vip_haip_etc.pdf'>SCAN_VIP_HAIP_etc</a><br />
<a href='http://orainternals.files.wordpress.com/2012/04/2012_326_riyaj_scan_vip_haip_doc.pdf'>2012_326_Riyaj_scan_vip_haip_doc</a></p>
<p>Thanks for attending!</p>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/orainternals.wordpress.com/1165/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/orainternals.wordpress.com/1165/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=orainternals.wordpress.com&#038;blog=670821&#038;post=1165&#038;subd=orainternals&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://orainternals.wordpress.com/2012/04/19/my-collaborate-12-ioug-sessions/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/de30d27adb6aee87e455780e8cb19e7b?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">orainternals</media:title>
		</media:content>
	</item>
	</channel>
</rss>
