Posted by Riyaj Shamsudeen on April 29, 2012
We know that database blocks are transferred between nodes through the interconnect, aka cache fusion traffic. A common misconception is that the packet size for a block transfer is always the database block size (messages, of course, are smaller). That’s not entirely true. There is an optimization in the cache fusion code that reduces the packet size, and so reduces the bits transferred over the private network. Don’t confuse this note with jumbo frames and MTU size; the optimization is independent of the MTU setting.
In a nutshell, if the free space in a block exceeds a threshold (_gc_fusion_compression), then instead of sending the whole block, LMS sends a smaller packet, reducing private network traffic. Let me give an example to illustrate the point. Let’s say that the database block size is 8192 and the block to be transferred is a recently NEWed block with, say, 4000 bytes of free space. Transfer of this block over the interconnect from one node to another will result in a packet size of ~4200 bytes. Transferring the bytes that represent free space can be avoided completely: a symbolic notation of the free space begin offset and free space end offset is enough to reconstruct the block on the receiving side without any loss of data. This optimization makes sense, as there is no need to clog the network unnecessarily.
Remember that this is not compression in the traditional sense; rather, it is avoidance of sending unnecessary bytes.
Parameter _gc_fusion_compression determines the threshold and defaults to 1024 in 18.104.22.168. So, if the free space in the block is over 1024 bytes, the block is a candidate for the reduction in packet size.
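The threshold rule can be sketched in a few lines of Python. This is only a model of the behavior described above, not Oracle's code: the function name is hypothetical, and the assumption that the payload is simply the block size minus the free space ignores whatever header overhead the real packet carries.

```python
def estimated_payload_bytes(block_size, free_space, threshold=1024):
    """Rough payload size for one block transfer (hypothetical model).

    If free space exceeds the threshold, the free-space bytes are skipped;
    otherwise the whole block is sent. Packet header overhead is ignored.
    """
    if free_space > threshold:
        return block_size - free_space
    return block_size


# The 8192-byte block with 4000 bytes of free space from the example above.
print(estimated_payload_bytes(8192, 4000))  # 4192, close to the ~4200 quoted
# A nearly full block stays below the threshold and is sent whole.
print(estimated_payload_bytes(8192, 900))   # 8192
```

The comparison is written as strictly greater than, matching the wording "over 1024"; whether the real code treats a block with exactly 1024 bytes free as a candidate is not something this note establishes.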
Test cases and dumps
From the test cases, I see that three fields in the block can be used to determine the free space available in the block. If you dump a block using ‘alter system dump datafile..’ syntax, you would see the following three fields:
fsbo=0x26 fseo=0x1b6a avsp=0x1b44
fsbo stands for Free Space Begin Offset; fseo stands for Free Space End Offset; avsp stands for AVailable free SPace.
It seems to me from the test cases that the LMS process looks up these fields and constructs the buffer depending on the value of the avsp field. If avsp exceeds 1024, then the buffer is smaller than 8K (smaller than 7K, for that matter). The following few lines explain my test results.
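The three fields are hexadecimal, so a small helper makes them easier to read. The sketch below is a convenience for reading dump output, not part of any Oracle tool; the function name is hypothetical and it assumes the fields appear as `name=0x...` on one line, as in the dump shown above.

```python
import re


def parse_free_space_fields(dump_line):
    """Extract fsbo, fseo and avsp from a block dump line as integers."""
    pairs = re.findall(r"\b(fsbo|fseo|avsp)=0x([0-9a-fA-F]+)", dump_line)
    return {name: int(value, 16) for name, value in pairs}


fields = parse_free_space_fields("fsbo=0x26 fseo=0x1b6a avsp=0x1b44")
print(fields)                           # {'fsbo': 38, 'fseo': 7018, 'avsp': 6980}
print(fields["fseo"] - fields["fsbo"])  # 6980, equal to avsp in this dump
```

In this dump the gap between fseo and fsbo equals avsp. That holds for the dumps in this note, but it should not be read as a general rule for every block.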
Initially, I had just one row (row length = 105 bytes), and the Wireshark packet analysis shows that one 8K block transfer resulted in a 690-byte packet transfer. Meaning, the size of the network packet was just 690 bytes for one 8192-byte block transfer. A massive reduction in GC traffic.
In test case #2, with 10 rows in the block, the size of the packet transfer was 1680 bytes. The block dump shows avsp=0x1b44 (6980 bytes), leaving just 1212 bytes of useful information. The cache fusion code avoided sending 6980 bytes and reduced the transferred packet size to just 1680 bytes.
In test case #3, with 50 rows in the block, the size of the transferred packet was 5776 bytes. Free space was 2620 bytes in the block.
This behavior continued as long as the free space stayed above 1024. When the free space dropped below 1024 (I accidentally added more rows, and so free space dropped to ~900 bytes), the whole block was transferred and the size of the packet was 8336 bytes.
fsbo=0x96 fseo=0x402 avsp=0x36c
These test cases show that the cache fusion code optimizes the packet transfer by eliminating the bytes that represent free space.
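The measurements above can be lined up against the simple "block size minus free space" model. The numbers in the sketch come from the test cases in this note; the variable and function names are hypothetical, and the "extra" column is just the difference between the observed packet and the modeled payload, with no claim about what those bytes contain.

```python
BLOCK_SIZE = 8192
THRESHOLD = 1024  # default _gc_fusion_compression quoted in this note

# (label, free space in bytes, observed packet size in bytes)
observations = [
    ("test #2, 10 rows", 0x1b44, 1680),
    ("test #3, 50 rows", 2620, 5776),
    ("nearly full block", 0x36c, 8336),
]


def summarize(free_space, packet_size):
    """Return (used bytes, reduced?, observed packet minus modeled payload)."""
    used = BLOCK_SIZE - free_space
    reduced = free_space > THRESHOLD
    modeled_payload = used if reduced else BLOCK_SIZE
    return used, reduced, packet_size - modeled_payload


for label, free_space, packet_size in observations:
    used, reduced, extra = summarize(free_space, packet_size)
    print(f"{label}: used={used} reduced={reduced} "
          f"packet={packet_size} extra={extra}")
```

The extra bytes are not constant across the three cases, so the model is only a rough approximation. It does capture the main effect: packets shrink roughly in step with free space until the threshold is crossed.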
More test cases
So, what happens if you delete rows in the block? Remember that rows are not physically deleted, just tagged with a D flag in the row directory, and so the free space information remains the same. Even if you delete 90% of the rows in the block, the avsp field is not updated until block defragmentation happens. This means that deleting rows alone will still result in a whole-block transfer until the block is defragmented.
# After deletion of nearly all rows in the block.
fsbo=0x96 fseo=0x402 avsp=0x36c
I increased the value of the _gc_fusion_compression parameter to 4096, then to 8192, and repeated the tests. The behavior is confirmed: when I set this parameter to 8192, the transfer of a block with just one row resulted in a packet size of 8336, meaning this optimization simply did not kick in (as the free space in the block will never be greater than 8192).
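The effect of raising the threshold can be illustrated by counting how many free-space values would still qualify. This is a sketch of the rule as described in this note; the function name is hypothetical, and the loop covers every value from 0 to the block size even though a real block can never be entirely free.

```python
BLOCK_SIZE = 8192


def is_reduction_candidate(free_space, threshold):
    """Hypothetical check: free space must exceed the threshold."""
    return free_space > threshold


def qualifying_values(threshold, block_size=BLOCK_SIZE):
    """Count free-space values in 0..block_size that exceed the threshold."""
    return sum(is_reduction_candidate(fs, threshold)
               for fs in range(block_size + 1))


for threshold in (1024, 4096, 8192):
    print(f"threshold={threshold}: {qualifying_values(threshold)} values qualify")
```

With the threshold equal to the block size, no free-space value can exceed it, which matches the full-size 8336-byte packet seen in the test.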
This note is meant to improve the understanding of cache fusion traffic; it is not a recommendation to change the parameter. This parameter is better left untouched.
This is a very cool optimization feature, useful in data warehouse databases with a 32K block size. I am not sure in which version this optimization was introduced, though.