Oracle database internals by Riyaj

Discussions about Oracle performance tuning, RAC, Oracle internal & E-business suite.

Do you need asmlib?

Posted by Riyaj Shamsudeen on August 29, 2012

Several of my clients have asked about ASMLIB support in RHEL6 as they gear up to upgrade their database servers to RHEL6. There is some controversy about ASMLIB support in RHEL6. As usual, I will discuss only the technical details in this blog entry.

ASMLIB applies only to the Linux platform.

Now, you might ask: why bother, and why not just use OEL and UEK? Well, not every Linux server is used as a database server. In a typical company, there are hundreds of Linux servers, and just a few percent of them are used as database servers. Linux system administrators prefer to keep one flavor of Linux distribution for ease of management, so asking clients to change the distribution from RHEL to OEL or from OEL to RHEL is not always a viable option.

Do you need to use ASMLIB in Linux?

The short answer is No. The long answer is possibly No. ASMLIB is an optional support library that eases the administration of ASM devices. It is especially helpful while adding new devices to the nodes in a cluster. ASMLIB essentially stamps the devices, so they are easily visible on the other nodes of a cluster at the next ASMLIB scandisks. ASMLIB also provides device persistence, which is the important benefit of ASMLIB (see the discussion below for more details about device persistence).

But how many times do you add disks to the servers? How many times do we change the server or disk architecture in a given year? In my opinion, ASMLIB is an additional software layer. It is possible to set up RAC without ASMLIB, and that is the subject of this blog entry.

Problem definition

The problem with devices in Linux is that a device name can change after a server reboot. That means ASM might not come up, since the device names may no longer match the asm_diskstring parameter after the reboot. Since OCR and voting disks can be stored in ASM from 11.2 onwards, debugging GI startup problems is especially painful if device persistence is not set up properly.

How do you resolve that?

Option #1: UDEV only

UDEV eliminates the device persistence problem and provides the ability to create user-defined aliases and set device permissions. While it might seem like another new concept to learn, UDEV is quite easy to use. I will explain a basic UDEV setup; I am not covering the numerous options available under UDEV, just the necessary items.

During server startup, when the kernel detects a device (or when a new device is added), the kernel sends an event to the udevd daemon. udevd uses rules to match the incoming event and takes action depending upon the rule (such as removing or adding a device node). udevd rules have many attributes; one of them allows an arbitrary program to be called from the rule to process incoming events, and that program's output can be used to assign human-friendly device names. Essentially, to set up udev we need to write udev rules.

(Since you are probably a DBA if you are reading this blog, it is easier to imagine a UDEV rule as a function: it accepts the scsi_id and sets up human-friendly aliases.)
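To make the function analogy concrete, here is a minimal Python sketch of what a rule does. It is only a model, not how udevd is implemented: the names RULES and apply_rule and the layout of the event dictionary are made up for illustration. The SCSI id, alias, owner, group, and mode are the example values used later in this entry.

```python
from fnmatch import fnmatch

# Hypothetical rule table: SCSI id (as returned by scsi_id) -> desired device setup.
RULES = {
    "3600254567259abde00006000004c0000": {
        "name": "asmcrs01",
        "owner": "grid",
        "group": "oinstall",
        "mode": "0660",
    },
}


def apply_rule(event, rules=RULES):
    """Model of a udev rule: return the device setup for a matching event, else None."""
    # Match conditions, comparable to KERNEL=="sd*" and BUS=="scsi" in a rule.
    if not fnmatch(event["kernel"], "sd*") or event["bus"] != "scsi":
        return None
    # A real rule would call the scsi_id program here; this model is handed the id.
    return rules.get(event["scsi_id"])
```

The point of the model is that the lookup key is the SCSI id and not the kernel name: an event for sdd before a reboot and an event for sdx after it both map to asmcrs01, as long as the id is the same.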

Method to set up a udev rule

  1. Modify /etc/scsi_id.config and add the following line at the end. Essentially, with this option UDEV will assume that all SCSI devices provide unique UUIDs.

    options=-g

  2. Identify the unique SCSI ID with the scsi_id command, which returns a unique value for a given SCSI device. (This unique value is also called the UUID or WWID if you use SAN arrays such as EMC or Hitachi.) The size of the device can be identified with the blockdev command.

    (On RHEL6, use -gud as the options for the scsi_id command. It looks like option -s was replaced by -d.)

    For example, for the device /dev/sdd:
    # /sbin/scsi_id -gus /block/sdd
    3600254567259abde00006000004c0000

    To get device size, use blockdev (output in bytes):
    # /sbin/blockdev --getsize64 /dev/sdd

  3. Now, set up the rules file: vi /etc/udev/rules.d/99-asmdevices.rules
  4. Add a rule for the above device. A rule is essentially if-then logic: if the conditions in the rule are satisfied for an event, the actions in the rule are taken to set up the device. Refer to the rule printed below. If the event is for a SCSI device (KERNEL and BUS attributes), then call the /sbin/scsi_id -g -u -s program, passing the block device as the first argument (PROGRAM attribute and %p in the rule definition). If the RESULT of the program call matches the value 3600254567259abde00006000004c0000 (RESULT attribute in the rule), then create a device entry named "asmcrs01" with owner grid, group oinstall, and permissions 0660.

    (A rule must be on a single line. Make sure that the rule does not wrap around in the .rules file.)

    KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="3600254567259abde00006000004c0000", NAME="asmcrs01", OWNER="grid", GROUP="oinstall", MODE="0660"

  5. Test the rules using udevtest:

    # udevtest /block/sdd

    This would show that udev might create three symlinks: one named /dev/asmcrs01, one in /dev/disk/by-id/, and a third in /dev/disk/by-path/. We will use /dev/asmcrs01 for the ASM setup.

  6. Reload the rules and start udev. This should create /dev/asmcrs01.

    # /sbin/udevcontrol reload_rules
    # /sbin/start_udev

  7. Now, set the asm_diskstring parameter to '/dev/asmcrs*' so that ASM will identify these devices. Repeat the above steps for all devices that you are planning to add to ASM. You could also run start_udev just once, after all rules have been set up.
  8. Once you are happy with the setup on one node, copy the file /etc/udev/rules.d/99-asmdevices.rules to all nodes of the RAC cluster and restart udev.
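Since the rules for different devices differ only in the SCSI id and the alias, the rules file can be generated instead of typed by hand, which also avoids accidental line wraps. The following Python sketch is one possible way to do that. The function names udev_rule and rules_file are hypothetical, the rule text follows the RHEL5-style syntax of the example rule above (the scsi_id options would need adjusting for RHEL6, as noted earlier), and the defaults assume the grid:oinstall ownership used in this entry.

```python
def udev_rule(wwid, alias, owner="grid", group="oinstall", mode="0660"):
    """Build one rule as a single line, in the syntax of the example rule above."""
    return (
        'KERNEL=="sd*", BUS=="scsi", '
        'PROGRAM=="/sbin/scsi_id -g -u -s %p", '
        f'RESULT=="{wwid}", NAME="{alias}", '
        f'OWNER="{owner}", GROUP="{group}", MODE="{mode}"'
    )


def rules_file(devices):
    """Return rules file text for a {wwid: alias} mapping, one rule per line."""
    ordered = sorted(devices.items(), key=lambda item: item[1])  # sort by alias
    return "\n".join(udev_rule(wwid, alias) for wwid, alias in ordered) + "\n"


if __name__ == "__main__":
    # Example ids and aliases taken from this entry; replace with your own.
    print(rules_file({
        "3600254567259abde00006000004c0000": "asmcrs01",
        "3601213101259abde0000600000500000": "asmcrs02",
    }), end="")
```

The output can be redirected into /etc/udev/rules.d/99-asmdevices.rules and then tested with udevtest as described in the steps above.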

Option #2: Multipathing feature

The multipathing feature provides fault tolerance for paths to storage devices and uses the device mapper framework to map block devices to aliases. Even if you have just one path to the device, you can still set up this feature. I prefer this method at this time, as it provides easier migration to multipathed devices in the future.

The setup is very similar to UDEV. Here are the step-by-step instructions.

  1. Verify that the device mapper rpm version is compatible.

    $ rpm -qa|grep device-mapper
    device-mapper-multipath-0.4.7-46.el5

  2. Verify and configure devices. Verify that all SCSI devices are seen on all nodes. Note that some devices will be seen multiple times through different HBAs. Identify the SCSI devices for the database.

    # lsscsi
    # fdisk -l

  3. Modify /etc/scsi_id.config and add the following line at the end. This makes scsi_id assume that all SCSI devices will provide a unique SCSI ID.

    options=-g

  4. Identify the unique SCSI ID with the scsi_id command, which returns a unique value for a given SCSI device. (This ID is also called the UUID or WWID if you use SAN arrays such as EMC or Hitachi.) The size of the device can be identified with the blockdev command.

    (On RHEL6, use -gud as the options for the scsi_id command. It looks like option -s was replaced by -d.)

    For example, for the device /dev/sdd:
    # /sbin/scsi_id -gus /block/sdd
    3600254567259abde00006000004c0000

    To get device size, use blockdev (output in bytes):
    # /sbin/blockdev --getsize64 /dev/sdd

  5. Edit /etc/multipath.conf (of course, take a backup of the file first).

    a. Comment out this stanza.

    # Blacklist all devices by default. Remove this to enable multipathing
    # on the default devices. 
    #blacklist {
    #        devnode "*"
    #}
    

    b. Blacklist all local devices. Devices such as raw, loop, floppy disk etc. do not need to have multipathing configured. (Remember that if you use raw devices, you need to modify this procedure a little bit, as we are blacklisting raw devices here.)

    # Blacklist all local devices

    blacklist {
            devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
            devnode "^hd[a-z][[0-9]*]"
            devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
            devnode "dasd[a-z]+[0-9]*"
    }
    

    c. Add this stanza, which is specific to the SAN array.

    ##
    ## Essentially, you can set up attributes specific to the disk array.
    ## This would require you to check the vendor documentation.
    ## In this case, we are setting up an HP HSV200/300 array.
    ## This stanza defines how multipathing should behave. It is specific to a disk array,
    ## but default values can be used too.
    devices {
            device {
                    vendor "HP"
                    product "HSV2[01]0|HSV300|HSV4[05]0"
                    getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
                    prio_callout "/sbin/mpath_prio_alua /dev/%n"
                    hardware_handler "0"
                    path_selector "round-robin 0"
                    path_grouping_policy group_by_prio
                    failback immediate
                    rr_weight uniform
                    no_path_retry 18
                    rr_min_io 100
                    path_checker tur
            }
    }
    

    d. Add the following stanza. Within the multipaths block, each device gets its own multipath block. Notice that the UUID we got from the scsi_id command earlier is used in the first multipath block, for the device asmcrs01: we want the SCSI device with UUID 3600254567259abde00006000004c0000 to have the name asmcrs01. This stanza will allow device mapper to create a symlink named /dev/mapper/asmcrs01 for that device. Device mapper also sets up permissions using the uid/gid combination. (Use the uid/gid combination matching your environment.) In this example, uid 1100 is grid and gid 1000 is oinstall. So, the first multipath block will create a device named /dev/mapper/asmcrs01 owned by grid:oinstall with 0660 permissions (i.e. read and write for owner and group, no permissions for others).

    I am also setting up multiple devices below to provide an example.

    
    ##
    ## Multipathing for SCSI devices from storage array
    ##
    multipaths {
      multipath {
        wwid 3600254567259abde00006000004c0000
        alias asmcrs01
        uid 1100
        gid 1000
        mode 660
       }
      multipath {
        wwid 3601213101259abde0000600000500000
        alias asmcrs02
        uid 1100
        gid 1000
        mode 660
       }
    …
      multipath {
        wwid 3601212111259abde0000600000c00000
        alias asmdev15
        uid 1100
        gid 1000
        mode 660
       }
    }
    

    e. Enable multipath daemons and make sure that they are enabled at startup.

    # modprobe dm-multipath
    # service multipathd start
    # multipath -d
    # multipath -v2
    create: asmcrs01 (3600254567259abde00006000004c0000) HP,HSV300
    [size=2.0G][features=0][hwhandler=0][n/a]
    \_ round-robin 0 [prio=100][undef]
    \_ 1:0:0:2 sdap 66:144 [undef][ready]
    \_ 0:0:1:2 sdv 65:80 [undef][ready]
    \_ round-robin 0 [prio=10][undef]
    \_ 1:0:1:2 sdbj 67:208 [undef][ready]
    ..
    # chkconfig multipathd on
    # chkconfig --list multipathd
    multipathd 0:off 1:off 2:on 3:on 4:on 5:on 6:off

    f. Copy /etc/multipath.conf to all remaining nodes in the DB cluster. Repeat step e (starting and enabling the multipath daemons) on all nodes.

    g. At this point, we have set up the /dev/mapper/asm* entries. asm_diskstring should be set to match /dev/mapper/asm*.
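With many devices, the multipaths stanza from step d becomes long and repetitive, and a duplicated WWID or alias is easy to miss. As a rough sketch, the stanza could be generated from a list of (WWID, alias) pairs. The function names below are hypothetical, the attribute names are the ones used in the stanza above, and the default uid/gid/mode are the example values from this entry (grid=1100, oinstall=1000, 660); adjust them for your environment.

```python
def multipath_entry(wwid, alias, uid=1100, gid=1000, mode="660"):
    """Build one multipath block with the attributes used in step d."""
    return (
        "  multipath {\n"
        f"    wwid {wwid}\n"
        f"    alias {alias}\n"
        f"    uid {uid}\n"
        f"    gid {gid}\n"
        f"    mode {mode}\n"
        "  }\n"
    )


def multipaths_stanza(devices):
    """Build the multipaths stanza for a list of (wwid, alias) pairs.

    Raises ValueError if a WWID or an alias is listed more than once.
    """
    seen_wwids, seen_aliases = set(), set()
    for wwid, alias in devices:
        if wwid in seen_wwids:
            raise ValueError(f"duplicate wwid: {wwid}")
        if alias in seen_aliases:
            raise ValueError(f"duplicate alias: {alias}")
        seen_wwids.add(wwid)
        seen_aliases.add(alias)
    body = "".join(multipath_entry(wwid, alias) for wwid, alias in devices)
    return "multipaths {\n" + body + "}\n"
```

Generating the stanza once and copying the resulting multipath.conf to every node keeps the aliases identical across the cluster, which is what asm_diskstring relies on.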

In essence, we can use either UDEV or the multipathing facility to implement device persistence and permissions without requiring ASMLIB to be set up.

Update 1:
In RHEL6/OEL6, uid/gid permissions set through multipath.conf do not work (even though the documentation supports these attributes). You can overcome the issue with a udev rule.
For example:

# dmsetup ls|grep p1
asmcrs11p1 (253, 23)
asmcrs01p1 (253, 15)
asmcrs02p1 (253, 14)

# cat /etc/udev/rules.d/12-dm-permissions.rules
ENV{DM_NAME}=="asmcrs01p1", OWNER:="oracle", GROUP:="oinstall", MODE:="660"
ENV{DM_NAME}=="asmcrs02p1", OWNER:="oracle", GROUP:="oinstall", MODE:="660"
ENV{DM_NAME}=="asmcrs03p1", OWNER:="oracle", GROUP:="oinstall", MODE:="660"
ENV{DM_NAME}=="asmcrs04p1", OWNER:="oracle", GROUP:="oinstall", MODE:="660"
ENV{DM_NAME}=="asmcrs05p1", OWNER:="oracle", GROUP:="oinstall", MODE:="660"
ENV{DM_NAME}=="asmcrs06p1", OWNER:="oracle", GROUP:="oinstall", MODE:="660"
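Because the multipath.conf permissions can silently fail to apply, it may be worth verifying the resulting ownership and mode of each device after a reboot. The sketch below is one way to check this from Python. The function name check_device is hypothetical, and it compares numeric uid/gid values, so you need to supply the ids for your own owner and group (for example, as reported by the id command).

```python
import os
import stat


def check_device(path, uid, gid, mode=0o660):
    """Return a list of mismatches between a device node and the expected settings."""
    st = os.stat(path)  # follows symlinks, so an alias resolves to the real node
    problems = []
    if st.st_uid != uid:
        problems.append(f"{path}: owner uid is {st.st_uid}, expected {uid}")
    if st.st_gid != gid:
        problems.append(f"{path}: group gid is {st.st_gid}, expected {gid}")
    actual_mode = stat.S_IMODE(st.st_mode)  # permission bits only
    if actual_mode != mode:
        problems.append(f"{path}: mode is {actual_mode:o}, expected {mode:o}")
    return problems
```

An empty list from a call such as check_device("/dev/mapper/asmcrs01p1", uid, gid) means the device matches the expected settings; any returned strings describe what differs.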

8 Responses to “Do you need asmlib?”

  1. Vyacheslav Rasskazov said

    Thank you for useful article.
    I think it needs to be said that the uid, gid, and mode attributes in /etc/multipath.conf no longer work on RHEL/OEL 6.
    Device persistence is not a mandatory condition for ASM, because ASM reads disk headers, and an asm_diskstring like /dev/mapper will be OK. With a multipath configuration, device persistence can also be achieved by setting user_friendly_names to "no".

  2. nilesh nayak said

    Very Nice

  3. Hi Riyaj, please take a look on https://blogs.oracle.com/wim/entry/asmlib. Wim mentions that (if i understood properly) there is ongoing activity to have DIF/DIX/T10 storage validation done partially by ASMLib. I think it is something close to the old H.A.R.D. initiative (end-to-end data validation from RDBMS down to single drive). If that is true, then I think thing is going to be the killer feature of ASMLib+UEK combo … There is additional (older) presentation here https://oss.oracle.com/~mkp/docs/lpc08-data-integrity.pdf . The sad part about ASMLib is that it appears that Oracle officially is recommending shutting down whole CRS stack & unloading ASMLib on all nodes when you are doing kernel upgrades…

  4. Hi
    Which option do I have to use in RHEL 6? Is /etc/multipath.conf still valid on RHEL 6?

  5. Thank you Riyaj… very good!!!

  6. Hi Riyaj!
    Nice article, however I would beg to differ. SCSI interface to the SATA disks, which is what UDEV does, and consequent using of Jorg Schiller’s libsg3 interface to do I/O on SCSI devices was invented in order to support today largely forgotten CD burners. The problem is, however, that this doubles the amount of interrupts needed to perform IO and therefore slows things down. I would recommend ASMLib, as it uses raw interface and doesn’t require additional interrupts.

    • Hello Mladen
      Thanks for reading my blog. Is there any documentation suggesting the increase in interrupts? Would you mind sharing more details about this increase in interrupts?
      Even if the udev/device mapper setup doubles interrupts, I doubt that interrupts cause noticeable, or even measurable, performance issues with a udev/device mapper/SCSI setup. I would be glad to see any valid performance benchmark comparing ASMLIB with a udev/device mapper setup. I have heard stories of a few vendors performing benchmarks comparing these two setups, but no valid benchmark results were ever released.
      Professionally, I have set up many high-end Linux clusters using udev/device mapper (without ASMLIB) supporting an enormous amount of workload, and I have yet to see an issue due to the udev/device mapper setup. Further, the device mapper setup provides a) clean role separation, as DBAs typically don't need to be involved in disk setup, b) no worries about Linux vendor support, and c) no worries about upgrading ASMLIB libraries during a Linux upgrade.
      So, no, I don't consider interrupts a big enough reason not to consider device mapper, given the huge operational convenience that it provides. I guess we will agree to disagree :)
      Cheers
      Riyaj

  7. […] Oracle has its own resolution for generating block device files especially for ASM, called ASMLib. This indeed works well, but how Oracle did to generate the devices, and how to troubleshoot the generation of the devices is for me utterly confusing, and i have not found any documentation on this. I’ve managed to mess my lab servers a few times, which is the reason I don’t like ASMLib, and do not feel comfortable with ASMLib. More on that can be read on ORAinternals/Riyaj Shamsudeen […]
