
How to Control Access to Shared Storage Devices Using SCSI Persistent Reservations with Red Hat Enterprise Linux

Clustering and High Availability


Author: Ryan O'Hara | Editor: Allison Pranger | 11/29/2010

OVERVIEW
It is necessary to control access to storage devices when they are shared by cluster nodes. In the event of a node failure, the failed node should not have access to the underlying storage devices. SCSI persistent reservations provide the capability to control the access of each node to shared storage devices. Red Hat Enterprise Linux Clustering and High Availability employ SCSI persistent reservations as a fencing method through the use of the fence_scsi agent, which provides a way to revoke access to shared storage devices, provided that the storage devices support SCSI persistent reservations. SCSI reservations differ significantly from traditional power fencing methods. This document describes the software, hardware, and configuration requirements for using SCSI reservations as a fencing method.

Environment
- Red Hat Cluster Suite 4.5+
- Red Hat Enterprise Linux 5+ Advanced Platform (Clustering and GFS/GFS2)
- Red Hat Enterprise Linux 6+ with the High Availability Add-On or Resilient Storage Add-On

CONCEPTS
In order to understand how Red Hat Enterprise Linux is able to use SCSI persistent reservations as a fencing method, it is helpful to have some basic knowledge of SCSI persistent reservations. Three of the most important concepts are registrations, reservations, and fencing.

Registrations
A registration occurs when a node registers a unique key with a device. A device can have many registrations. For the purposes of this document, each node will create a registration on each device.

Reservations
A reservation dictates how a device can be accessed. In contrast to registrations, there can be only one reservation on a device at any time. The node that holds the reservation is known as the reservation holder. The reservation defines how other nodes may access the device. For example, fence_scsi uses a Write Exclusive, Registrants Only reservation, which indicates that only nodes that have registered with that device may write to the device.

Using SCSI Persistent Reservations with Red Hat Enterprise Linux | Ryan OHara 1

Fencing
Red Hat Enterprise Linux is able to perform fencing via SCSI persistent reservations by removing a node's registration key from all devices. When a node failure occurs, the fence_scsi agent removes the failed node's key from all devices, preventing it from being able to write to those devices.
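The three concepts above can be traced with a small in-memory model. This is only an illustrative sketch: the Device class, its method names, and the key values are hypothetical and do not talk to real SCSI hardware. It assumes a Write Exclusive, Registrants Only reservation, as described for fence_scsi.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Device:
    """Toy stand-in for one shared LUN; not a real SCSI interface."""
    name: str
    registrations: Set[int] = field(default_factory=set)  # many keys allowed
    holder: Optional[int] = None                           # at most one reservation

    def register(self, key: int) -> None:
        self.registrations.add(key)

    def reserve(self, key: int) -> None:
        # Only a registered key may take the reservation, and only if the
        # device is not already reserved by a different key.
        if key not in self.registrations:
            raise PermissionError("key is not registered")
        if self.holder not in (None, key):
            raise PermissionError("device already reserved")
        self.holder = key

    def can_write(self, key: int) -> bool:
        # Write Exclusive, Registrants Only: once a reservation exists,
        # writes are accepted only from registered keys.
        return self.holder is None or key in self.registrations

    def fence(self, victim_key: int) -> None:
        # Fencing removes the victim's registration from this device.
        self.registrations.discard(victim_key)


NODE1, NODE2 = 0x1, 0x2                      # hypothetical node keys
devices = [Device("lun0"), Device("lun1")]   # hypothetical shared LUNs

for dev in devices:
    dev.register(NODE1)
    dev.register(NODE2)
    dev.reserve(NODE1)        # node 1 becomes the reservation holder

for dev in devices:
    dev.fence(NODE2)          # node 2 fails and is fenced on every device

print([dev.can_write(NODE1) for dev in devices])  # [True, True]
print([dev.can_write(NODE2) for dev in devices])  # [False, False]
```

Note that the key must be removed from every device: a node that stays registered on even one LUN can still write to that LUN.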

REQUIREMENTS
This section describes software and storage requirements for use of SCSI persistent reservations as a fencing method.

Software Requirements
Use of SCSI persistent reservations as a fencing method is supported in the following environments:
- Red Hat Cluster Suite 4.5+
- Red Hat Enterprise Linux 5+ Advanced Platform (Clustering and GFS/GFS2)
- Red Hat Enterprise Linux 6+ with the High Availability Add-On or Resilient Storage Add-On
In addition, the sg3_utils package must be installed. This package provides the tools required to manage SCSI persistent reservations.

Storage Requirements
All shared storage must be SPC-3 compliant and support the preempt-and-abort sub-command. SCSI-2 devices are not supported.
NOTE: For the Red Hat Cluster Suite 4.5+ or Red Hat Enterprise Linux 5+ Advanced Platform, all shared storage must use LVM2 cluster volumes.

If you are unsure whether your cluster and shared storage environment meets these requirements, a script is available to determine whether your shared storage devices are capable of using SCSI persistent reservations.

LIMITATIONS
In addition to the above requirements, fencing by way of SCSI persistent reservations also has some limitations:
- Multipath devices are currently supported only for Red Hat Enterprise Linux 5+ Advanced Platform and later with the use of device-mapper-multipath.
- Use with HA-LVM is not supported.
- All nodes in the cluster must have a consistent view of storage, meaning that all nodes in the cluster must register with the same devices. This limitation exists because each node must be able to remove another node's registration key from all the devices with which it is registered. In order to do this, the node performing the fencing operation must be aware of all devices with which other nodes are registered.
- Devices used for the cluster volumes should be complete LUNs and not partitions. SCSI persistent reservations work on an entire LUN, meaning that access is controlled to each LUN, not individual partitions.
- In Red Hat Enterprise Linux 5.4.z and later, the fence_scsi agent can be used with two-node clusters.
- In Red Hat Enterprise Linux 5.4.z and later, the fence_scsi agent may be used with qdiskd. More information on this limitation can be found in the qdisk man page. If fence_scsi is used in conjunction with qdisk, the qdisk device must not be controlled by fence_scsi. If fence_scsi is configured to do automatic device detection, then the qdisk device must not be under clvmd control. If fence_scsi uses manually defined devices, the qdisk device must not be listed.

RED HAT ENTERPRISE LINUX 4.5+ CLUSTER SUITE AND RED HAT ENTERPRISE LINUX 5+ ADVANCED PLATFORM
This section describes components and configuration for SCSI persistent reservations in the Red Hat Enterprise Linux 4.5+ Cluster Suite and Red Hat Enterprise Linux 5+ Advanced Platform. For similar information regarding Red Hat Enterprise Linux 6+, see the Red Hat Enterprise Linux 6+ High Availability section below.

Components
Red Hat Enterprise Linux 5+ Advanced Platform provides three components (scripts) that can be used in conjunction with SCSI persistent reservations: fence_scsi_test, scsi_reserve init script, and fence_scsi. More information on each script is below.

The fence_scsi_test Script (Red Hat Enterprise Linux 5.5 and Earlier)
The fence_scsi_test script will find all devices visible to a node and report on whether those devices are compatible with SCSI persistent reservations. In Red Hat Enterprise Linux 5.5 and earlier, the fence_scsi_test script only tests if a node can create a registration with the devices. This does not completely test if storage is capable of being used with the fence_scsi agent. Specifically, fence_scsi_test does not check for support of the preempt-and-abort sub-command, which is required.

Two modes are available, and you must explicitly state which mode to use by using the appropriate command-line option:
- Cluster Mode: Specified with the -c flag, this mode is intended for use with an existing cluster environment. This mode will discover all LVM2 cluster volumes and extract the devices within those volumes. Only devices that exist within LVM2 cluster volumes will be tested.
- SCSI Mode: Specified with the -s flag, this mode is intended to test all SCSI devices visible to the node. This is useful when planning the cluster volume configuration. This mode will test all SCSI devices found in the /sys/block/ directory, which may include local SCSI devices.
In both modes, the script will test found devices for compatibility by attempting to register with the devices. Successful registration indicates that the device is capable of performing SCSI persistent reservations. If registration is successful, the script will remove the registration.
NOTE: If fence_scsi_test is run in Cluster Mode and reports devices that have failed the test, you must not use fence_scsi as your fencing method. If fence_scsi_test is run in SCSI Mode and reports failures for devices, those devices must not be used for shared storage (LVM2 cluster volumes) if you wish to use fence_scsi as a fencing method.

The fence_scsi_test Script (Red Hat Enterprise Linux 5.6 and Later)
In Red Hat Enterprise Linux 5.6 and later, the fence_scsi_test script is capable of performing more thorough testing. Specifically, you can use the fence_scsi_test script to create registrations and reservations, as well as to test the preempt-and-abort sub-command. You can test without using LVM2 cluster volumes, but they are still required for use with fence_scsi itself. In addition, testing with the fence_scsi_test script requires at least two nodes to be connected to shared storage.

To create registrations, run the following command from a node (node #1) that is connected to shared storage:

    % fence_scsi_test -o on -k 123 -d /dev/sdb,/dev/sdc

In this example, the -k option specifies the key value to be used. Any key value may be used as long as it is not already being used by a different node. The -d option specifies the devices on which to create registrations (you can specify a comma-separated list of devices). If the -d option is not specified, the fence_scsi_test script will attempt to create registrations with all LVM2 cluster volumes.

On a different node (node #2) that is attached to the same shared storage devices, run another command to register with the same devices:

    % fence_scsi_test -o on -k 456 -d /dev/sdb,/dev/sdc

To remove registrations, as would be done when fencing occurs via the fence_scsi agent, run the following command from a node (node #2):

    % fence_scsi_test -o off -k 123 -d /dev/sdb,/dev/sdc

In this example, the -k option specifies the key that we want to remove. Note that this command is run on node #2 and is attempting to remove the key used by node #1. This simulates fencing via the fence_scsi agent. Again, the -d option specifies the devices on which to perform this operation. If this command is successful, the key used to register node #1 (123) should be removed from devices /dev/sdb and /dev/sdc.

When testing is complete, you can clear all registrations by using clear:

    % fence_scsi_test -o clear -d /dev/sdb,/dev/sdc

Note that this command must be run from a node that is still registered with the devices. A node cannot clear registrations from a device if it is not registered with that device. In this example, only node #2 is registered with the devices, so this command must be run from node #2. See the fence_scsi_test man page for more information.

The scsi_reserve Init Script
When enabled, the scsi_reserve script handles creation of registrations and reservations at system startup.


Once you have verified that your cluster storage is compatible and meets the requirements necessary to use fence_scsi, you can enable the scsi_reserve init script with the following command:

    % chkconfig scsi_reserve on

The scsi_reserve init script will first generate the node's unique key. This key is based on the cluster ID and the node ID, so it is guaranteed to be unique. The next step in the scsi_reserve script depends on which parameter is used: start, stop, or status. Each option requires that the cluster manager (CMAN) be running to extract information about the cluster and the individual node.

scsi_reserve start: Running the scsi_reserve init script with the start option will create registrations on all devices that were previously discovered. If necessary, it will also create the reservation. The script will report success or failure. Success indicates that the node was capable of registering with all devices that were discovered. Failure indicates that the script was unable to register with one or more devices. Should a failure occur, the cluster has no way of completely fencing a node in the event of a node failure.
NOTE: The scsi_reserve start script should be run before mounting the file system. If you already have a file system mounted and then create a reservation on any of the devices used by that file system, any node that is not registered with those devices will be unable to write to the file system.

scsi_reserve stop: When scsi_reserve is run with the stop option, it will attempt to remove the node's registration key from all the devices with which it registered at startup. Removing the registration is only a problem if that node is also the reservation holder and other nodes are still registered with the device(s). If that is the case, the node will not be able to remove its registration since doing so would also release the reservation. The script will report failure when attempting to remove a node's registration if it is the reservation holder and other registrations exist.

scsi_reserve status: When the scsi_reserve script is run with the status option, it will list the devices with which the node is registered.

The fence_scsi Script
The fence_scsi script is the actual fence agent that is run when node failure occurs. This script is typically invoked by the fence domain and not manually run. The fence_scsi script removes the specified node's registrations from all devices, preventing write access. If the node being fenced is also the reservation holder, the node that is performing the fence operation will become the new reservation holder. Using this script will not remove the node from the cluster.
NOTE: If the node being fenced has the file system mounted, removing its registrations prevents the node from accessing the file system. This sudden inability to access the devices upon which the file system exists might result in I/O errors and a subsequent withdraw from the file system. This behavior is expected.
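The two reservation-holder rules described above (stop is refused for a holder while other registrations exist, and fencing the holder hands the reservation to the fencing node) can be sketched for a single device as follows. The function names and data layout are hypothetical; the real scripts implement this through SCSI commands rather than Python sets.

```python
def stop(registrations, holder, key):
    """Model of `scsi_reserve stop` on one device.

    Returns the new (registrations, holder) pair, or raises if the node is
    the reservation holder while other registrations still exist.
    """
    others = registrations - {key}
    if holder == key and others:
        raise RuntimeError("holder cannot unregister while others are registered")
    new_holder = None if holder == key else holder
    return others, new_holder


def fence(registrations, holder, victim, fencer):
    """Model of fence_scsi on one device.

    Removes the victim's registration; if the victim held the reservation,
    the fencing node becomes the new reservation holder.
    """
    remaining = registrations - {victim}
    new_holder = fencer if holder == victim else holder
    return remaining, new_holder


regs, holder = {0x1, 0x2, 0x3}, 0x1   # node 1 holds the reservation

try:
    stop(regs, holder, 0x1)
    stop_refused = False
except RuntimeError:
    stop_refused = True               # other nodes are still registered

regs, holder = fence(regs, holder, victim=0x1, fencer=0x2)
print(stop_refused, sorted(regs), holder)  # True [2, 3] 2
```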

Configuration
Below is a sample configuration (cluster.conf) for a cluster that uses SCSI persistent reservations as its fencing method. Note that each node defines its fence device and passes its node name to the agent via the node parameter. Each node explicitly defines its nodeid. This is required for all clusters that use fence_scsi as the fencing method so that the various SCSI reservation scripts can predictably generate the node's unique registration key.

    <?xml version="1.0"?>
    <cluster config_version="1" name="my_cluster">
      <fence_daemon post_fail_delay="0" post_join_delay="30"/>
      <clusternodes>
        <clusternode name="node-01" votes="1" nodeid="1">
          <fence>
            <method name="scsi">
              <device name="fence_dev" node="node-01"/>
            </method>
          </fence>
        </clusternode>
        <clusternode name="node-02" votes="1" nodeid="2">
          <fence>
            <method name="scsi">
              <device name="fence_dev" node="node-02"/>
            </method>
          </fence>
        </clusternode>
        <clusternode name="node-03" votes="1" nodeid="3">
          <fence>
            <method name="scsi">
              <device name="fence_dev" node="node-03"/>
            </method>
          </fence>
        </clusternode>
      </clusternodes>
      <cman cluster_id="1234"/>
      <fencedevices>
        <fencedevice agent="fence_scsi" name="fence_dev"/>
      </fencedevices>
      <rm>
        <failoverdomains/>
        <resources/>
      </rm>
    </cluster>
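The two rules stated for this configuration (every clusternode carries an explicit nodeid, and each fence_scsi device entry passes the node's own name in the node parameter) lend themselves to a quick sanity check before the file is deployed. The following is a minimal sketch using Python's standard XML parser; check_cluster_conf is a hypothetical helper, not a Red Hat tool, and the sample input is deliberately broken to show what gets reported.

```python
import xml.etree.ElementTree as ET

# Deliberately faulty sample: node-02 lacks nodeid and names the wrong node.
SAMPLE = """<cluster config_version="1" name="my_cluster">
  <clusternodes>
    <clusternode name="node-01" votes="1" nodeid="1">
      <fence><method name="scsi">
        <device name="fence_dev" node="node-01"/>
      </method></fence>
    </clusternode>
    <clusternode name="node-02" votes="1">
      <fence><method name="scsi">
        <device name="fence_dev" node="node-01"/>
      </method></fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_scsi" name="fence_dev"/>
  </fencedevices>
</cluster>"""


def check_cluster_conf(xml_text):
    """Return a list of human-readable problems; an empty list means none found."""
    root = ET.fromstring(xml_text)
    # Names of fence devices that are backed by the fence_scsi agent.
    scsi_devs = {fd.get("name") for fd in root.iter("fencedevice")
                 if fd.get("agent") == "fence_scsi"}
    problems = []
    for node in root.iter("clusternode"):
        name = node.get("name")
        if node.get("nodeid") is None:
            problems.append(f"{name}: missing nodeid")
        for dev in node.findall("./fence/method/device"):
            if dev.get("name") in scsi_devs and dev.get("node") != name:
                problems.append(f"{name}: device node is {dev.get('node')!r}")
    return problems


problems = check_cluster_conf(SAMPLE)
for line in problems:
    print(line)
```

The check only covers the two points named in the text; it does not validate the rest of the cluster.conf schema.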

RED HAT ENTERPRISE LINUX 6+ HIGH AVAILABILITY


This section describes configuration for SCSI persistent reservations in Red Hat Enterprise Linux 6+ with the High Availability Add-On. For similar information regarding earlier versions, see the Red Hat Enterprise Linux 4.5+ Cluster Suite and Red Hat Enterprise Linux 5+ Advanced Platform section above.

Unfencing
In addition to the standard fence section, fence_scsi in Red Hat Enterprise Linux 6+ requires an additional unfence section in each clusternode entry. The unfence section is used at cluster startup to create registrations on all devices. It must contain a device entry that is almost identical to the device entry in the fence section, except the unfence section must have action=on in its device entry.

Below is an example of the fence and unfence sections.

    <clusternode name="node-01" votes="1" nodeid="1">
      <fence>
        <method name="scsi">
          <device name="scsi_dev" key="1"/>
        </method>
      </fence>
      <unfence>
        <device name="scsi_dev" key="1" action="on"/>
      </unfence>
    </clusternode>

In the above example, the node's key parameter is manually defined. If key values are manually defined, they must be defined in both the fence and unfence elements, and the key values must be equivalent. For more information about manually defined key values, see the section below.
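Because a missing or mismatched unfence entry is easy to overlook, a small consistency check can help. The sketch below is a hypothetical helper (check_unfence is not part of any Red Hat package) that verifies, for one clusternode element, that each fence device has an unfence counterpart with action="on" and an equivalent key. It assumes keys are written as hexadecimal strings.

```python
import xml.etree.ElementTree as ET


def _norm(key):
    # Keys are compared by value; a missing key stays None.
    return None if key is None else int(key, 16)


def check_unfence(clusternode_xml):
    """Return a list of problems found in one <clusternode> element."""
    node = ET.fromstring(clusternode_xml)
    unfence = {d.get("name"): d for d in node.findall("./unfence/device")}
    problems = []
    for dev in node.findall("./fence/method/device"):
        name = dev.get("name")
        match = unfence.get(name)
        if match is None:
            problems.append(f"{name}: no matching unfence device")
        elif match.get("action") != "on":
            problems.append(f"{name}: unfence device lacks action=on")
        elif _norm(match.get("key")) != _norm(dev.get("key")):
            problems.append(f"{name}: key differs between fence and unfence")
    return problems


GOOD = """<clusternode name="node-01" votes="1" nodeid="1">
  <fence><method name="scsi"><device name="scsi_dev" key="1"/></method></fence>
  <unfence><device name="scsi_dev" key="1" action="on"/></unfence>
</clusternode>"""

# Same node, but the unfence key was mistyped.
BAD = GOOD.replace('key="1" action="on"', 'key="2" action="on"')

print(check_unfence(GOOD))  # []
print(check_unfence(BAD))   # ['scsi_dev: key differs between fence and unfence']
```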

Keys
SCSI persistent reservations use unique key values for registrations and reservations. These key values can be either generated automatically or manually defined in the cluster.conf file.

To manually define key values, use the key parameter within the device sections for each clusternode entry. This key value must be provided in both the fence and unfence sections of the configuration file, and the value must be the same within each clusternode. Below is an example where keys are defined manually for a node.

    <clusternode name="node-01" votes="1" nodeid="1">
      <fence>
        <method name="scsi">
          <device name="scsi_dev" key="1"/>
        </method>
      </fence>
      <unfence>
        <device name="scsi_dev" key="1" action="on"/>
      </unfence>
    </clusternode>

In the above example, the node will use a key value of 1. Note that the key value is given in the fence and unfence sections. All other node configurations should use the same format, except that they must use different key values. The key value can be any hexadecimal value up to 64 bits.

To have the cluster automatically define key values, do not use the key parameter. When the key parameter is not defined, key values will be generated by combining the cluster_id and the nodeid, which are retrieved using the cman_tool command. This value is guaranteed to be unique and consistent for all nodes in the cluster.
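To make the two key sources concrete, here is a minimal sketch. parse_manual_key enforces the stated constraint (hexadecimal, at most 64 bits). generated_key shows one plausible way of combining cluster_id and nodeid; the 16-bit-plus-16-bit hexadecimal layout is an assumption made for illustration, since the text above only says the two values are combined. Consult the fence_scsi agent shipped with your release for the actual format.

```python
def parse_manual_key(text):
    """Parse a manually defined key: a hexadecimal value of at most 64 bits."""
    value = int(text, 16)            # raises ValueError for non-hex input
    if not 0 <= value < 2 ** 64:
        raise ValueError("key must be a non-negative value that fits in 64 bits")
    return value


def generated_key(cluster_id, nodeid):
    """ASSUMED layout: 16-bit cluster_id followed by 16-bit nodeid, in hex.

    This only illustrates why combining the two values yields a key that is
    unique per node and identical no matter which node computes it.
    """
    if not (0 <= cluster_id < 2 ** 16 and 0 <= nodeid < 2 ** 16):
        raise ValueError("this sketch assumes both values fit in 16 bits")
    return f"{cluster_id:04x}{nodeid:04x}"


# cluster_id 1234 and nodeids 1..3, as in the sample configurations.
keys = [generated_key(1234, n) for n in (1, 2, 3)]
print(keys)                      # ['04d20001', '04d20002', '04d20003']
print(parse_manual_key("1"))     # 1
```

Since every node in a cluster shares the cluster_id and has a distinct nodeid, any injective combination of the two gives distinct keys, which is the property fencing relies on.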

Devices
The devices to be used with fence_scsi can be either discovered automatically or manually configured.

To manually define the devices to be used with fence_scsi, use the devices parameter within the fencedevice section. The devices parameter should be a comma-separated list of block devices. The devices listed here will receive SCSI persistent reservations commands (registrations and reservations), so each device listed must be SPC-3 compliant. Below is an example where devices are defined manually.

    <fencedevices>
      <fencedevice agent="fence_scsi" name="scsi_dev" devices="/dev/disk/by-id/scsi-360019b9000c109fe00000fd44a7824d1, /dev/disk/by-id/scsi-360019b9000c109fe00000fd64a78250f, /dev/disk/by-id/scsi-360019b9000c109fe00000fd84a78254b"/>
    </fencedevices>

In the above example, there are three devices defined to be used with fence_scsi. Each device is identified by its SCSI ID, which maps to the corresponding physical device.
NOTE: Specifying devices by physical device name (for example, /dev/sda) can be problematic and should be avoided because a specific device might have a different name on another node. For example, the device named /dev/sdb on one node might be named /dev/sdc on another node. To avoid this problem, it is recommended that devices be listed by SCSI ID (the /dev/disk/by-id/<id> path).
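When only the local device name is known, the matching by-id path can be looked up by following the symbolic links in /dev/disk/by-id. The helper below is a hypothetical sketch using only the Python standard library; it returns an empty list when the directory does not exist, and the printed result depends on the machine it runs on.

```python
import os


def by_id_paths(device, by_id_dir="/dev/disk/by-id"):
    """Return the sorted by-id symlinks that resolve to `device`."""
    if not os.path.isdir(by_id_dir):
        return []
    target = os.path.realpath(device)
    matches = []
    for entry in os.listdir(by_id_dir):
        link = os.path.join(by_id_dir, entry)
        # Each entry is a symlink to a node such as /dev/sdb.
        if os.path.realpath(link) == target:
            matches.append(link)
    return sorted(matches)


# /dev/sdb is the example device name used in the note above.
print(by_id_paths("/dev/sdb"))
```

A device may have several by-id links; whichever is chosen, the same path should be used in the devices parameter on every node.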

To have the cluster automatically discover the devices to be used with fence_scsi, do not use the devices parameter. In the absence of this parameter, the fence_scsi agent will use all devices that belong to LVM volume groups that have the cluster attribute. All devices within these cluster volume groups will be used, and each device must be SPC-3 compliant.

APTPL
The fence_scsi agent can optionally be configured to use the activate persist through power loss (APTPL) flag. This feature requires that the storage supports APTPL. To enable this feature, set the aptpl parameter in the fencedevice section of the cluster.conf file. Below is an example where the APTPL option is enabled.

    <fencedevices>
      <fencedevice agent="fence_scsi" name="scsi_dev" aptpl="1"/>
    </fencedevices>

Logging
The fence_scsi agent can be configured to write detailed logging information to a specific file. To enable logging to file, set the logfile parameter in the fencedevice section of the cluster.conf file. Below is an example where logging to a file is enabled.

    <fencedevices>
      <fencedevice agent="fence_scsi" name="scsi_dev" logfile="/var/log/cluster/fence_scsi.log"/>
    </fencedevices>
NOTE: The preferred log file is /var/log/cluster/fence_scsi.log since that location contains all other cluster log files.


Examples
Below is a sample configuration for a three-node cluster that uses SCSI persistent reservations as its fencing method. In this example, both keys and devices are manually defined, and the APTPL flag is set in the fencedevice section. The opening cluster element is not shown in this listing.

    <clusternodes>
      <clusternode name="node-01" votes="1" nodeid="1">
        <fence>
          <method name="scsi">
            <device name="scsi_dev" key="1"/>
          </method>
        </fence>
        <unfence>
          <device name="scsi_dev" key="1" action="on"/>
        </unfence>
      </clusternode>
      <clusternode name="node-02" votes="1" nodeid="2">
        <fence>
          <method name="scsi">
            <device name="scsi_dev" key="2"/>
          </method>
        </fence>
        <unfence>
          <device name="scsi_dev" key="2" action="on"/>
        </unfence>
      </clusternode>
      <clusternode name="node-03" votes="1" nodeid="3">
        <fence>
          <method name="scsi">
            <device name="scsi_dev" key="3"/>
          </method>
        </fence>
        <unfence>
          <device name="scsi_dev" key="3" action="on"/>
        </unfence>
      </clusternode>
    </clusternodes>
    <fencedevices>
      <fencedevice agent="fence_scsi" name="scsi_dev" aptpl="1" devices="/dev/disk/by-id/scsi-360019b9000c109fe00000fd44a7824d1, /dev/disk/by-id/scsi-360019b9000c109fe00000fd64a78250f, /dev/disk/by-id/scsi-360019b9000c109fe00000fd84a78254b" logfile="/var/log/cluster/fence_scsi.log"/>
    </fencedevices>
    </cluster>

Below is a sample configuration for a three-node cluster that uses SCSI persistent reservations as its fencing method. In this example, each node's key value is generated using cluster_id and nodeid. Devices are discovered by finding all devices that exist in LVM volume groups that have the cluster parameter set. As above, the opening cluster element is not shown.

    <clusternodes>
      <clusternode name="node-01" votes="1" nodeid="1">
        <fence>
          <method name="scsi">
            <device name="scsi_dev"/>
          </method>
        </fence>
        <unfence>
          <device name="scsi_dev" action="on"/>
        </unfence>
      </clusternode>
      <clusternode name="node-02" votes="1" nodeid="2">
        <fence>
          <method name="scsi">
            <device name="scsi_dev"/>
          </method>
        </fence>
        <unfence>
          <device name="scsi_dev" action="on"/>
        </unfence>
      </clusternode>
      <clusternode name="node-03" votes="1" nodeid="3">
        <fence>
          <method name="scsi">
            <device name="scsi_dev"/>
          </method>
        </fence>
        <unfence>
          <device name="scsi_dev" action="on"/>
        </unfence>
      </clusternode>
    </clusternodes>
    <fencedevices>
      <fencedevice agent="fence_scsi" name="scsi_dev" logfile="/var/log/cluster/fence_scsi.log"/>
    </fencedevices>
    </cluster>

REFERENCES
For more information regarding the sg3_utils package, see http://sg.danny.cz/sg/sg3_utils.html.

Copyright 2011 Red Hat, Inc. Red Hat, Red Hat Linux, the Red Hat Shadowman logo, and the products listed are trademarks of Red Hat, Inc., registered in the U.S. and other countries. Linux is the registered trademark of Linus Torvalds in the U.S. and other countries.

www.redhat.com
