Hints and tips for a storage area network (SAN)

For problems with a storage area network, there are many possible causes. SAN
problems may be with software on the machine trying to use the device, the
connections to the device, or the device itself.

The first question to ask whenever a SAN problem is encountered is "Has anything
been changed?" Changes anywhere between the machine trying to use the device and
the device itself may be suspect, especially if the device worked prior to a given
change and stopped working after that change.

To better understand the following discussion on diagnosing problems with a SAN,
review the following terminology and typical abbreviations that are used:

Fibre channel
Fibre channel denotes a fibre-optic connection to a device or component.
This is typically abbreviated as FC.
Host bus adapter
A host bus adapter is used by a given machine to access a storage area
network. A host bus adapter is similar in function to a network adapter and
how it provides access for a machine to a local area network or wide area
network. This is typically abbreviated as HBA.
Storage area network
A storage area network is a network of shared devices that can typically be
accessed using fibre. Often, a storage area network is used to share devices
between many different machines. This is typically abbreviated as SAN.

The following should be considered when evaluating problems with a SAN:

Diagnosing a SAN
• Know your SAN configuration
• Supported devices
• Considerations for an HBA
• HBA configuration issues
• FC switch configuration issues
• Verify data gateway port settings
• Verify the SAN configuration between devices
• Monitor the fibre channel link error report

Know your SAN configuration

Understanding the SAN configuration is critical in SAN environments. Various SAN
implementations have limitations or requirements on how the devices are configured
and set up.

The three SAN configurations are:


Point to point
This is the simplest configuration. The devices are connected directly to the
HBA.
Arbitrated loop
Arbitrated loop topologies are ring topologies. They are limited in the number
of devices that are supported on the loop and in the number of devices that
can be in use at a given time: only two devices can communicate at the same
time, which is the main limiting factor of this topology. Data being read from
a device or written to a device is passed from one device on the loop to the
next until it reaches the target device.
Switched fabric
In a switched fabric SAN, all devices in the fabric are native fibre channel
devices. This topology has the greatest bandwidth and flexibility because all
devices are available to all HBAs through some fibre path.
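
As a rough illustration of how the three topologies above differ, the following
Python sketch models how many device pairs can exchange data at the same time.
The names Topology and max_concurrent_pairs are hypothetical and are not part of
TSM or any vendor tool, and the switched fabric figure assumes an idealized
switch with no bandwidth, zoning, or vendor limits.

```python
from enum import Enum


class Topology(Enum):
    POINT_TO_POINT = "point to point"
    ARBITRATED_LOOP = "arbitrated loop"
    SWITCHED_FABRIC = "switched fabric"


def max_concurrent_pairs(topology: Topology, device_count: int) -> int:
    """Return how many device pairs can communicate at the same time.

    Simplified model for illustration only (assumption: ideal hardware).
    """
    if device_count < 2:
        return 0
    if topology is Topology.POINT_TO_POINT:
        # The device is wired directly to the HBA: a single pair.
        return 1
    if topology is Topology.ARBITRATED_LOOP:
        # Only two devices on the loop can communicate at a given time.
        return 1
    # Switched fabric (idealized): disjoint pairs can all talk at once.
    return device_count // 2
```

For example, eight devices on an arbitrated loop still allow only one active
pair, while the same eight devices on an idealized fabric allow four.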

Return to diagnosing a SAN

Supported devices

Many devices or combinations of devices may not be supported in a given SAN.
These limitations arise from the ability of a given vendor to certify their device
using Fibre Channel Protocols.

For a given device, verify with the device vendor that it is supported in a SAN.
This includes verifying that the device is supported by the HBAs used in your SAN
environment, and verifying with the vendors of the hubs, gateways, and switches
that make up the SAN that this device is supported.

Return to diagnosing a SAN

Considerations for a host bus adapter (HBA)

The HBA is a critical device for the proper functioning of a SAN. Problems that
arise relating to HBAs range from improper configuration to an outdated BIOS or
outdated device drivers.

For a given HBA, check the following:

BIOS
Host bus adapters have an embedded BIOS that can be updated. The vendor
for the HBA will have utilities for updating the BIOS in an HBA. Periodically,
the HBAs in use on your SAN should be checked to see if there are BIOS
updates that should be applied.
Device driver
Host bus adapters use device drivers to work with the operating system to
provide connectivity to the SAN. The vendor will typically provide a device
driver for use with their HBA. Similarly, the vendor will provide instructions
and any necessary tools or utilities for updating the device driver.
Periodically, the device driver level should be compared to what is available
from the vendor and, if needed, updated to pick up the latest fixes and
support.
Configuration
Host bus adapters typically have a number of configurable settings. The
settings that typically affect how TSM functions with a SAN device are
described under "HBA configuration issues".

Return to diagnosing a SAN

HBA configuration issues

Host bus adapters typically have many different configuration settings and options.

The vendor for the HBA should provide information about the settings for your HBA
and the appropriate values for these settings. Similarly, the HBA vendor should
provide a utility and other instructions on how to configure your HBA. The settings
that typically affect using TSM with a SAN are:

SAN topology
The HBA should be set appropriately based on the SAN topology being used.
For example, if your SAN is an arbitrated loop, the HBA should be set for this
configuration. If the HBA connects to a switch, this HBA port should be set to
"point to point" and not "loop".

Because TSM SAN device mapping performs SAN discovery on most platforms,
persistent binding of the devices is not required. The TSM server can find
the device even if the device path has changed because of a reboot or for
other reasons.

Go to http://www.ibm.com/support to verify the platform/HBA vendor/driver
level support for TSM SAN discovery.

FC link speed
In many SAN topologies, the SAN can be configured with a maximum speed.
For example, if the FC switch maximum speed is 1 Gb/sec, the HBA should
also be set to this value. Alternatively, the HBA should be set for automatic
(AUTO) negotiation if the HBA supports this capability.
Is fibre channel tape support enabled?
TSM requires that an HBA is configured with tape support. TSM typically
uses SANs for access to tape drives and libraries. As such, the HBA setting to
support tapes must be enabled.
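
The three settings above can be checked together. The following Python sketch
is one way to do that, assuming the HBA settings have already been read from the
vendor utility into a dictionary. The function name, the dictionary keys
(topology, link_speed, tape_support), and the speed strings are hypothetical;
real HBA utilities use their own names and values.

```python
def check_hba_settings(hba: dict, san_topology: str, switch_max_speed: str) -> list:
    """Return a list of problems found; an empty list means no mismatch.

    All keys and values are illustrative assumptions, not vendor names.
    """
    problems = []

    # SAN topology: "loop" for an arbitrated loop, otherwise "point-to-point"
    # (for example, when the HBA connects to a switch).
    expected_mode = "loop" if san_topology == "arbitrated loop" else "point-to-point"
    actual_mode = hba.get("topology")
    if actual_mode != expected_mode:
        problems.append(f"topology is {actual_mode!r}, expected {expected_mode!r}")

    # FC link speed: must equal the switch maximum or be left on AUTO.
    speed = hba.get("link_speed")
    if speed not in ("AUTO", switch_max_speed):
        problems.append(f"link speed is {speed!r}, expected 'AUTO' or {switch_max_speed!r}")

    # Tape support: TSM requires it to be enabled on the HBA.
    if not hba.get("tape_support", False):
        problems.append("fibre channel tape support is not enabled")

    return problems
```

An HBA described as {"topology": "loop", "link_speed": "2Gb", "tape_support":
False} that is attached to a 1 Gb/sec switch would produce three problems, one
for each setting.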

Return to diagnosing a SAN

FC switch configuration issues

An FC switch typically supports many different configurations. The ports on the
switch need to be configured appropriately for the type of SAN that is set up as
well as the attributes of the SAN.
The vendor for the switch should provide information about the appropriate settings
and configuration based upon the SAN topology being deployed. Similarly, the switch
vendor should provide a utility and other instructions on how to configure it. The
settings that typically affect how TSM uses a switched SAN are:

FC link speed
In many SAN topologies, the SAN can be configured with a maximum speed.
For example, if the FC switch maximum speed is 1 Gb/sec, the HBA should
also be set to this value. Alternatively, the HBA should be set for automatic
(AUTO) negotiation if the HBA supports this capability.
Port mode
The ports on the switch need to be configured appropriately for the type of
SAN topology being implemented. For example, if the SAN is an arbitrated
loop, the port should be set to FL_PORT. As another example, if the HBA is
connected to a switch, the HBA options should be set to point-to-point and
not to loop.
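
A port mode check of this kind can be sketched in Python as follows. The
FL_PORT value for an arbitrated loop comes from the text above; the F_PORT
value for a switched fabric is an assumption based on common fibre channel
port naming, and the function name and the port dictionary are hypothetical.
Check the switch vendor documentation for the modes that your switch actually
offers.

```python
# Expected switch port mode for each SAN topology.
# FL_PORT is taken from the text; F_PORT is an assumption.
EXPECTED_PORT_MODE = {
    "arbitrated loop": "FL_PORT",
    "switched fabric": "F_PORT",
}


def find_misconfigured_ports(ports: dict, topology: str) -> list:
    """Return the names of ports whose mode does not fit the topology.

    `ports` maps a port name to its configured mode, for example
    {"port0": "FL_PORT"}. Raises KeyError for an unknown topology.
    """
    expected = EXPECTED_PORT_MODE[topology]
    return sorted(name for name, mode in ports.items() if mode != expected)
```

For an arbitrated loop, a switch reporting {"port0": "FL_PORT", "port1":
"F_PORT"} would have port1 flagged for review.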

Return to diagnosing a SAN

Verify data gateway port settings

A data gateway in a SAN translates fibre channel to SCSI for SCSI devices attached
to the gateway. Data gateways are popular in SANs because they allow the use of
SCSI devices, so it is important that the port settings for a data gateway are
correct.

The vendor for the data gateway should provide information about the appropriate
settings and configuration based upon the SAN topology being deployed and SCSI
devices being used. Similarly, the vendor should provide a utility and other
instructions on how to configure it. The following settings can be used for the FC port
mode on the connected port on a data gateway:

Private target
Only the SCSI devices attached to the data gateway are visible and usable
from this port. For the available SCSI devices, the gateway simply passes the
frames to a given target device. Private target port settings are typically used
for arbitrated loops.
Private target and initiator
Only the SCSI devices attached to the data gateway are visible and usable
from this port. For the available SCSI devices, the gateway simply passes the
frames to a given target device. As an initiator, this data gateway may also
initiate and manage data movement operations. Specifically, there are
extended SCSI commands that allow for third-party data movement. By
setting a given port as an initiator, it is eligible to be used for third-party data
movement SCSI requests.
Public target
All SCSI devices attached to the data gateway as well as other devices
available from the fabric are visible and usable from this port.
Public target and initiator
All SCSI devices attached to the data gateway as well as other devices
available from the fabric are visible and usable from this port. As an initiator,
this data gateway may also initiate and manage data movement operations.
Specifically, there are extended SCSI commands that allow for third-party
data movement. By setting a given port as an initiator, it is eligible to be used
for third-party data movement SCSI requests.
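
The four port modes above differ in only two respects: whether devices elsewhere
in the fabric are visible from the port, and whether the port can act as an
initiator for third-party data movement. The following Python sketch captures
that as a small enumeration. The class, property, and function names are
hypothetical and do not correspond to any gateway vendor's utility.

```python
from enum import Enum


class GatewayPortMode(Enum):
    # Each value is (sees_fabric_devices, can_initiate).
    PRIVATE_TARGET = (False, False)
    PRIVATE_TARGET_AND_INITIATOR = (False, True)
    PUBLIC_TARGET = (True, False)
    PUBLIC_TARGET_AND_INITIATOR = (True, True)

    @property
    def sees_fabric_devices(self) -> bool:
        """True if devices available from the fabric are visible from this port."""
        return self.value[0]

    @property
    def can_initiate(self) -> bool:
        """True if the port is eligible for third-party data movement requests."""
        return self.value[1]


def pick_mode(needs_fabric_devices: bool, needs_third_party_copy: bool) -> GatewayPortMode:
    """Look up the mode that matches the two requirements."""
    return GatewayPortMode((needs_fabric_devices, needs_third_party_copy))
```

For example, pick_mode(True, False) selects PUBLIC_TARGET, and
pick_mode(False, True) selects PRIVATE_TARGET_AND_INITIATOR.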

Return to diagnosing a SAN

Verify the SAN configuration between devices

Devices in a SAN, such as a data gateway or a switch, typically provide utilities that
display what that device sees on the SAN. It is possible to use these utilities to better
understand and troubleshoot the configuration of your SAN.

The vendor for the data gateway or switch should provide a utility for configuration.
This configuration utility typically shows how the device is configured and what
the device sees of the SAN topology that it is a part of. These vendor utilities
can be used to verify the SAN configuration between devices:

Data gateway
A data gateway should report all the FC devices as well as the SCSI devices
that are available in the SAN.
Switch
A switch should report information about the SAN fabric.
TSM Management Console
The TSM management console will display device names and the paths to
those devices. This can be useful to help verify that the definitions for TSM
match what is actually available.
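
Once the device lists have been collected from these utilities, comparing them
is a simple set operation. The following Python sketch assumes that each device
can be reduced to a single identifier, such as a serial number, that is reported
the same way by TSM and by the gateway or switch. The function name and the
identifiers shown are hypothetical.

```python
def compare_inventories(tsm_defined: set, san_reported: set) -> dict:
    """Compare devices defined to TSM with devices the SAN reports.

    Both arguments are sets of device identifiers (an assumption; use
    whatever identifier both sources report consistently).
    """
    return {
        # Defined to TSM, but the gateway or switch does not see them.
        "defined_but_not_visible": sorted(tsm_defined - san_reported),
        # Seen on the SAN, but TSM has no definition for them.
        "visible_but_not_defined": sorted(san_reported - tsm_defined),
    }
```

For example, comparing {"drive-A", "drive-B"} from TSM with {"drive-B",
"drive-C"} from a switch reports drive-A as defined but not visible and drive-C
as visible but not defined.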

Return to diagnosing a SAN

Monitor the fibre channel link error report

Most SAN devices provide monitoring tools that can be used to report information
about errors and performance statistics.

The vendor for the device should provide a utility for monitoring. If a monitoring
tool is available, it typically reports errors as well. The errors that are often
seen are:

CRC error, 8b/10b code error, and other similar symptoms
These are usually recoverable errors. The error handling for these cases is
usually provided by firmware or hardware. In most cases, the recovery by the
device is to have the failing frame retransmitted. The FC link is still active
when these errors are encountered. Applications using a SAN device that
encounters this type of link error usually are not even aware of the error unless
it is a solid error. A solid error is one where the firmware and hardware
recovery was not able to successfully retransmit the data after repeated
attempts. The recovery for these types of errors is typically very fast and will
not cause system performance to be degraded.
Link failure (loss of signal, loss of synchronization, NOS primitive received)
This indicates that a link is actually "broken" for a period of time. The
likely cause is a faulty gigabit interface converter (GBIC), media interface
adapter (MIA), or cable. The recovery for this type of error is disruptive.
This error is surfaced to the application using the SAN device that
encountered the link failure. The recovery is at the command exchange level
and involves the application and device driver having to perform a reset to
the firmware and hardware. This causes the system to run degraded until the
link recovery is complete. These errors should be monitored closely because
they typically affect multiple SAN devices. Note that these errors are often
caused by a customer engineer (CE) action to replace a SAN device: as part of
the maintenance performed by the CE to replace or repair a SAN device, the
fibre cable is temporarily disconnected. If this is the case, the time and
duration of the error should correspond to
when the service activity was performed.
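
When reviewing a link error report, it can help to separate the two error
classes above and to check whether a link failure coincides with known service
activity. The following Python sketch shows one way to do both. The error names
are taken from the text, but the function names and the format of the
maintenance windows are hypothetical; a real monitoring tool will report errors
in its own format.

```python
from datetime import datetime

# Error names as listed in the text above.
RECOVERABLE = {"CRC error", "8b/10b code error"}
LINK_FAILURE = {"loss of signal", "loss of synchronization", "NOS primitive received"}


def classify(error_type: str) -> str:
    """Return "recoverable", "link failure", or "unknown" for an error name."""
    if error_type in RECOVERABLE:
        return "recoverable"
    if error_type in LINK_FAILURE:
        return "link failure"
    return "unknown"


def explained_by_maintenance(error_time: datetime, windows: list) -> bool:
    """Return True if the error falls inside any (start, end) service window."""
    return any(start <= error_time <= end for start, end in windows)
```

A link failure that falls outside every recorded service window is the case
that deserves closer investigation of the GBIC, MIA, or cable.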

Return to diagnosing a SAN