Interoperability of Fibre Channel products has also been encouraged by the Fibre
Channel Industry Association (FCIA) and the Storage Networking Industry Association
(SNIA).
GBICs are removable modules and so provide flexibility for configuring either optical or
copper cabling.
The selection of GBICs for SAN interconnection is as important as the selection of the
proper switch or hub, because without signal integrity at the link level, no higher
functions can be performed.
GBICs are manufactured to a de facto standard form factor, and that enables use of
various vendors' transceivers in the same device chassis.
First-generation optical GBICs used edge-emitting CD lasers—that is, laser components normally
used in consumer CD products.
For high-speed data transport, only the highest-quality CD lasers can be used.
Wafer manufacturing techniques for CD lasers require separation of the discrete components for
testing to ensure quality, and that contributes to manufacturing overhead and cost.
In addition, compared with other kinds of lasers, CD lasers consume more power, radiate more heat,
and are prone to losing calibration over time.
These inherent problems of CD technology have encouraged the development of alternative laser
products—in particular, Vertical Cavity Surface Emitting Lasers, or VCSELs.
VCSELs also consume less power, radiate less heat, and maintain calibration better than
their CD cousins.
Because VCSELs can be tested at the wafer level, as opposed to the discrete-component level,
manufacturing and quality testing can be performed more efficiently.
GBIC manufacturers have converted from CD lasers to VCSELs, and that helps to
ensure that these physical-layer components will maintain calibration and therefore data
integrity at the system level.
This implies that the hub, switch, or HBA can query the GBIC and report its findings to a
management application.
Concurrent with the development of 2Gbps Fibre Channel, vendors have achieved higher
port density in the same chassis footprint by substituting new small form factor
transceivers in place of the traditional GBIC.
Small form factor transceivers provide the same functionality as GBICs but in a more
compact package.
Small form factor transceivers may be fixed (permanently attached to the device port).
The pluggable variety is typically referred to as SFP (small form factor pluggable).
HBAs are available for various bus types and various physical connections to the
transport.
Most commonly employed are HBAs with Peripheral Component Interconnect (PCI) bus
interfaces and shortwave fiber-optic transceivers.
HBAs are supplied with software drivers to support various operating systems and
upper-layer protocols as well as support for private loop, public loop, and fabric
topologies.
Although most HBAs have a single transceiver for connection to the SAN, some dual-
ported and even quad-ported HBAs exist.
Multi-ported HBAs save bus slots by aggregating N_Ports, but they also pose a potential
single point of failure should the HBA hang.
Most HBAs offer a single Fibre Channel port, requiring that you install additional HBAs
if you desire multiple links to the same or different SAN segment.
Fig: Host bus adapter functional diagram
For fiber optics, this connector may be a standard GBIC, a fixed transceiver, or a small
form factor transceiver.
For copper interfaces, the connector may be a DB-9 with four active wires or the high-
speed serial direct connect (HSSDC) form factor.
Behind the link interface, clock and data recovery (CDR) circuitry,
serializing/deserializing functions, and an elasticity buffer and retiming circuit enable the
receipt and transmission of gigabit serial data.
The FC-1 transmission protocol requirements are met with on-board 8b/10b encoding
logic for outbound data, and decoding and error-monitoring logic for incoming data.
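The error-monitoring side of FC-1 can be illustrated with a small sketch. The Python fragment below is illustrative only (real HBAs do this in silicon, and it checks disparity for the whole 10-bit code group rather than per 6-bit/4-bit sub-block as a true 8b/10b receiver would): it flags groups whose disparity is impossible or runs against the current running disparity.

    # Simplified 8b/10b disparity monitoring (illustrative only).
    # Every valid 10-bit code group has six, five, or four ones,
    # giving a group disparity of +2, 0, or -2; nonzero disparity
    # must always oppose the current running disparity (RD).

    def group_disparity(code10):
        ones = bin(code10 & 0x3FF).count("1")
        return ones - (10 - ones)

    def monitor(groups):
        rd = -1                          # links initialize to negative RD
        for g in groups:
            d = group_disparity(g)
            if d not in (-2, 0, 2):
                yield g, True            # impossible in valid 8b/10b
            elif (rd, d) in ((-1, -2), (+1, +2)):
                yield g, True            # disparity ran the wrong way
            else:
                if d:
                    rd = +1 if d > 0 else -1
                yield g, False           # group passes the disparity check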
For loop-capable HBAs, the FC-1 functions must be followed by a loop port state
machine (LPSM) circuit, typically included with other features in a single chip.
The HBA provides the signaling protocol for frame segmentation and reassembly, class
of service, and credit algorithms as well as link services for fabric port login required by
FC-2.
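The credit algorithm that FC-2 requires can be sketched compactly. This toy Python model (all names invented for the sketch) shows buffer-to-buffer credit: the transmitter spends one credit per frame and may continue only while credits remain, and each R_RDY returned by the receiver restores one.

    # Toy model of buffer-to-buffer (BB_Credit) flow control.
    class BBCreditPort:
        def __init__(self, bb_credit):
            self.credits = bb_credit     # credit value granted at login

        def send_frame(self, frame, wire):
            if self.credits == 0:
                raise RuntimeError("no credit: wait for R_RDY")
            self.credits -= 1            # one credit consumed per frame
            wire.append(frame)

        def on_r_rdy(self):
            self.credits += 1            # receiver freed a buffer

    wire = []
    port = BBCreditPort(bb_credit=2)
    port.send_frame("frame-1", wire)
    port.send_frame("frame-2", wire)     # credits now exhausted
    port.on_r_rdy()                      # R_RDY restores one credit
    port.send_frame("frame-3", wire)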
At the FC-4 upper-layer protocol mapping level, most HBAs provide SCSI-3 software
drivers for NT, UNIX, Solaris, or Macintosh operating systems.
The number of functions consolidated into one or more chips is vendor-dependent, but
current HBAs are ASIC-based and collapse most functions into an integrated architecture.
This is an important feature, because compatibility issues and the need for microcode
fixes are facts of life for all network products.
Having a means to upgrade microcode via a software utility is very useful and extends
the life of the product.
In most cases, installing new microcode or device drivers requires taking an HBA off
line, and that is additional incentive for redundant configurations in high-availability
networks.
The SCSI-3 device driver supplied by the HBA vendor is responsible for mapping Fibre
Channel storage resources to the SCSI bus/target/LUN triad required by the operating
system (OS).
These SCSI address assignments may be configured by OS utilities or by a graphical
interface supplied by the manufacturer (or both).
Because Fibre Channel addresses are self-configuring, the mapping between port
addresses or AL_PAs (which may change) and the upper-layer SCSI device designations
must be maintained by the device driver.
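A minimal sketch of such persistent binding, with invented WWNs and addresses, might key the OS-visible SCSI triad on the unchanging WWN rather than on the volatile AL_PA:

    # Persistent binding sketch: WWN -> (bus, target, LUN).
    binding = {
        "50:06:0e:80:00:c3:01:00": (0, 0, 0),
        "50:06:0e:80:00:c3:01:01": (0, 1, 0),
    }

    def resolve(wwn, current_al_pa):
        # The AL_PA may change across loop initializations; keying on
        # the WWN keeps the SCSI address stable for the OS.
        bus, target, lun = binding[wwn]
        return {"bus": bus, "target": target, "lun": lun,
                "al_pa": current_al_pa}

    print(resolve("50:06:0e:80:00:c3:01:00", current_al_pa=0xEF))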
Device drivers for IP over Fibre Channel must perform a similar function via the
Address Resolution Protocol (ARP).
Most configurations assume that all IP-attached devices reside on the same IP subnet
and that no IP router engine exists in the fabric to which it is connected.
If a SAN design requires concurrent use of IP and SCSI-3, some vendors require a
separate card for each protocol.
Some host bus adapters offer add-on features such as HBA-based RAID, which offloads
the task of striping of data across multiple drives from the server's CPU.
RAID standards define methods for storing data to multiple disks and imply intelligence
in the form of a RAID controller.
When a server is connected to a single disk drive, reads or writes of multiple data blocks are
limited by the buffering capability and rotation speed of the disk.
Striping alone is called RAID level 0; although it boosts performance, it does not provide data security.
If a single disk fails, data cannot be reconstructed from the survivors.
RAID level 5 addresses this problem by striping block-level parity information across
the drives.
RAID level 1 achieves full data security by sacrificing the performance gain of striping
in favor of simple disk mirroring.
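The parity arithmetic behind RAID 5 is plain XOR, as the toy sketch below shows (real controllers rotate parity across the drives and operate on whole stripes; this fragment demonstrates only the reconstruction principle):

    # Toy RAID-5 parity: the parity block is the XOR of the data
    # blocks in a stripe, so any single lost block can be rebuilt.
    from functools import reduce

    def xor_blocks(blocks):
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

    stripe = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks on three drives
    parity = xor_blocks(stripe)            # stored on a fourth drive

    # Drive holding stripe[1] fails: rebuild from survivors + parity.
    rebuilt = xor_blocks([stripe[0], stripe[2], parity])
    assert rebuilt == stripe[1]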
RAID, which provides redundancy and speed, can be implemented by a dedicated RAID
controller housed in the same enclosure as the disks, or by a RAID controller provisioned
in the host system or file server.
Data is passed from the operating system to the RAID controller, which then manages
the striping or mirroring task.
Fibre Channel-attached RAID enclosures use an integrated RAID controller that sits
behind the Fibre Channel interface.
The RAID controller appears as an N_Port or NL_Port to the outside world but can use a
proprietary bus, a SCSI bus, or other architecture to talk to its drives.
RAID manufacturers can also incorporate Fibre Channel technology behind the RAID
controller.
The use of arbitrated loop between the RAID controller and its Fibre Channel disks is
invisible to the SAN, because the RAID controller still appears as a single N_Port or
NL_Port to the rest of the topology.
The RAID controller and all its storage appear as a single N_Port on its own 100MBps
segment, so you can attach more than a terabyte of storage without consuming all
available AL_PAs.
Fig: JBOD disk configuration with primary and secondary loop access
When the JBOD's interface is connected to a Fibre Channel hub or switch, the connection is
not to a single Fibre Channel device but to multiple independent loop devices within the
enclosure.
If the connection is made to a switch, the switch port must be an FL_Port because the
downstream enclosure is actually a loop segment.
If the connection is to an arbitrated loop hub, the population of the entire loop is
increased by the number of drives in the JBOD.
JBODs may also include options for configuring the backplane to support dual loops to a
single set of drives.
At the SCSI-3 level, commands queue to the targets and await response from each one
before more frames can be sent.
JBOD enclosures are typically marketed with eight to ten drive bays, some of which may
be configured for failover.
Some very large disk arrays are packaged in 19-inch rack form factors, with more than
20 disks per JBOD module and as many as four modules per rack enclosure.
These high-end systems may include rack-mounted arbitrated loop hubs or switches for
connecting JBOD modules to one another and to servers for single- or dual-loop
configurations.
Given the amount of customer data that is stored on such arrays, it is preferable to use
managed hubs or fabric switches rather than unmanaged interconnects.
Dual power supplies, redundant fans, and SES management options allow JBODs to be
used in high-availability environments.
You can begin with a partially populated JBOD enclosure and add disks as storage needs
dictate.
A RAID controller option can then be added to increase performance and offload
software RAID tasks from the host.
Hubs are available from a number of vendors in various port configurations, interface
types, and levels of management.
Unlike other Fibre Channel devices, a hub is a passive participant in the SAN topology.
Arbitrated loops can be wired into a ring by connecting the transmit lead of one device to the
receive lead of another, extending this scheme until all devices are physically
configured into a loop.
Loop hubs simplify the cable plant by concentrating physical links at a central
location, and they minimize disruption by providing bypass circuitry at each port.
Fig: Wiring concentration with an arbitrated loop hub
Hubs are usually co-located with storage arrays or servers in 19-inch equipment racks,
and that offers a further convenience for verifying status and cabling within the
enclosure.
Large storage arrays may have multiple JBODs or RAIDs and multiple loop hubs
configured into separate or redundant loops within a single 19-inch enclosure.
Hub design varies from vendor to vendor, but all hubs incorporate basic features
specific to arbitrated loop.
A hub embodies the loop topology by completing circuits through each port and
then joining the transmit of the last port to the receive of the first.
If the signaling is too fast or too slow, the port will remain in bypass mode.
A port in bypass mode shunts the bit stream it has received from its upstream
neighbor directly to its immediate downstream neighbor.
Some vendor implementations turn off the transmitter as long as the port is in
bypass mode.
The transmitter is then enabled only when the hub port receives valid signal
from a newly attached device.
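A hub port's insertion decision can be modeled in a few lines. The thresholds and names below are invented for the sketch; the point is only that valid signaling at the proper clock rate inserts the port, and anything else leaves it bypassed:

    # Illustrative hub-port bypass logic.
    NOMINAL_GBPS = 1.0625                # 1Gbps Fibre Channel line rate
    TOLERANCE = 0.0001                   # illustrative clock tolerance

    def port_state(signal_present, measured_gbps):
        if not signal_present:
            return "bypass"              # no device, or device powered off
        if abs(measured_gbps - NOMINAL_GBPS) > NOMINAL_GBPS * TOLERANCE:
            return "bypass"              # too fast or too slow
        return "inserted"                # valid signal: join the loop

    assert port_state(False, 0.0) == "bypass"
    assert port_state(True, 1.0625) == "inserted"
    assert port_state(True, 0.9) == "bypass"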
Loop hubs may offer one or more light-emitting diodes (LEDs) per port to
display port status.
Typically a green LED indicates a link connection, and an amber LED indicates that
the port is in bypass mode.
Some implementations use a single, multicolored LED per port, thereby reducing
the number of diagnostic display states available.
Port LEDs give the operator an at-a-glance status of a device's connection state,
and by simplifying troubleshooting they help reduce down time.
Various hub products are available that are engineered to different marketing
requirements, from simple entry-level unmanaged hubs to hubs with advanced diagnostics
on-board.
If only a single server and one or two disk arrays are involved, there is less
exposure to prolonged down time if a cable or other component fails.
Unmanaged hubs typically provide port bypass circuitry based on valid signaling alone,
and port LEDs to display insertion or bypass status.
If an attached device is unplugged or powered off, an unmanaged hub will auto-
bypass the port and light the appropriate port LEDs.
Without hub management to report the backup loop failure, the redundancy
would no longer be in force, and the failure of the primary loop would bring
the system down.
Troubleshooting would then be complicated by the fact that what appeared to
be a single occurrence—system failure— was actually the result of two
separate events on two separate loop topologies.
Unmanaged hubs are a logical choice for low-cost storage network solutions in
which economical servers and small JBODs are used to meet budget restraints.
These entry-level systems bring SAN capability within reach of small business
and departmental applications.
Managed Hubs:
At the low end of the managed hub offering, basic hub status and port controls
are available via Web browser, Telnet, or SNMP (Simple Network
Management Protocol) management software.
These functions are the minimum requirement for hub management, although
some implementations concentrate on port status alone.
For these circuits to be useful, the hub must be able to report to and
accept commands from an external management workstation.
This is typically accomplished via an Ethernet port on the hub, over which
SNMP queries and commands are sent from an NT or UNIX console.
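As a sketch, the management station could poll such a hub with any standard SNMP library. The fragment below uses pysnmp (assuming it is installed) to read the standard MIB-II sysDescr object; the hub address and community string are placeholders, and real port-status objects would live in the vendor's private MIB:

    # Minimal SNMP GET against a managed hub (pysnmp, SNMPv2c).
    from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                              ContextData, ObjectType, ObjectIdentity, getCmd)

    error_ind, error_status, error_idx, var_binds = next(getCmd(
        SnmpEngine(),
        CommunityData("public"),                  # placeholder community
        UdpTransportTarget(("192.0.2.10", 161)),  # placeholder hub address
        ContextData(),
        ObjectType(ObjectIdentity("SNMPv2-MIB", "sysDescr", 0)),
    ))

    if error_ind or error_status:
        print("query failed:", error_ind or error_status.prettyPrint())
    else:
        for name, value in var_binds:
            print(name.prettyPrint(), "=", value.prettyPrint())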
The application software used to manage a hub is provided by the hub vendor,
either as a stand-alone program or as a utility that can be launched from more
comprehensive SAN management applications.
Managed hub products that support SNMP or Web browsers can be managed
from anywhere in an IP routed network.
Fig : Managed SAN with remote SNMP console
The event log may be queried from the hub using Telnet or SNMP, or it may be
replicated on the management workstation for a permanent record of activity.
Switching hubs provide the simplicity of arbitrated loop with the high-performance
bandwidth of switches.
Switching hubs are not fabric capable, however, and so do not support fabric login, SNS, or state
change notification.
This reduces the requirements and additional cost associated with fabric switches.
For configurations that require high bandwidth but no more than 126 total devices, switching
hubs offer a reasonable price and performance solution.
Switching hubs typically provide 6 to 12 ports, each of which supports 1Gbps or 2Gbps
throughput.
The attached loop nodes are configured into one virtual loop composed of multiple loop
segments.
Switching hubs may support SNMP, SCSI Enclosure Services, or other management features.
Depending on the vendor's design, some products offer advanced diagnostic features,
including the ability to capture, via a management interface, data traffic on any
port without disrupting the topology.
Cascading fabrics via expansion ports (E_Ports) allows small and medium
configurations to expand as SAN requirements grow, but the interswitch links may become
potential bottlenecks for fabric-to-fabric communication.
Adding parallel interswitch links resolves the congestion issue but may introduce another
problem if the source and destination N_Ports on either side receive out-of-order frames.
In addition to bandwidth, switch-to-switch latency may limit the number of switches in a
path.
The standards do not define how, specifically, these port types are to be implemented in
hardware, so vendor designs may differ.
Some products offer a modular approach, with separate port cards for each port type.
Others provide ports that can be configured via management software or auto-
configuration to support any port type.
The latter offers more flexibility than the others for changing SAN topologies,
permitting redistribution or addition of devices with minimal disruption.
All fabric switches support some variation of zoning. Port zoning allows a port to be
assigned to an exclusive group of other ports.
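Port zoning reduces to simple set membership, as in this toy model (zone names and port numbers are invented for the example); traffic passes only if source and destination ports share a zone:

    # Toy model of port zoning on a fabric switch.
    zones = {
        "zone_nt_backup": {1, 2, 5},     # e.g., NT server, disk, tape
        "zone_unix_prod": {3, 4},        # e.g., UNIX server and its array
    }

    def can_communicate(src_port, dst_port):
        return any(src_port in members and dst_port in members
                   for members in zones.values())

    assert can_communicate(1, 5)         # same zone: allowed
    assert not can_communicate(1, 3)     # no shared zone: blocked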
Fabric management graphical interfaces may include topology mapping, enclosure and
port statistics, routing information, and port performance graphing.
Departmental fabric switches may have redundant power supplies and swappable
fans, but they do not provide the high availability features of director-class fabric
switches.
A vendor data sheet for a departmental switch may declare support for as many as
239 switches in a fabric, but practical guidelines typically call for no more than 32
switches, with no more than 7 switch-to-switch hops in any path.
Directors may provide 64 to 128 or more ports (256 ports for some announced
products) and so present a streamlined solution for the storage requirements of large
data centers.
Fibre Channel director architecture implies high availability for every component,
including redundant processors, routing engines, backplanes, and hot-swappable port
cards.
Trunked interswitch links between directors can reduce potential blocking, and the
port fan-out supplied by departmental switches increases the total population that can
be reasonably supported.
Fig: Combining directors and departmental switches in a fabric
Fibre Channel-to-SCSI bridges normally provide one or two Fibre Channel interfaces
for SAN attachment, and two to four SCSI ports for SCSI disk arrays or tape backup
subsystems.
In addition to this physical and transport conversion, the Fibre Channel-to-SCSI bridge
translates serial SCSI-3 protocol to the appropriate SCSI protocol required by the legacy
devices.
The Fibre Channel-to-SCSI products, however, do not actually route data at layer 3;
instead, they simply convert one form of SCSI to another. In other words, they provide a
bridging function.
The most common application for Fibre Channel-to-SCSI bridges is to support legacy
tape backup subsystems.
Fibre Channel-to-SCSI bridges may therefore both preserve your investment in tape
subsystem hardware and satisfy the need to optimize the tape backup process itself.
Fig: A Fibre Channel-to-SCSI bridge supporting SCSI-attached tape
Because the tape subsystem is now addressable by any server on the storage network, all
servers can share what was previously a dedicated resource.
Dark fiber is any unused optical pair of an installed cable run. In contrast, lit fiber
is already carrying Ethernet, ATM, or some other transport.
DWDM can carry any protocol but has become linked to Fibre Channel extension
because of its ability to drive longer distances more efficiently than traditional long
wave optics and single-mode cable.
Fibre Channel fabric switches were originally designed for data center
applications within a fairly narrow circumference (500m with multimode cabling
and shortwave optics).
Fibre Channel extension with DWDM can be used for either switch-to-switch or
switch-to-node connectivity.
From the standpoint of the fabric switch, the intervening DWDM equipment and
long haul cable plant are transparent.
As long as the link is stable, this stretched E_Port connection presents no major
difficulties. If the link fails, however, the effect is the same as pulling the cable between
two adjacent fabric switches.
Each would undergo fabric reconfiguration, with all storage conversations suspended
until each fabric stabilized.
More common and more affordable IP network services are an attractive option
for Fibre Channel extension.
Alternatively, two native IP storage protocols (iFCP and iSCSI) can also be used to
extend Fibre Channel-originated traffic.
An FCIP device typically attaches by E_Port connection to the source Fibre Channel
switch.
Fibre Channel frames destined for the remote end are wrapped in IP datagrams and
sent across the IP network.
At the receiving end, the IP datagrams are stripped off, and the original Fibre
Channel frames are delivered to the E_Port of the receiving fabric switch for routing.
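The tunneling operation can be illustrated with a toy encapsulation routine. The 8-byte header below is invented for the sketch and is not the real FCIP encapsulation defined in RFC 3821; it shows only the wrap-and-unwrap principle, with the payload carried between the gateways over TCP:

    # Simplified FCIP-style wrap/unwrap (NOT the RFC 3821 format).
    import struct

    MAGIC = 0x46434950                   # toy header tag

    def encapsulate(fc_frame):
        return struct.pack("!II", MAGIC, len(fc_frame)) + fc_frame

    def decapsulate(datagram):
        magic, length = struct.unpack("!II", datagram[:8])
        assert magic == MAGIC            # sanity check on the toy header
        return datagram[8:8 + length]    # original frame, handed to E_Port

    frame = b"\x22\x00\x01\x02"          # placeholder FC frame bytes
    assert decapsulate(encapsulate(frame)) == frame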
FCIP tunneling is suitable for applications that require connectivity only between
two sites.
A pair of FCIP devices is therefore required for each remote link, and each remote
site assumes the existence of a fabric switch in addition to end nodes.
A Fibre Channel WAN bridge transports Fibre Channel traffic over non-Fibre
Channel topologies such as ATM.
Connection to a local fabric is via a B_Port (bridge port) interface on the bridge.
WAN bridging between two Fibre Channel SANs may result in a single fabric
spread over distance.
The WAN bridges pass E_Port traffic between fabric switches, and a common
address space with a unique Domain_ID is established for the dispersed SAN.
The WAN bridge can engage in E_Port behavior with its local fabric switch, creating
an autonomous Fibre Channel region.
High-availability storage access via server clustering, rationalized data backup and
restore, disk-to-disk data replication, and file sharing utilities—all these are predicated on
a SAN infrastructure that provides peer-to-peer connectivity.
Server Clustering:
Server clustering involves several functional components, including failover, load
sharing, and common data access.
When each server is tied to its own storage via parallel SCSI cabling, the resources
of multiple servers cannot be efficiently combined.
For servers on the same SAN to be clustered, however, cluster software must be run
on each server to coordinate and monitor server-to-server communications.
Fig: Dual-pathed four-server cluster on a SAN
Failover strategies are vendor-specific but generally require mutual monitoring of server
status via a heartbeat protocol.
The heartbeat protocol sends keepalive messages between the servers as well as
notification of status or required action.
The heartbeat can be run in-band over the Fibre Channel or Gigabit Ethernet SAN
infrastructure, or out-of-band over a Fast Ethernet connection.
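A heartbeat monitor reduces to tracking the time since the last keepalive, as in this invented sketch (the interval, miss limit, and failover hook are assumptions, not any particular vendor's values):

    # Sketch of a cluster heartbeat monitor.
    import time

    HEARTBEAT_INTERVAL = 1.0             # seconds between keepalives
    MISSED_LIMIT = 3                     # intervals tolerated before failover

    class PeerMonitor:
        def __init__(self):
            self.last_seen = time.monotonic()

        def on_keepalive(self):          # called when a message arrives
            self.last_seen = time.monotonic()

        def peer_failed(self):           # polled on a timer
            silent = time.monotonic() - self.last_seen
            return silent > MISSED_LIMIT * HEARTBEAT_INTERVAL

    monitor = PeerMonitor()
    monitor.on_keepalive()
    if monitor.peer_failed():
        print("peer silent: initiate failover, remap its resources")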
Other servers in the cluster assume the tasks of an individual server that has suffered a
failure.
If multiple servers are running the same application, you need additional middleware to
ensure that servers do not overwrite shared data.
Tape backup is a universal requirement, and a universal problem, for all data
networks, regardless of the specific topology employed.
Data security via backup is not only desirable but also sometimes mandated by law
(for example, in finance and banking operations).
You cannot accommodate this backup window via the production LAN without
increasing the number of switch links to accommodate block storage as well as user
bandwidth requirements.
A tape subsystem is attached to a backup server, which in turn sits on the LAN.
The backup server instructs each server to initiate a backup, with the data sent over
the LAN from server to backup server.
This type of backup involves multiple conversions. Upon launching a backup, the
target server must read blocks of SCSI data from disk, assemble the blocks into
files, and packetize the files for transfer over the LAN.
At the backup server, the inbound packets must be rebuilt into files, and the files, in
turn, are disassembled into blocks to be written to tape.
Therefore, the original data blocks that reside on the target storage undergo four
steps of conversion before reappearing at the destination as blocks: block-to-file on
the target server, file-to-packet for the LAN transfer, packet-to-file at the backup
server, and file-to-block for the write to tape.
Both the server and the backup server must devote considerable CPU cycles to
SCSI as well as network protocol overhead.
SAN-attached storage and tape offer efficiencies for tape backup by eliminating
the block-to-file conversion overhead.
For IP SANs, the block tape backup may be over a separate Gigabit Ethernet
switched network, or over a VLAN within a larger Gigabit Ethernet complex.
In both cases, by leveraging block-based SCSI transfer, you allow more data to be
transported in less time and with less overhead.
Placing servers, storage, and tape on a peer-to-peer network also enables new
backup applications such as server-free tape backup.
In addition, regular backup operations may repeatedly copy data that is unchanged
over time, and that adds to the volume and duration of the backup process.
As the volume of storage data grows, the task of securely backing up data in a
reasonable time frame becomes increasingly difficult.
Backup application software and tape vendors are trying to meet this challenge.
An enterprise should have a synchronized copy of data available and should be able
to access that copy immediately if the primary storage fails.
This is the goal of data replication, which uses disk mirroring algorithms to
duplicate data from one disk array to another.
In this case, the disk arrays must be both targets and initiators, receiving data to be
written while managing write operations to the secondary storage.
Data replication normally implies distance, with primary and secondary storage
arrays separated by at least metropolitan area distance.
Consequently, data replication must define how data mirroring will be accomplished
at the array level and how wide areas will be spanned.
Fig: Active-passive (top) and active-active data replication
A data replication configuration has primary and secondary storage arrays.
In the event of failure of the primary, the secondary can be accessed directly.
For companies with multiple data centers or sites, an active-active
configuration enables each site to serve as both primary for local access and
secondary for another site.
Regional centers can thus serve as mutual data replication sites for each other,
ensuring that a readily accessible copy of each site's data is always available.
In synchronous mode, the write operation is not final until both arrays have
signaled write completion.
This guarantees that an exact copy of data is now on both arrays, although at a
performance penalty.
In asynchronous mode, the primary array can buffer writes intended for the
secondary and initiate them only during idle periods.
This improves performance but may result in loss of a true copy if the write to the
secondary subsequently fails.
The primary array would then be forced to break the mirror to the secondary, and
possibly track changes in data until the secondary recovers.
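The behavioral difference between the two modes can be captured in a toy model (all names invented): synchronous writes complete on both arrays before the host sees success, while asynchronous writes queue the secondary copy for an idle-period flush, during which the two copies can diverge.

    # Toy contrast of synchronous vs. asynchronous mirroring.
    class MirroredArray:
        def __init__(self):
            self.primary, self.secondary, self.pending = [], [], []

        def write_sync(self, block):
            self.primary.append(block)
            self.secondary.append(block)  # must complete before...
            return "complete"             # ...the host sees success

        def write_async(self, block):
            self.primary.append(block)
            self.pending.append(block)    # queued for an idle period
            return "complete"             # host sees success at once

        def flush(self):                  # idle-period copy to secondary
            self.secondary.extend(self.pending)
            self.pending.clear()

    arrays = MirroredArray()
    arrays.write_async("block-1")         # copies diverge here...
    arrays.flush()                        # ...until the flush completes
    assert arrays.primary == arrays.secondary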
Data replication can be performed within the local data center to provide a current
and readily accessible copy of data, or it can be extended over distance to facilitate
disaster recovery and business continuance scenarios.
In both synchronous and asynchronous implementations, the stability of the link
between primary and backup disk arrays is critical, as is the latency that would
naturally occur over very long haul links.
In fact, most SAN deployments are configured for a shared-nothing environment,
in which individual servers are assigned separate LUNs on the storage target and
each server manages its own data.
In a server clustering scheme, the LUNs previously assigned to a failed server can
be mapped to active servers so that data access can continue.
The SAN provides the network that facilitates this deliberate reassignment of
resources, but at any point in time each server has access only to its authorized
LUNs.
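The deliberate reassignment itself is a small bookkeeping step, sketched below with invented server and LUN names; the invariant is that each LUN has exactly one owner at any instant:

    # Toy model of shared-nothing LUN ownership with failover remap.
    lun_owner = {"lun-0": "server-a", "lun-1": "server-b", "lun-2": "server-b"}

    def fail_over(failed, survivor):
        for lun, owner in lun_owner.items():
            if owner == failed:
                lun_owner[lun] = survivor   # survivor takes the LUN over

    fail_over("server-b", "server-a")
    assert all(owner == "server-a" for owner in lun_owner.values())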
To share the storage data itself, a layer of management is required to prevent data corruption.
You must monitor the status of a file or record that is accessible to multiple servers
on the SAN so that you can track changes.
Data sharing is further complicated by the fact that the data is typically dispersed
over multiple storage arrays as a Storage Pool.
The complex of physical storage devices on the SAN must be presented as a single
logical resource on top of which sits a common view of a file system shared by all
servers in a cluster.
A distributed volume manager must thus present a coherent view of the physical
storage resources; a distributed file system presents a uniform view of directories,
subdirectories, and files.
A distributed file system must present a consistent image to the server cluster.
If a new file is created by one server, other servers must be updated immediately.
If multiple servers have the same file open, and if modifications of the file
are permitted, a distributed file system must also be able to notify each server of
pending changes.
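The notification requirement can be sketched as a callback to every other holder of an open file (entirely illustrative; real distributed file system coherence protocols are far richer):

    # Sketch of change notification for a shared file.
    class SharedFile:
        def __init__(self, name):
            self.name, self.holders = name, set()

        def open(self, server):
            self.holders.add(server)

        def modify(self, server, notify):
            for other in self.holders - {server}:
                notify(other, self.name)  # e.g., invalidate cached blocks

    f = SharedFile("/shared/orders.db")
    f.open("server-a"); f.open("server-b")
    f.modify("server-a", lambda srv, name:
             print(f"notify {srv}: {name} changed"))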
Applications that benefit from data sharing range from high-availability server
clusters to processing-intensive application clusters that must digest massive
amounts of data.
Example: Sistina's Global File System (GFS), for large server clusters that
analyze data distributed across the network.
Continued development of such technologies will make distributed file systems and file
sharing more widely available and further enhance the value proposition of SANs.
Problems most often occur during initial installation of the network, because new
equipment is being connected for the first time.
In complex SAN installations, you must install host adapters and load their appropriate
device drivers, lay the cable plant, install and configure switches, position GBICs or
transceivers, and properly deploy and cable disk arrays.
After the various SAN components have been configured, cabled, and powered on, you usually
verify operation by testing a server's ability to access storage.
If a server cannot see part or all of the assigned disks, elementary troubleshooting begins.
This process typically begins with examination of the physical cable plant, port status, and
transceivers and continues through verification of the host system, interconnection, and storage
target.
Port status LEDs should indicate whether an inbound signal is recognized or whether it is
intermittently making and breaking contact.
Depending on the server's operating system, failure to discover targets may be a matter of
the SAN boot sequence.
The configuration utility supplied with the adapter card should indicate current and
allowable addresses, as well as the microcode and device driver versions for the card.
For Fibre Channel fabrics, you can examine the switch SNS table to verify the successful
connection of end devices to the switch.
Depending on vendor equipment, the default zoning configuration may exclude all
devices.
In multi-switch configurations, status of the inter-switch links may also be an issue.
E_Port connectivity may also require that you manually configure switch addresses or
designate which switch will serve as principal switch for the fabric.
Initiators and targets separated by multiple hops may have the appropriate entries in their
respective switch's SNS table but be unable to communicate because of excessive
switch-to-switch latency or failure of SNS updates to propagate through the fabric.
Although SAN vendors have attempted to provide more advanced diagnostic capabilities in their
products, the ability to crack frames and provide protocol decode at multi-gigabit speeds
would add significant cost to any product.
A full diagnostic of a storage network therefore requires protocol analyzers and either in-
house expertise or a contracted support organization with trained personnel.
Even with enormous buffers, it is not possible to capture more than a few seconds of
traffic at 1Gbps and 2Gbps speeds.
You can, however, capture protocol events that last only microseconds or milliseconds.
Typically, an analyzer is used to trace a specific process, such as a fabric login problem.