
Getting to know the SAN stack

James C. McPherson Solaris Datapath Engineering Storage Group Sun Microsystems

Topics to cover (1)


The big picture (SAN driver stack)
Switches and attached devices
HBA, multipathing, target and layered drivers
fp, fctl, fcsm, fcp, fcip
Userland libraries
Userland utilities
Debugging features - mdb

Topics to cover (2)


Debugging features - Solaris CAT
Some useful data structures
Putting it all together
The NWS consolidation
Source code location and structure
References for further reading

The big picture (physical, kernel, userland)


Userland utilities (luxadm | fcinfo | cfgadm)
Userland libraries (libg_fc | liba5k | libHBAAPI | libsun_fc)
Layered drivers: [md (svm)] [vxvm, vxdmp]
Target drivers: sd, st, sgen
Multipathing: scsi_vhci / STMS (mpxio)
fcp, fcsm; streams and dlpi over fcip, iSCSI
fp
fctl (Fibre Channel Transport Layer)
HBAs: QLogic, Emulex, JNI
Switches: Brocade, McData, QLogic
Physical devices: disks, tapes, ....

The big picture (inside the kernel)


userland
Layered storage drivers: md, rdac, vxvm, zfs, vxdmp
SCSI target drivers: sd, st, sgen...
Solaris SCSA framework
scsi_vhci (aka STMS or mpxio)
TCP/IP streams stack and modules, DLPI
fcp, fcsm, fcip, iSCSI
fp
fctl (Fibre Channel Transport Layer)

Switches and Attached Devices


Devices attached to a fibre switch are usually SCSI targets or HBAs (host bus adapters)
SCSI targets can be disks or disk-array luns, tapes, media changers
HBAs can work as targets and initiators for storage (SAN) or networking purposes (IP over FC)
Switches provide a routing service for FibreChannel packets between attached devices and hosts
Switches provide naming, configuration and other utility services

HBA drivers: qlc, emlxs and jfca


Drivers are provided to Sun by the HBA manufacturers for both sparc and x86|x64 architectures:
QLogic: qlc
Emulex: emlxs
AMCC (formerly JNI Corp): jfca (sparc only, EOL)

Sun Storage (group/division/...)'s Solaris Datapath Engineering has the source for each of these drivers for reference purposes, but....
HBA manufacturers do bugfixes and sustaining

HBA drivers: qlc, emlxs and jfca


Sun does not write special fcode or bios for HBAs.
Yes, cards that Sun sells will show up as SUNW,qlc or SUNW,emlxs, BUT that identification is done by QLogic and Emulex at their factories.
If you watch the OBP console output closely at power-on, you'll see something like this:
/pci@7c0/pci@0/pci@8: Device 0 lpfc SUNW,emlxs fp disk lpfc SUNW,emlxs fp disk

The PCI vid/did combinations are the same as appear on linux or MS-Windows

Qlogic Based HBA Support Matrix


Vendor  HBA                         Vendor ID  Device ID  Subsys Vendor ID  Subsys ID
Sun     SG-XPCI1FC-QLC              1077       6322       1077              132
Sun     X6799A                      1077       2200A      1077              4082
Sun     SG-XPCI1FC-QF2 or x6767A    1077       2310       1077              106
Sun     SG-XPCI2FC-QF2 or x6768A    1077       2312       1077              10A
Sun     X6727A                      1077       2200A      1077              4083
Sun     SG-XPCI1FC-QF4              1077       2422       1077              140
Sun     SG-XPCI2FC-QF4              1077       2422       1077              141
Sun     SG-XPCIE1FC-QF4             1077       2432       1077              142
Sun     SG-XPCIE2FC-QF4             1077       2432       1077              143
Qlogic  QCP2340                     1077       2312       1077              109
Qlogic  QCP2342                     1077       2312       1077              10B
Qlogic  QLA200                      1077       6312       1077              119
Qlogic  QLA210                      1077       6322       1077              12F
Qlogic  QLA2310                     1077       2310       1077              106
Qlogic  QLA2310F/QLA2310FL          1077       2310       1077              9
Qlogic  QLA2340/QLA2340L            1077       2312       1077              100
Qlogic  QLA2342/QLA2342L            1077       2312       1077              101
Qlogic  QLA2344/QLA2344-P           1077       (2)2312    1077              102
Qlogic  QLA2440                     1077       2422       1077              145
Qlogic  QLA2460                     1077       2422       1077              133
Qlogic  QLA2462                     1077       2422       1077              134
Qlogic  QLE2360                     1077       2432       1077              117
Qlogic  QLE2362                     1077       2432       1077              118
Qlogic  QLE2440                     1077       2432       1077              147
Qlogic  QLE2460                     1077       2432       1077              137
Qlogic  QLE2462                     1077       2432       1077              138
Qlogic  QSB2340                     1077       2312       1077              104
Qlogic  QSB2342                     1077       2312       1077              105

Minimum Solaris Version or Patch Level


Solaris 8 Solaris 9 Not Supported Solaris 10 x86 Solaris 10 Sparc Not Supported

use the "prtpicl -v" command to find the following

SAN 4.4.8

S10 update 1 or S10 + 119130-13

Not Supported S10 update 1 or S10 + 119131-13 SAN 4.4.8

Not Supported

S10 update 1 or S10 + 119130-13

Not Supported SAN 4.4.8 Not Supported SAN 4.4.8

Not Supported S10 update 1 or S10 + 119130-13 Not Supported S10 update 1 or S10 + 119130-13

Emulex Based HBA Support Matrix


Use the "prtpicl -v" command to find the vendor, device and subsystem IDs.

Vendor  HBA              Vendor ID  Device ID  Subsys Vendor ID  Subsys ID  Model         Code Name
Sun     SG-XPCI1FC-EM2   10df       fc00       10df              fc00       LP10000-S     Rainbow
Sun     SG-XPCI2FC-EM2   10df       fc00       10df              fc00       LP10000DC-S   Rainbow
Sun     SG-XPCIE1FC-EM4  10df       fc20       10df              fc21       LPe11000-S    SummitE
Sun     SG-XPCIE2FC-EM4  10df       fc20       10df              fc22       LPe11002-S    SummitE
Sun     SG-XPCI1FC-EM4   10df       fc10       10df              fc11       LP11000-S     PyramidE
Sun     SG-XPCI2FC-EM4   10df       fc10       10df              fc12       LP11002-S     PyramidE
Emulex  LP10000          10df       fa00       10df              fa00       LP10000
Emulex  LP10000DC        10df       fa00       10df              fa00       LP10000DC
Emulex  LP10000ExDC      10df       fa00       10df              fa00       LP10000ExDC
Emulex  LPe11000         10df       fe00       10df              fe00       LPe11000
Emulex  LPe11002         10df       fe00       10df              fe00       LPe11002
Emulex  LP11000          10df       fd00       10df              fd00       LP11000
Emulex  LP11002          10df       fd00       10df              fd00       LP11002
Emulex  LP9802           10df       f980       10df              f980       LP9802
Emulex  LP9002DC         10df       f900       10df              f900       LP9002DC
Emulex  LP9002L          10df       f900       10df              f900       LP9002L
Emulex  LP9002S          10df       f095       10df              f095       LP9002S

Minimum Solaris/Patch Level (columns: Solaris 8, Solaris 9, Solaris 10 x86, Solaris 10 Sparc), entries as listed for the Sun HBAs: Not Supported; SAN 4.4.6; S10 update 1 or S10 + 120223-04; S10 update 1 or S10 + 120222-04; S10 update 2 or S10 + Patch 102223-06; S10 update 2 or S10 + Patch 120222-06; TBD
Entries as listed for the Emulex HBAs: SAN 4.4.7; S10 update 1 or S10 + 120223-04; S10 update 1 or S10 + 120222-04; N/A

Notes: Model numbers ending in -S are Sun HBAs. Model numbers with no - extension are Emulex HBAs. Model numbers ending with -E are EMC HBAs. All will enumerate under the emlxs driver.

Fibre Channel Fabric Boot


Sparc Boot HBA Minimum Solaris Version or Patch Level X86/X64 Boot UBC Fcode/BIOS Code Name Minimum Solaris Version or Patch Level Minimum Level Solaris 8 Solaris 9 Solaris 10 Solaris 10 Not Supported Prism Amber Fcode: 1.14.11 Amber 2 BIOS: TBD SAN 4.4.2 S10 FCS Crystal 2a Crystal + S10 Update 1 SAN 4.4.8 Pyramid UBC: TBD Fcode: 1.11 BIOS 1.04 Not Supported S10 Update 1 Summit SAN 4.4.7 Not Supported S10 Update 2 TBD UBC:5.01a4 Fcode 1.50a4 BIOS 1.70a3 Rainbow SummitE PyramidE

SG-XPCI1FC-QLC X6799A SG-XPCI1FC-QF2 or x6767A SG-XPCI2FC-QF2 or x6768A X6727A SG-XPCI1FC-QF4 SG-XPCI2FC-QF4 SG-XPCIE1FC-QF4 SG-XPCIE2FC-QF4 SG-XPCI1FC-EM2 SG-XPCI2FC-EM2 SG-XPCIE1FC-EM4 SG-XPCIE2FC-EM4 SG-XPCI1FC-EM4 SG-XPCI2FC-EM4

Target and layered drivers


Multipathing drivers such as scsi_vhci (aka StorEdge Traffic Management Software/STMS/MPxIO/Solaris Multipathing) sit below the target drivers in the kernel.
The fcp driver provides device discovery in the SCSI layer.
Target drivers are sd/ssd (for disks), st (for tapes) and sgen (for generic devices such as tape changers); they sit on top of scsi_vhci/vxdmp if used.
So-called layered drivers provide more functionality on top of the target drivers. The two best examples of these are md (for SVM) and vxvm (for Veritas Volume Manager). The zfs driver is also layered on top of the target drivers.

Multipathing with scsi_vhci


The scsi_vhci driver is a T10 ALUA-compliant multipathing and failover driver.
ALUA is Asymmetric Logical Unit Access, defined in T10 standard SPC-4 (SCSI Primary Commands rev 4).
If your storage is T10 ALUA-compliant and operates in an active/active mode, it should just work with scsi_vhci. Refer to Solaris Fibre Channel and Storage Multipathing, http://docs.sun.com/source/819-0139
Various vendor-specific implementations of ALUA hardware are also supported (typically EMC and Engenio/LSI).

The fp (FC Port) driver

Performs the login to and logout of switches: PLOGI+PLOGO (point-to-point), LOGI+LOGO (fabric)
Handles basic accept and reject: BA_ACC, BA_RJT
Handles the Extended Link Services accept and reject: ELS_ACC, ELS_RJT
Logs in and out of, and queries, the fabric name service
Creates the per-port loop map as required
Passes information about changed (new, old, disconnected) luns to fcp and thence to devfsadmd threads

The fcsm (FC San Management) driver

Implements the FC Management Server (Fabric) configuration commands
Provides relatively-direct access to switch Management Server operations via the ioctl(2) interface
You're unlikely to use this directly

The fctl (FC Transport Layer) driver

Handles orphan ports
Maintains the list of WWNs attached to this port (both local and remote)
Doles out the SFK work through job_requests
Provides utility functions for the rest of the SFK stack

The fcp (FC Protocol - layer4) driver

Does the work of encapsulating SCSI commands inside FC frame structures
When your thread needs SCSA access, fcp does the job
Handles scsi target device discovery
Talks to the NDI (bus nexus) and MDI (multipath driver interface) frameworks for device discovery
Generically, fcp routes SCSI packets to and from targets

The fcip (IP+Arp encapsulated in FC) driver

Does the work of encapsulating IP and ARP packets inside FC frame structures
Maintains the routing table for fcip instances

Userland Libraries

libg_fc and liba5k are used mainly for luxadm (both are sparc only)
libHBAAPI and libsun_fc are used for cfgadm_fp
libima and libsun_ima are the Multipath Management API
libHBAAPI and libima are SNIA source code
libsun_fc and libsun_ima are the vendor-specific plugins that Sun provides for libHBAAPI and libima

Userland Utilities
We've got four:
luxadm (supposed to go away... not soon enough!): luxadm(1M)
cfgadm (with the fp plugin): cfgadm(1M), cfgadm_fp(1M)
fcinfo (new in Solaris 10 update 1): fcinfo(1M)
iscsiadm (new in Solaris 10 update 1): iscsiadm(1M)

Userland Utilities
# fcinfo hba-port
HBA Port WWN: 210000e08b954220
        OS Device Name: /dev/cfg/c2
        Manufacturer: QLogic Corp.
        Model: QLE2462
        Type: N-port
        State: online
        Supported Speeds: 1Gb 2Gb 4Gb
        Current Speed: 4Gb
        Node WWN: 200000e08b954220
HBA Port WWN: 210100e08bb54220
        OS Device Name: /dev/cfg/c3
        Manufacturer: QLogic Corp.
        Model: QLE2462
        Type: N-port
        State: online
        Supported Speeds: 1Gb 2Gb 4Gb
        Current Speed: 2Gb
        Node WWN: 200100e08bb54220
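The "Key: value" layout of this output is easy to post-process. The following is a minimal sketch, not part of any Sun tool: the function names parse_fcinfo_hba_ports and colon_wwn are hypothetical, and it assumes that every record starts with an "HBA Port WWN" line as in the sample above.

```python
def parse_fcinfo_hba_ports(text):
    """Group 'Key: value' lines into one dict per 'HBA Port WWN' record.

    Assumption: each port record begins with an 'HBA Port WWN' line.
    Lines without a colon (such as the command prompt line) are skipped.
    """
    ports = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "HBA Port WWN":
            ports.append({})  # a new port record starts here
        if ports:
            ports[-1][key] = value
    return ports


def colon_wwn(wwn):
    """Render a 16-hex-digit WWN as colon-separated bytes."""
    return ":".join(wwn[i:i + 2] for i in range(0, len(wwn), 2))


# Sample text copied from the fcinfo hba-port output shown on the slide.
SAMPLE = """\
# fcinfo hba-port
HBA Port WWN: 210000e08b954220
        OS Device Name: /dev/cfg/c2
        Manufacturer: QLogic Corp.
        Model: QLE2462
        Type: N-port
        State: online
        Supported Speeds: 1Gb 2Gb 4Gb
        Current Speed: 4Gb
        Node WWN: 200000e08b954220
HBA Port WWN: 210100e08bb54220
        OS Device Name: /dev/cfg/c3
        Manufacturer: QLogic Corp.
        Model: QLE2462
        Type: N-port
        State: online
        Supported Speeds: 1Gb 2Gb 4Gb
        Current Speed: 2Gb
        Node WWN: 200100e08bb54220
"""

if __name__ == "__main__":
    for port in parse_fcinfo_hba_ports(SAMPLE):
        print(colon_wwn(port["HBA Port WWN"]), port["Current Speed"])
```

A script like this could, for example, flag ports whose current speed is lower than the highest supported speed, as with the second port in the sample.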

Userland Utilities
# fcinfo remote-port -p 210000e08b110125
Remote Port WWN: 256000c0ffc7ecd2
        Active FC4 Types: SCSI
        SCSI Target: yes
        Node WWN: 206000c0ff07ecd2

# fcinfo remote-port -l -p 210000e08b110125
Remote Port WWN: 256000c0ffc7ecd2
        Active FC4 Types: SCSI
        SCSI Target: yes
        Node WWN: 206000c0ff07ecd2
        Link Error Statistics:
                Link Failure Count: 0
                Loss of Sync Count: 0
                Loss of Signal Count: 0
                Primitive Seq Protocol Error Count: 0
                Invalid Tx Word Count: 0
                Invalid CRC Count: 0

Userland Utilities
# fcinfo remote-port -s -p 210000e08b110125
Remote Port WWN: 256000c0ffc7ecd2
        Active FC4 Types: SCSI
        SCSI Target: yes
        Node WWN: 206000c0ff07ecd2
        LUN: 0
          Vendor: SUN
          Product: StorEdge 3511
          OS Device Name: /dev/rdsk/c0t600C0FF00000000007ECD20CD4BBE500d0s2
        LUN: 1
          Vendor: SUN
          Product: StorEdge 3511
          OS Device Name: /dev/rdsk/c0t600C0FF00000000007ECD20CD4BBE501d0s2
        LUN: 2
          Vendor: SUN
          Product: StorEdge 3511
          OS Device Name: /dev/rdsk/c0t600C0FF00000000007ECD20CD4BBE502d0s2
        ....

Userland Utilities
# cfgadm -la -o show_SCSI_LUN
Ap_Id                     Type        Receptacle  Occupant      Condition
c1                        fc-private  connected   configured    unknown
c1::200400a0b81770cf,0    disk        connected   configured    unknown
c1::200400a0b81770cf,1    disk        connected   configured    unknown
c1::200400a0b81770cf,31   disk        connected   configured    unknown
c2                        fc-fabric   connected   configured    unknown
c2::210100e08b275cb5      unknown     connected   unconfigured  unknown
c2::210100e08b27abb5      unknown     connected   unconfigured  unknown
c2::50020f2300000cf0,0    disk        connected   configured    unknown
c2::50020f2300000cf0,1    disk        connected   configured    unknown
c2::50020f2300000cf0,2    disk        connected   configured    unknown
c2::50020f2300000cf0,3    disk        connected   configured    unknown
c2::50020f2300004bf0,0    disk        connected   configured    unknown
c2::50020f2300004bf0,1    disk        connected   configured    unknown
c2::50020f2300004bf0,2    disk        connected   configured    unknown
...

The two entries of type unknown (unconfigured) are HBAs visible from this port, in the same zone as this port

Userland Utilities
# cfgadm -la -o show_FCP_dev
Ap_Id                     Type        Receptacle  Occupant      Condition
c1                        fc-private  connected   configured    unknown
c1::200400a0b81770cf,0    disk        connected   configured    unknown
c1::200400a0b81770cf,1    disk        connected   configured    unknown
c1::200400a0b81770cf,31   disk        connected   configured    unknown
c2                        fc-fabric   connected   configured    unknown
c2::210100e08b275cb5      unknown     connected   unconfigured  unknown
c2::210100e08b27abb5      unknown     connected   unconfigured  unknown
c2::50020f2300000cf0,0    disk        connected   configured    unknown
c2::50020f2300000cf0,1    disk        connected   configured    unknown
c2::50020f2300000cf0,2    disk        connected   configured    unknown
c2::50020f2300000cf0,3    disk        connected   configured    unknown
c2::50020f2300004bf0,0    disk        connected   configured    unknown
c2::50020f2300004bf0,1    disk        connected   configured    unknown
c2::50020f2300004bf0,2    disk        connected   configured    unknown
....
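The Ap_Id values in the cfgadm listings follow a visible pattern: a controller name, optionally followed by "::" and a port WWN, optionally followed by "," and a LUN number. The sketch below splits an Ap_Id along those lines. It is inferred only from the sample output above; the ApId type and parse_ap_id function are hypothetical names, not part of cfgadm.

```python
from typing import NamedTuple, Optional


class ApId(NamedTuple):
    controller: str          # e.g. "c2"
    port_wwn: Optional[str]  # remote port WWN, if present
    lun: Optional[int]       # LUN number, if present


def parse_ap_id(ap_id: str) -> ApId:
    """Split 'c2::50020f2300000cf0,1' into controller, WWN and LUN.

    Assumption: the format is controller[::wwn[,lun]], as seen in the
    sample cfgadm output; other plugins may use different Ap_Id layouts.
    """
    controller, sep, rest = ap_id.partition("::")
    if not sep:
        return ApId(controller, None, None)
    wwn, comma, lun = rest.partition(",")
    return ApId(controller, wwn, int(lun) if comma else None)


if __name__ == "__main__":
    for sample in ("c1", "c2::210100e08b275cb5", "c2::50020f2300000cf0,3"):
        print(parse_ap_id(sample))
```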

Debugging features (1)


We provide dcmds and walkers in mdb
SFK support is being added to Solaris CAT v5
If you run 'touch /var/adm/sun_fc.debug' then any command which uses libsun_fc will write trace information to it
luxadm allows you to look at various link statuses
cfgadm shows basic multipath status information
fcinfo shows you what local ports you have, and also what remote ports, devices and targets are attached

Debugging features (2) dcmds in mdb


Some of our dcmds are
::fcptrace       displays the FCP trace buffer
::fptrace        displays the FP trace buffer
::fcip           displays FCIP instances
::fcport         displays FCP port instances
::remote_port    displays remote FC port instances
::ulps           displays the Upper Layer Protocol modules installed and their IDs
::ulpmods        displays the port to ULP mapping
::emlxs_msgbuf   Dumps the emlxs driver message buffer
::emlxs_show     Shows any structure in the emlxs driver
::qlclinks       Print qlc link information
::qlcstate       Print qlc adapter state information

Of the above, the fcptrace and fptrace buffers are probably the most useful to get started with debugging an issue

Debugging features (3) dcmds in mdb


You can find out what dcmds and walkers are in a particular module by running ::dmods -l [module name]
NWS modules for mdb are fctl, fcp, fcip, qlc and emlxs

> ::dmods -l fctl
fctl
  dcmd fcport       - Display a Leadville fc_local_port structure
  dcmd fcptrace     - Dump the fcp trace buffer, optionally supplying starting and ending packet numbers.
  dcmd fptrace      - Dump the fp trace buffer, optionally supplying starting and ending packet numbers.
  dcmd ports        - Leadville port list
  dcmd remote_port  - Display fc_remote_port structures
  dcmd ulpmods      - Leadville ULP module list
  dcmd ulps         - Leadville ULP list
  walk job_request  - walk list of job_request structures for a local port
  walk orphan       - walk list of orphan structures for a local port
  walk pd_by_did    - walk list of fc_remote_port structures hashed by D_ID
  walk pd_by_pwwn   - walk list of fc_remote_port structures hashed by PWWN
  walk ports        - walk list of Leadville port structures
  walk ulpmods      - walk list of Leadville ULP module structures
  walk ulps         - walk list of Leadville ULP structures

Debugging features (4) fcp trace buffer


The FCP and FP drivers keep a trace buffer (pointed to by (ss)fcp_logq and (ss)fp_logq) where events of interest are noted

> ::fcptrace
[Tue Dec 20 14:38:49 2005] 13=>ssfcp(2)::PLOGI to d_id=0x80200 succeeded, wwn=c0006025d2ecc7ff
[Tue Dec 20 14:38:49 2005] 14=>ssfcp(2)::ssfcp_send_els: d_id=0x80200 ELS 0x20 (PRLI)
[Tue Dec 20 14:38:49 2005] 15=>ssfcp(2)::ssfcp_send_els: returning 0
[Tue Dec 20 14:38:49 2005] 16=>ssfcp(2)::ELS (20) callback state=0x1 for 80200
[Tue Dec 20 14:38:49 2005] 17=>ssfcp(2)::PRLI to d_id=0x80200 succeeded
...
[Tue Dec 20 14:38:49 2005] 20=>ssfcp(2)::ssfcp_handle_reportlun: port=2, tgt D_ID=0x80200
[Tue Dec 20 14:38:49 2005] 21=>ssfcp(2)::!Dynamically discovered 18 LUNs for D_ID=80200

Each entry has a timestamp, a monotonically-increasing sequence number, the fcp or fp instance (ssfcp prior to Nevada) making the entry, and the actual message itself
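Because every entry carries the same four fields, saved trace output can be split mechanically. The sketch below is one hedged way to do that with a regular expression; the pattern is inferred from the sample entries on these slides rather than from the driver source, and the names TraceEntry and parse_trace_line are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the sample entries:
# "[timestamp] seq=>driver(instance)::message"
TRACE_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<seq>\d+)=>"
    r"(?P<driver>\w+)\((?P<instance>\d+)\)::"
    r"(?P<message>.*)$"
)


class TraceEntry(NamedTuple):
    timestamp: str   # kept as text, e.g. "Tue Dec 20 14:38:49 2005"
    seq: int         # monotonically-increasing sequence number
    driver: str      # "fp", "fcp" or "ssfcp"
    instance: int    # driver instance making the entry
    message: str


def parse_trace_line(line: str) -> Optional[TraceEntry]:
    """Return a TraceEntry, or None if the line is not a trace entry."""
    m = TRACE_RE.match(line.strip())
    if m is None:
        return None
    return TraceEntry(
        m["timestamp"], int(m["seq"]), m["driver"],
        int(m["instance"]), m["message"],
    )


if __name__ == "__main__":
    sample = ("[Tue Dec 20 14:38:49 2005] 17=>ssfcp(2)::"
              "PRLI to d_id=0x80200 succeeded")
    print(parse_trace_line(sample))
```

Sorting parsed entries by seq rather than by timestamp keeps events in order even when many entries share the same second, as they do in the sample.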

Debugging features (5) fp trace buffer


> ::fptrace
[Tue Dec 20 14:38:49 2005] 47=>fp(2)::RSCN with D_ID page; port=ffffffff85180000, d_id=80200, pd=0
[Tue Dec 20 14:38:49 2005] 48=>fp(2)::NS Query response, cmd_code=112, xfer_len=8
[Tue Dec 20 14:38:49 2005] 49=>fp(2)::GPN_ID results; 25 60 0 ffffffc0 ffffffff
[Tue Dec 20 14:38:49 2005] 50=>fp(2)::NS Query Response for D_ID page; rev=1, in_id=0, cmdrsp=8002, reason=0, expln=0, rval=0
[Tue Dec 20 14:38:49 2005] 51=>fp(2)::new port attached to domain, calling fp_validate_area_domain
[Tue Dec 20 14:38:49 2005] 52=>fp(2)::GAN response; port=ffffffff85180000, d_id=80200
[Tue Dec 20 14:38:49 2005] 53=>fp(2)::GAN response details; port=ffffffff85180000, d_id=80200, type_id=20801, pwwn=2560 0 c0 ff c7 ec d2, nwwn=20 60 0 c0 ff 7 ec d2
[Tue Dec 20 14:38:49 2005] 54=>fp(2)::GAN PD stuffing; pd=ffffffff867c8800, port_id=20801, sym_len=28 fc4-type=10000
[Tue Dec 20 14:38:49 2005] 55=>fp(2)::GAN response; port=ffffffff85180000, d_id=80600
[Tue Dec 20 14:38:49 2005] 56=>fp(2)::fp_validate_area_domain: get_devcount found 1 devices attached to port 0xffffffff85180000

Note that we've got the same format as for the FCP trace buffer....

Debugging features (6) fp trace buffer


It's really important that we get trace buffer contents when you log a bug
FC analyser traces tell us what happens between the HBA and the attached storage
Trace buffers tell us what Leadville {fp|fcp|fcip|fctl|fcsm} does (and what they think is going on)
When you log a bug, give us a crash dump (live or dead) and we'll work out whether an FC analyser trace is needed.

Debugging features (7) ports!


> ::ports
Port              I#  State  Soft  FCA Handle        Port DIP          FCA Port DIP
ffffffff85180000  2   401    0     ffffffff8252f200  ffffffff83c78dc0  ffffffff807613d8
ffffffff84058000  3   0      0     ffffffff8252fb00  ffffffff81d3b400  ffffffff807611f8
ffffffff83332000  1   0      0     ffffffff835d5040  ffffffff83c77dc8  ffffffff807615b8
ffffffff83372000  0   0      0     ffffffff82eff040  ffffffff835c5098  ffffffff80761798

The fields have types as follows:
Port: fc_local_port (snv) or fc_port_t (s10 and earlier)
I#: instance number
State: port speed (msb) and actual state (lsb)
Soft: soft_state defined in fc_portif.h
FCA Handle: opaque pointer for the local port device
Port DIP: port dip (dev_info_t)
FCA Port DIP: fca dip (dev_info_t)
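Since the State column packs two values into one number, it helps to split it before reading it. The sketch below assumes the value is shown in hexadecimal (as mdb normally prints) and that the speed occupies the upper byte and the state the lower byte of a 16-bit value; that byte boundary is an assumption drawn from the "msb/lsb" wording, and split_port_state is a hypothetical helper name. The meaning of each resulting code is defined in the driver headers and is not decoded here.

```python
def split_port_state(state: int) -> tuple:
    """Split a port State value into (speed_bits, state_bits).

    Assumption: upper byte = port speed code, lower byte = port state
    code, within a 16-bit value.
    """
    speed_bits = (state >> 8) & 0xFF
    state_bits = state & 0xFF
    return speed_bits, state_bits


if __name__ == "__main__":
    # State column for instance 2 in the ::ports sample, read as hex.
    speed, state = split_port_state(int("401", 16))
    print(f"speed code 0x{speed:02x}, state code 0x{state:02x}")
```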

Debugging features (8) ULP info


> ::ulps
ULP Name                    Type  Revision
FCSM                        20    2
SunFC FCIP v20050927-1.43   5     2
fcp                         8     2

> ::ulpmods
Type  Port Handle        dstate  statec
8     ffffffff83372000   1       2
8     ffffffff83332000   1       2
8     ffffffff84058000   1       2
8     ffffffff85180000   1       0
5     ffffffff83332000   1       2
5     ffffffff83372000   1       2
5     ffffffff84058000   1       2
5     ffffffff85180000   1       0
20    ffffffff85180000   1       0
20    ffffffff83332000   1       2
20    ffffffff83372000   1       2
20    ffffffff84058000   1       2

ULPs are Upper Layer Protocols

Debugging features (9) mdb walkers


Walkers in mdb:

cmds            - walk list of SCSI commands in fcp's per-lun queue
fcp             - walk list of Leadville fcp instances
fcpX_cache      - walk the fcpX_cache cache
fcsm_job_cache  - walk the fcsm_job_cache cache
fctl_cache      - walk the fctl_cache cache
fpX_cache       - walk the fpX_cache cache
luns            - walk list of LUNs in an fcp target
pd_by_did       - walk list of fc_remote_port structures hashed by D_ID
pd_by_pwwn      - walk list of fc_remote_port structures hashed by PWWN
ports           - walk list of Leadville port structures
targets         - walk list of fcp targets attached to the local port
ulpmods         - walk list of Leadville ULP module structures
ulps            - walk list of Leadville ULP structures
qlcstates       - walk list of qlc ql_state_t structures

Debugging features (10) mdb walkers


Usage example (Solaris 10):
*ssfcp_port_head::walk fcp|::walk targets|::walk luns

which gives you a struct ssfcp_lun which you can mdb-pipe to ::print like this:
*ssfcp_port_head::walk fcp|::walk targets|::walk luns|::print -t struct ssfcp_lun

Debugging features (11) Solaris CAT


The Solaris CAT support functions and routines mirror those of mdb
Note: currently under development, tentatively targeted for Solaris CAT v5
The command is called san and will have these options:
fptrace [-s m][-e n]: dumps the FP trace buffer (-s start, -e end packet #)
fcptrace [-s m][-e n]: dumps the FCP trace buffer (-s start, -e end packet #)
ports [-l|-r|-h|-v] [WWN|port#]: shows all ports; -l local ports, -r remote ports, -h hba info, -v all info; with WWN, shows info for WWN only; with port#, shows info for port instance# only
targets [WWN|port#]: shows targets; with WWN or port#, shows targets attached to that WWN or port instance only
luns {WWN|port#}: shows luns attached to WWN or port instance

Debugging features (12) Data structures


There are several data structures which you should know about:
fp_cmd
fc_packet
fc_local_port / fc_remote_port (nevada)
fc_port / fc_port_device (s10)
job_request
fc_orphan
fca_port
You can explore these in mdb with ::print -t struct [name of structure]
You can explore these in Solaris CAT v4.2 and later with stype [name of structure]
Search for their definitions at http://cvs.opensolaris.org/source/xref/nwsc/src/sun_nws

Debugging features (13) Data structures


We embed multipathing pointers within the (ss)fcp_lun structure:
struct ssfcp_lun { (size: 0xc8 bytes)
    ...
    int lun_mpxio;
    typedef child_info_t * = void * *lun_cip; (offset 0x20 bytes, size 0x8 bytes)
    ...
}
If lun_mpxio is 0 (ie, not a multipathed lun), then the lun_cip pointer is a struct dev_info
If lun_mpxio is 1 (multipathed), then the lun_cip pointer is a struct mdi_pathinfo

Putting it all together (1) from app to target


App issues write(2) or read(2) command
System call interface routes this to SCSA layer, calls down into fcp
fcp encapsulates the SCSI command it receives from the target driver into a fibre-channel packet, fills in the address portions appropriately, looks up the correct hba to send the fc packet to and sends it out
The hba accepts the packet and sends it out over the fibre
If we go through a switch, the switch routes the packet to the correct device, otherwise the device we send the packet to accepts it and deals with it

Putting it all together (2) from target to app


Target puts data into its buffer, sends an interrupt out to the switch (or hba if directly attached).
Switch routes packet to D_ID (destination ID)
The hba accepts the packet and sends it to fcp
fcp un-encapsulates the data from the packet structures, and when the SCSI packet is complete, passes the SCSI packet up to a multipathing (scsi_vhci/vxdmp) or target driver (sd/st/sgen/sg)
The target driver passes the SCSI packet up to a layered driver such as md/vxvm/zfs/... if required
Either the target driver or the layered driver then provides your data in a buffer to your app
It's just like any other SCSI-attached storage, so that you (app writer/sysadmin) don't have to worry about the mechanics of what happens..... It Just Works! (tm)

Putting it all together Device Discovery (1)


A new device is added to the zone, so the switch sends an RSCN (Registered State Change Notification) to each port which has registered to receive RSCNs
Each port then invokes its RSCN handler. For fp, that's fp_validate_unsol_rscn()
fp repeatedly interrogates the switch with a GPN_ID query (Get Port Name, by ID) and then either
    fills out the list of old devices and makes sure the LILP map is correct, or
    queries the switch's Name Service for each device in the zone and then re-validates the port descriptor tables
If required, we then hand it all off to fcp....

Putting it all together Device Discovery (2)


Each fcp instance has a hotplug task (fcp_hp_task), which gets triggered by fp/fctl/fcsm
fcp_hp_task then calls fcp_trigger_lun to either take the device offline, or bring it online
If we're offlining the lun, then we call fcp_offline_lun, which palms the work off to ndi_devi_offline for a non-mpxio lun, or mdi_pi_offline otherwise
If we're onlining the lun, we call fcp_get_cip to get the appropriate device path and then throw the device at ndi_devi_online (non-mpxio) or mdi_pi_online.
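The online/offline and mpxio/non-mpxio choices described above form a small decision table. The sketch below only models that table in Python for illustration; it is not driver code. The function names in the returned list come from the slide, while hotplug_call_chain itself is a hypothetical helper.

```python
def hotplug_call_chain(online: bool, mpxio: bool) -> list:
    """Return the sequence of routines the slide describes for one LUN.

    online: True to bring the LUN online, False to take it offline.
    mpxio:  True for a multipathed LUN (MDI), False otherwise (NDI).
    """
    chain = ["fcp_hp_task", "fcp_trigger_lun"]
    if online:
        # Onlining: look up the device path, then hand off to NDI or MDI.
        chain.append("fcp_get_cip")
        chain.append("mdi_pi_online" if mpxio else "ndi_devi_online")
    else:
        # Offlining: fcp_offline_lun hands off to NDI or MDI.
        chain.append("fcp_offline_lun")
        chain.append("mdi_pi_offline" if mpxio else "ndi_devi_offline")
    return chain


if __name__ == "__main__":
    for online in (True, False):
        for mpxio in (False, True):
            print(online, mpxio, " -> ".join(hotplug_call_chain(online, mpxio)))
```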

What about standards?


Acronym    Title
FC-PH      Fibre Channel Physical Interface
FC-PH-2    Fibre Channel Physical Interface, gen 2
FC-PH-3    Fibre Channel Physical Interface, gen 3
FC-AL      Arbitrated Loop
FC-AL-2    Arbitrated Loop, gen 2
FC-FG      Generic Fabric
FC-SW      Switched Fabric
FC-GS      Generic Services
FC-GS-2    Generic Services, gen 2
FC-LE      Link Encapsulation
FC-SB      Single-Byte command set mapping
FC-PLDA    Private Loop Direct Attach
10 BIT     10-bit Interface
FC-FLA     Fabric Loop Attachment
SCSI-FCP   SCSI-3 encapsulation in FC
SCSI-GPP   Generic Packetized Protocol

www.t10.org for SCSI

www.t11.org for FC

Code Pointers

The NWS consolidation is available at http://cvs.opensolaris.org/source/xref/nwsc
You can search and browse the source using OpenGrok
You can download snapshots of the NWS consolidation from http://www.opensolaris.org/os/downloads
MPAPI: http://mp-mgmt-api.sourceforge.net
http://www.snia.org/apps/org/workgroup/os-attach

References and Further Reading (1)


www.sun.com/storagetek/storage_networking    San Foundation Kit home
www.opensolaris.org                          OpenSolaris!
docs.sun.com/source/819-0139                 Solaris Fibre Channel and Storage Multipathing
sunsolve.sun.com                             infodocs and patches
www.sun.com/bigadmin                         A portal for system administration topics

References and Further Reading (2)


www.t11.org      FibreChannel Standards Committee
www.t10.org      SCSI Standards Committee
www.snia.org     Storage Networking Industry Association
www.qlogic.com   QLogic Corporation
www.emulex.com   Emulex Corporation

Blogs
blogs.sun.com/roller/page/jmcp          James McPherson
blogs.sun.com/roller/page/torrey        Torrey McMahon
blogs.sun.com/roller/page/AaronDailey   Aaron Dailey
blogs.sun.com/roller/page/dweibel       David Weibel
blogs.sun.com/roller/page/hbainsights   Sumit Gupta

What Questions Do You Have ????

Getting to know the SAN Stack


James C. McPherson James.McPherson@Sun.COM

Image copyright 2006 James C. McPherson
