
Sun Storage[TM] Arrays: Troubleshooting RAID Controller Failures [ID 1021113.1]

Modified: 07-JUN-2011
Type: TROUBLESHOOTING
Migrated ID: 271129
Status: PUBLISHED

Applies to:
Sun Storage 2510 Array - Version: Not Applicable and later [Release: N/A and later]
Sun Storage 2530 Array - Version: Not Applicable and later [Release: N/A and later]
Sun Storage 2530-M2 Array - Version: Not Applicable and later [Release: N/A and later]
Sun Storage 2540 Array - Version: Not Applicable and later [Release: N/A and later]
Sun Storage 2540-M2 Array - Version: Not Applicable and later [Release: N/A and later]
All Platforms

Purpose
To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support Community - 6000 and 2500 Series RAID Arrays.

The purpose of this document is to describe how to troubleshoot Sun Storage[TM] RAID controller failures.

Symptoms:
  Seven Segment Display of controller shows a repeating pattern 88 L# (where # is some value)
  Amber LED on controller
  Critical Fault for RPA Memory Error (xx.66.1041)
  Critical Fault for Controller is Offline (xx.66.1028)

Please validate that each troubleshooting step below is true for your environment. Each step provides instructions, via a link to a document, for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.

Last Review Date


June 15, 2010

Instructions for the Reader


A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.

Troubleshooting Details
1. Verify Array Critical Faults

Reference Document: 1021057.1 Verify Sun StorageTek[TM] 2500 and Sun Storage[TM] 6000 Critical Faults via the User Interface

If the fault is listed as OFFLINE or RPA Memory Error, go to Step 2. Otherwise, go to Step 11.

2. Verify Array Model

Reference Document: 1021066.1 Verify Sun Storage[TM] Array Array Type via the User Interface

Array Model: Sun StorageTek[TM] 2510, Sun StorageTek[TM] 2530, Sun StorageTek[TM] 2540
Instructions: If the Critical Fault from Step 1 was Controller Offline, go to Step 5. If the Critical Fault from Step 1 was RPA Memory Error, go to Step 7.

Array Model: Sun StorageTek[TM] 6140, Sun StorageTek[TM] 6540, Sun StorageTek[TM] Flexline 380
Instructions: If the Critical Fault from Step 1 was Controller Offline, go to Step 3. If the Critical Fault from Step 1 was RPA Memory Error, go to Step 7.

Array Model: Sun StorEdge[TM] 6130, Sun StorageTek[TM] Flexline 240, Sun StorageTek[TM] Flexline 280
Instructions: If the Critical Fault from Step 1 was Controller Offline, go to Step 5. If the Critical Fault from Step 1 was RPA Memory Error, go to Step 10.

Array Model: Sun Storage 6180, Sun Storage 2530 M2, Sun Storage 2540 M2
Instructions: If the Critical Fault from Step 1 was Controller Offline, go to Step 4. If the Critical Fault from Step 1 was RPA Memory Error, go to Step 10.

Array Model: Sun Storage 6580, Sun Storage 6780
Instructions: If the Critical Fault from Step 1 was Controller Offline, go to Step 4. If the Critical Fault from Step 1 was RPA Memory Error, go to Step 7.
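The Step 2 routing can be written down as a small lookup, which may help when scripting a triage checklist. The following is a minimal sketch only: the function name next_step and the short model and fault tokens (for example FLX380, offline, rpa) are hypothetical labels chosen for illustration and are not part of any Sun or Oracle tool.

```shell
#!/bin/sh
# Sketch of the Step 2 routing table.
# next_step and the model/fault tokens are hypothetical names.
# usage: next_step MODEL FAULT    (FAULT is "offline" or "rpa")
next_step() {
    model=$1
    fault=$2
    # Pick the two possible next steps for the array model.
    case "$model" in
        2510|2530|2540)        offline=5; rpa=7 ;;
        6140|6540|FLX380)      offline=3; rpa=7 ;;
        6130|FLX240|FLX280)    offline=5; rpa=10 ;;
        6180|2530-M2|2540-M2)  offline=4; rpa=10 ;;
        6580|6780)             offline=4; rpa=7 ;;
        *) echo "unknown model: $model" >&2; return 1 ;;
    esac
    # Select the step that matches the Critical Fault from Step 1.
    case "$fault" in
        offline) echo "$offline" ;;
        rpa)     echo "$rpa" ;;
        *) echo "unknown fault: $fault" >&2; return 1 ;;
    esac
}
```

For example, `next_step 6140 offline` prints 3, matching the row for the 6140/6540/Flexline 380 arrays.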

3. Verify 7-segment Display on 6140/6540/FLX380 array controller.

Currently the user interface does not show what is on the seven-segment display, which normally shows the tray ID for the array under optimal conditions. For arrays of these types, the display gives additional status of the system. If you are not local to the system, you will need someone to read the display. The display can vary based on the array model and the error status.

Reference Document: 1021109.1 Sun StorageTek[TM] 6140, 6540, and Flexline 380 Array Controller 7-Segment LED

If the Seven Segment Display shows "88", this indicates a possible intermittent issue. Go to Step 5.

https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=TROUBL...

1/31/2012


If the 7-Segment Display shows L2 or L3 for the controller, the subsystem has offlined the controller due to persistent memory faults (L2) or a hardware fault (L3). Go to Step 10.

If the 7-Segment Display shows an L-code other than L2 or L3, go to Step 11.

4. Verify 7-segment Display on 6180/6580/6780/2530M2/2540M2 array controller.

Currently the user interface does not show what is on the seven-segment display, which normally shows the tray ID for the array under optimal conditions. For arrays of these types, the display gives additional status of the system. If you are not local to the system, you will need someone to read the display. The display can vary based on the array model and the error status.

Reference Document: 1021110.1 Sun Storage[TM] 6x80 and 2500-M2 Array Controller 7-Segment Display

For 2530-M2/2540-M2/6180/6580/6780:

If the 7-Segment Display flashes either OS+ OL+ blank- or SE+ 88+ blank-, this indicates a possible intermittent issue. Go to Step 5.

If the 7-Segment Display flashes 0E+ L2+ dash+ CF+ P#+ blank-, SE+ dF+ dash+ CF+ P#+ blank-, or OE+ L3+ blank-, this indicates that a Controller Processor Memory DIMM has failed due to parity errors (L2), or that the system has detected a hardware fault, and has placed the controller offline. Go to Step 10.

If the 7-Segment Display flashes 0E+ L2+ dash+ CF+ d#+ blank- or SE+ dF+ dash+ CF+ d#+ blank-, this indicates that the Processor Memory on the 6180 has failed due to parity errors, or that the system detected a hardware fault, and placed the controller offline. Go to Step 10.

For 6580/6780 only:

If the 7-Segment Display flashes 0E+ L2+ dash+ CF+ C#+ blank- or SE+ dF+ dash+ CF+ C#+ blank-, this indicates that the Controller Data Cache Memory DIMM has failed due to parity errors, and the controller has been placed offline.
Reference Document: 1117584.1 Troubleshooting Sun Storage[TM] 6580/6780 Cache Memory DIMM Faults

If the 7-Segment Display flashes SE+ dF+ dash+ CF+ H#+ blank-, this indicates that the Host Interface Card (HIC) in slot # for the controller is either failed or missing.
Reference Document: 1120725.1 Troubleshooting Sun Storage[TM] 6580/6780 Host Interface Card Faults

If the 7-Segment Display flashes SE+ L8+ blank+ CF+ Cx+ blank-, this indicates that the cache configuration does not match the alternate controller's configuration.
Reference Document: 1117584.1 Troubleshooting Sun Storage[TM] 6580/6780 Cache Memory DIMM Faults

If none of the errors above are displayed, go to Step 10.

5. Online the RAID controller.

Attempt to place the RAID controller online using the user interface. The symptoms indicated thus far point to something other than a hardware problem on the RAID controller itself.

Sun Storage[TM] Common Array Manager

Browser
1. Expand Storage Array in the left window menu tree.
2. Click on your array name.
3. Click on the Service Advisor button in the top right corner of the browser window.
4. Find and expand Place a Controller Online in the Troubleshooting and Recovery section.
5. Select the faulted controller, and follow the instructions in the right hand pane to place the controller online.

Service CLI

Locations:
  Solaris: /opt/SUNWSefms/bin
  Windows: c:\Program Files\Sun\Common Array Manager\Component\fms\bin
  Linux: /opt/sun/private/fms/bin

service -d array_name -c revive -t [a | b]

NOTE: You must specify controller slot location A or B.

Sun StorageTek[TM] SANtricity Storage Manager

GUI
1. Open the Array Management Window for your array.
2. Select the array controller (it will have a red X on it).
3. Open the Advanced menu.
4. Select the Recovery sub-menu.
5. Select the Place Controller sub-menu.
6. Select Online.

SMcli

SMcli -n array_name -c "set controller [(a|b)] availability=online;"

NOTE: You must specify controller slot location A or B.
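The two command-line forms above differ only in the tool, the array name, and the controller slot, so they can be generated from one helper. The sketch below only prints the command for review and does not run it. The function name online_cmd and the tool tokens cam and santricity are hypothetical; the command text itself is copied from this step.

```shell
#!/bin/sh
# Prints (does not execute) the Step 5 command for the chosen tool.
# online_cmd and the "cam"/"santricity" tokens are hypothetical names.
# usage: online_cmd TOOL ARRAY_NAME SLOT
online_cmd() {
    tool=$1
    array=$2
    # Accept A/B or a/b and normalize to lowercase, as written in the commands above.
    slot=$(printf '%s' "$3" | tr 'AB' 'ab')
    case "$slot" in
        a|b) ;;
        *) echo "slot must be A or B" >&2; return 1 ;;
    esac
    case "$tool" in
        cam)
            printf 'service -d %s -c revive -t %s\n' "$array" "$slot" ;;
        santricity)
            printf 'SMcli -n %s -c "set controller [%s] availability=online;"\n' "$array" "$slot" ;;
        *) echo "tool must be cam or santricity" >&2; return 1 ;;
    esac
}
```

For example, `online_cmd cam array_name A` prints the service command with `-t a`. Printing first, then running by hand, keeps the operator in control of which controller is revived.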

If the request to online the controller fails, go to Step 11.

If the request to online the controller is successful, and the controller stays up for longer than 5 minutes, go to Step 6.

If the request to online the controller was successful, but the controller went offline again, go to Step 11.

6. Reset SOC and RLS counters on the array for monitoring.

You have indicated that the array controller was successfully placed online and remained available for longer than 5 minutes. If this issue is intermittent, the controller may go offline again. To help with diagnosis if that occurs, set baselines for the error statistics on the array. A controller often goes offline due to a communication issue, and this data is required as part of that investigation. The RLS (Read Link Status) and SOC (Switch On Chip) statistics are collected as part of normal array support collections, and can be zeroed out for further diagnosis as follows.


NOTE: The counter reset is not available for 2510, 2530, or 2540 arrays, although these arrays do collect the error counters. If this is your array type, you do not need to run the commands, but you should still follow the instructions below on what action to take.

Sun Storage Common Array Manager

Service CLI

Locations:
  Solaris: /opt/SUNWSefms/bin
  Windows: c:\Program Files\Sun\Common Array Manager\Component\fms\bin
  Linux: /opt/sun/private/fms/bin

service -d array_name -c reset -t soc
service -d array_name -c reset -t rls

Sun StorageTek SANtricity Storage Manager

SMcli

SMcli -n array_name -c "reset storageArray RLSBaseline;"
SMcli -n array_name -c "reset storageArray SOCBaseline;"
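The baseline reset above, including the exception for 2510, 2530, and 2540 arrays, can be sketched as a helper that prints the commands to run. This is an illustration only: the function name reset_baseline_cmds and the tool and model tokens are hypothetical, and the helper prints the commands rather than executing them.

```shell
#!/bin/sh
# Prints (does not execute) the Step 6 baseline reset commands.
# reset_baseline_cmds and the tool/model tokens are hypothetical names.
# usage: reset_baseline_cmds TOOL ARRAY_NAME MODEL
reset_baseline_cmds() {
    tool=$1
    array=$2
    model=$3
    case "$model" in
        2510|2530|2540)
            # Per the NOTE in this step, the reset is not available on
            # these arrays, so there is nothing to print.
            return 0 ;;
    esac
    case "$tool" in
        cam)
            printf 'service -d %s -c reset -t soc\n' "$array"
            printf 'service -d %s -c reset -t rls\n' "$array" ;;
        santricity)
            printf 'SMcli -n %s -c "reset storageArray RLSBaseline;"\n' "$array"
            printf 'SMcli -n %s -c "reset storageArray SOCBaseline;"\n' "$array" ;;
        *) echo "tool must be cam or santricity" >&2; return 1 ;;
    esac
}
```

For a 2540, the helper prints nothing, which matches the instruction to skip the commands but still follow the monitoring guidance.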

If the array remains online for longer than 48 hours, continue to monitor it for a period of 2 weeks. If the controller is still online after that point, the problem was likely due to a software error or state inconsistency. You may want to consider updating firmware if an update is available. No further actions are required.

If the array controller is placed offline in less than 2 weeks, go to Step 11.

7. Check for additional Critical Faults besides the RPA Memory Error

Due to bugs 6767241 and 6797173, the RPA Memory Error may be false. Check the list of faults on the array for any of the following:
  REC_LOST_REDUNDANCY_DRIVE (xx.66.1076)
  REC_PATH_DEGRADED (xx.66.1032)

If these faults exist in addition to the RPA Memory Error, the error may be false. Continue to Step 8 to review your firmware revisions.

If these faults do not exist on your array, go to Step 10.

8. Verify your array firmware.

Use the following document to check your firmware against the table below:

Document: 1021067.1 Verify Sun Storage[TM] Array Firmware via the User Interface
Array Model: Sun StorageTek[TM] Flexline 380, Sun StorageTek[TM] 6540, Sun StorageTek[TM] 6140
Firmware: 07.50.xx.xx, 07.60.xx.xx
Action: You are not exposed to the bugs. The controller should be replaced. Continue to Step 10.

Array Model: Sun StorageTek[TM] Flexline 380, Sun StorageTek[TM] 6540, Sun StorageTek[TM] 6140
Firmware: 07.10.xx.xx, 07.15.xx.xx
Action: You are exposed to 6767241, which causes false RPA Memory Errors, along with the faults in Step 7. To correct the condition, go to Step 9.

Array Model: Sun StorageTek[TM] Flexline 380, Sun StorageTek[TM] 6540, Sun StorageTek[TM] 6140
Firmware: 06.60.xx.xx, 06.19.xx.xx, 06.16.xx.xx, 06.15.xx.xx
Action: You are not exposed to the bugs. The controller should be replaced. Continue to Step 10.

Array Model: Sun Storage[TM] 6580/6780
Firmware: 07.50.xx.xx, 07.60.xx.xx
Action: You are not exposed to the bugs. The controller should be replaced. Continue to Step 10.

Array Model: Sun Storage[TM] 6580/6780
Firmware: 07.30.xx.xx
Action: You are exposed to 6767241, which causes false RPA Memory Errors, along with the faults in Step 7. To correct the condition, go to Step 9.

Array Model: Sun StorageTek[TM] 2510/2530/2540
Firmware: 07.35.50.10, 07.35.55.10, 07.35.44.10
Action: You are not exposed to the bugs. The controller should be replaced. Continue to Step 10.

Array Model: Sun StorageTek[TM] 2510/2530/2540
Firmware: 07.35.10.10
Action: You are exposed to 6767241, which causes false RPA Memory Errors, along with the faults in Step 7. To correct the condition, go to Step 9.

Array Model: Sun StorageTek[TM] 2510/2530/2540
Firmware: 06.70.xx.xx, 06.17.xx.xx
Action: You are not exposed to the bugs. The controller should be replaced. Continue to Step 10.
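The firmware check in this step amounts to matching a model family and firmware level against the table. The sketch below encodes one reading of that table, and several things in it are assumptions: the function name exposure and the family tokens 6x40, 6x80, and 2500 are hypothetical, and for the 2510/2530/2540 rows it assumes that 07.35.10.10 is the exposed level while 07.35.44.10, 07.35.50.10, and 07.35.55.10 are not, which agrees with the fixed level of 07.35.44.10 or later given in Step 9. Any firmware not listed in the table is reported as unlisted rather than guessed.

```shell
#!/bin/sh
# Sketch of the Step 8 firmware table for bug 6767241.
# exposure and the family tokens are hypothetical names:
#   6x40 = 6140/6540/Flexline 380, 6x80 = 6580/6780, 2500 = 2510/2530/2540
# usage: exposure FAMILY FIRMWARE
# prints "exposed" (go to Step 9), "not-exposed" (go to Step 10), or "unlisted"
exposure() {
    family=$1
    fw=$2
    case "$family:$fw" in
        6x40:07.10.*|6x40:07.15.*)                            echo exposed ;;
        6x40:07.50.*|6x40:07.60.*)                            echo not-exposed ;;
        6x40:06.60.*|6x40:06.19.*|6x40:06.16.*|6x40:06.15.*)  echo not-exposed ;;
        6x80:07.30.*)                                         echo exposed ;;
        6x80:07.50.*|6x80:07.60.*)                            echo not-exposed ;;
        # Assumed reading of the 2510/2530/2540 rows (see lead-in).
        2500:07.35.10.10)                                     echo exposed ;;
        2500:07.35.44.10|2500:07.35.50.10|2500:07.35.55.10)   echo not-exposed ;;
        2500:06.70.*|2500:06.17.*)                            echo not-exposed ;;
        *)                                                    echo unlisted ;;
    esac
}
```

A result of unlisted means the table does not cover that combination, so the firmware should be checked against the referenced document instead.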

9. Power Cycle the RAID Controller Tray to clear the false RPA memory error.

This procedure requires an outage, because the surviving controller will hold the faulted controller in a fault state. Only the RAID Tray requires a power cycle.

After performing a power cycle of the RAID Tray, review the Critical Fault list in your user interface.

If the fault persists, the controller will require replacement. Continue to Step 10.

If the fault is cleared, update your firmware to a version where 6767241 is fixed:
  2510/2530/2540 arrays: 07.35.44.10 or later
  6580/6780/6140/6540/Flexline 380: 07.50.08.10 or later

10. Have the controller replaced.

You have indicated that the 7-segment display on the array controller, or a critical fault for an RPA Memory Error, indicates that the RAID controller requires replacement.

Please supply:
  Critical Fault
  7-Segment display
  Array Support Data Collection:


    Reference Document: 1002514.1 Collecting Support Data for Arrays Using Sun StorageTek[TM] Common Array Manager
    Reference Document: 1014074.1 Collecting Support Data for Arrays Using Sun StorageTek[TM] SANtricity Storage Manager

and contact Oracle.

11. Provide Data for further analysis

At this point you have validated that each troubleshooting step is true for your environment and the issue still exists. Further troubleshooting is therefore required to identify the issue.

Please provide:
  7-Segment Display, if available
  Array Support Data Collection:
    Reference Document: 1002514.1 Collecting Support Data for Arrays Using Sun StorageTek[TM] Common Array Manager
    Reference Document: 1014074.1 Collecting Support Data for Arrays Using Sun StorageTek[TM] SANtricity Storage Manager

and contact Oracle.

Related

Products
  Sun Microsystems > Storage - Disk > Modular Disk - 2xxx Arrays > Sun Storage 2510 Array
  Sun Microsystems > Storage - Disk > Modular Disk - 2xxx Arrays > Sun Storage 2530 Array
  Sun Microsystems > Storage - Disk > Modular Disk - 2xxx Arrays > Sun Storage 2530-M2 Array
  Sun Microsystems > Storage - Disk > Modular Disk - 2xxx Arrays > Sun Storage 2540 Array
  Sun Microsystems > Storage - Disk > Modular Disk - 2xxx Arrays > Sun Storage 2540-M2 Array
  Sun Microsystems > Storage - Disk > Modular Disk - 6xxx Arrays > Sun Storage 6130 Array
  Sun Microsystems > Storage - Disk > Modular Disk - 6xxx Arrays > Sun Storage 6140 Array
  Sun Microsystems > Storage - Disk > Modular Disk - 6xxx Arrays > Sun Storage 6540 Array
  Sun Microsystems > Storage - Disk > Modular Disk - 6xxx Arrays > Sun Storage 6180 Array
  Sun Microsystems > Storage - Disk > Modular Disk - 6xxx Arrays > Sun Storage 6580 Array
  Sun Microsystems > Storage - Disk > Modular Disk - 6xxx Arrays > Sun Storage 6780 Array
  Sun Microsystems > Storage - Disk > Modular Disk - Flexline (FLX/FLA/FLC) Arrays > Sun Storage Flexline 210 Array
  Sun Microsystems > Storage - Disk > Modular Disk - Flexline (FLX/FLA/FLC) Arrays > Sun Storage Flexline 240 Array
  Sun Microsystems > Storage - Disk > Modular Disk - Flexline (FLX/FLA/FLC) Arrays > Sun Storage Flexline 280 Array
  Sun Microsystems > Storage - Disk > Modular Disk - Flexline (FLX/FLA/FLC) Arrays > Sun Storage Flexline 380 Array

Keywords ARRAY; COMMON ARRAY MANAGER; RAID; SANTRICITY; TROUBLESHOOT

Copyright (c) 2007, 2010, Oracle. All rights reserved.
