
1/10/2020 Document 1002526.1
Copyright (c) 2020, Oracle. All rights reserved. Oracle Confidential.

Sun SPARC Enterprise[TM] M3000/M4000/M5000/M8000/M9000 Servers: FMA Specified FRU Replacements (Doc ID 1002526.1)

APPLIES TO:

Sun SPARC Enterprise M9000-64 Server - Version All Versions and later
Sun SPARC Enterprise M9000-32 Server - Version All Versions and later
Sun SPARC Enterprise M3000 Server - Version All Versions and later
Sun SPARC Enterprise M5000 Server - Version All Versions and later
Sun SPARC Enterprise M8000 Server - Version All Versions and later
All Platforms

GOAL

Investigating a Sun SPARC[TM] Enterprise M3000/M4000/M5000/M8000/M9000 FMA specified FRU indictment.

This document details how to initiate a Service Action Plan to investigate whether a hardware component should be replaced, as implicated by the Predictive Self-Healing
Diagnosis Engine (FMA DE), on a Sun SPARC Enterprise M3000/M4000/M5000/M8000/M9000 system.

NOTE: The implicated hardware component(s) is referred to as a Field Replaceable Unit (FRU) throughout this document.

SOLUTION

This document makes a few assumptions:

An error event caused an automated recovery action to take place on an OPL system (panic/reboot/errors/etc).
The FMA DE determined that a FRU(s) is Faulty and may have automatically disabled or deconfigured the suspect FRU(s).
The FMA DE produced an FMA Event Code that, when looked up in My Oracle Support, recommends replacing a FRU and may refer to this document.

To discuss this information further with Oracle experts and industry peers, we encourage you to review, join or start a discussion in the My Oracle Support
Community - M Series Servers

Some hardware components in a Sun SPARC Enterprise M3000/M4000/M5000/M8000/M9000 (OPL) platform are Field Replaceable Units (FRUs). This means that an
Oracle "badged" or certified partner engineer must perform the physical replacement of the component. To begin the FRU replacement process, Oracle Support
Services relies on the customer to perform the specific steps detailed below.
1. Collect the FMA Fault Message.

The output can be displayed using fmdump -m on the XSCF console. Example output is as follows:
https://support.oracle.com/epmos/faces/DocumentDisplay?_adf.ctrl-state=b5j5lqmpc_394&id=1002526.1 1/3
MSG-ID: SCF-8001-4X, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Tue Mar 20 21:23:54 UTC 2007
PLATFORM: SPARC-Enterprise, CSN: <SERIAL_NUMBER>, HOSTNAME: <HOSTNAME>
SOURCE: sde, REV: 1.12
EVENT-ID: <UUID>
DESC: The number of uncorrectable and correctable errors on single DIMM exceeds an
acceptable threshold. This fault is detected while running POST.
Refer to http://www.sun.com/msg/SCF-8001-4X for more information.
AUTO-RESPONSE: The memory associated with the memory bank containing the errors is deconfigured.
IMPACT: POST is restarted after the memory associated with the memory bank has been deconfigured.
REC-ACTION: Schedule a repair action to replace the affected Field Replaceable Unit (FRU),
the identity of which can be determined using fmdump -v -u EVENT_ID.
Please consult the detail section of the knowledge article for additional information.
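
When the fault message has been saved to a text file, it can help to pull out the fields that Oracle Support Services will ask about (MSG-ID, SEVERITY, EVENT-ID). The following is a minimal Python sketch, not an Oracle-supplied tool. The function name parse_fault_message is hypothetical, and the sketch assumes the message has the same layout as the example above: "KEY: value" lines, several comma-separated pairs on the MSG-ID, PLATFORM, and SOURCE lines, and free text that may wrap onto following lines.

```python
import re

# Lines that pack several "KEY: value" pairs separated by ", "
# (assumption based on the example message above).
PACKED_KEYS = {"MSG-ID", "PLATFORM", "SOURCE"}
KEY_RE = re.compile(r"^([A-Z][A-Z-]*): (.*)$")


def parse_fault_message(text):
    """Return a dict of the fields found in an FMA fault message."""
    fields = {}
    current = None  # key whose value may continue on the next line
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = KEY_RE.match(line)
        if match and match.group(1) in PACKED_KEYS:
            # Split "MSG-ID: X, TYPE: Y, ..." into separate fields.
            for part in line.split(", "):
                key, _, value = part.partition(": ")
                fields[key] = value
            current = None
        elif match:
            current = match.group(1)
            fields[current] = match.group(2)
        elif current is not None:
            # Wrapped free text belongs to the most recent key.
            fields[current] += " " + line
    return fields
```

The EVENT-ID value returned by this sketch is the identifier to pass to fmdump -v -u in the next step.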

2. Collect the "fmdump -v -u" output relating to the fault event.
Example (uses the same event as the example in Step 1; the event ID is the UUID passed to the -u option):
xscf> fmdump -v -u <UUID>
TIME UUID MSG-ID
Mar 20 21:23:54.0192 <UUID> SCF-8001-4X
100% fault.chassis.SPARC-Enterprise.memory.bank.err
Problem in: hc:///chassis=0/cmu=0/mem=0
Affects: hc:///chassis=0/cmu=0/mem=0
FRU: hc://:product-id=SPARC-Enterprise:chassis-id=<SERIAL_NUMBER>:
server-id=san-ff2-21-0:serial=04126711:
part=72T128000HR3.7A:revision=252b/component=/MBU_B/MEMB#0/MEM#3A
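
The FRU entry in this output carries the details most useful for a Service Request: the serial number, part number, revision, and component location. The following is a minimal Python sketch for extracting them from saved output, not an Oracle-supplied tool. The function name parse_fru is hypothetical. The sketch assumes the FRU entry may wrap over several lines, as in the example above, and that the entry ends at a blank line or at the end of the output.

```python
import re


def parse_fru(fmdump_output):
    """Return the fields of the first FRU entry in 'fmdump -v -u' output."""
    lines = [ln.strip() for ln in fmdump_output.splitlines()]
    start = next((i for i, ln in enumerate(lines) if ln.startswith("FRU:")), None)
    if start is None:
        raise ValueError("no FRU entry found")
    # Join the FRU line with its wrapped continuation lines
    # (assumption: the entry runs until a blank line or the end).
    pieces = []
    for ln in lines[start:]:
        if not ln:
            break
        pieces.append(ln)
    fmri = "".join(pieces)[len("FRU:"):].strip()
    # Everything before "/component=" is a list of name=value pairs
    # separated by ":"; everything after it is the component location.
    authority, _, component = fmri.partition("/component=")
    fields = dict(re.findall(r"([a-z-]+)=([^:/]+)", authority))
    fields["component"] = component
    return fields
```

For the example above, this sketch would report the component location /MBU_B/MEMB#0/MEM#3A together with the serial, part, and revision values shown in the FRU entry.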

3. Collect the fault information to prepare to log a Service Request:

The "FMA Fault Message" (from step 1).
The "fmdump" output (from step 2).
Whether the first FRU in the fmdump output has recently been serviced, replaced, or has logged errors.
A mention that this document was referenced in opening the request.
Your contact information, so the Oracle Support Services engineer can contact you to schedule the service.
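
The items listed above can be gathered into one block of text before the Service Request is opened. The following is a minimal Python sketch of that idea. The class ServiceRequestNotes, the function format_notes, and the layout of the resulting text are all hypothetical; Oracle does not prescribe a format for this information.

```python
from dataclasses import dataclass


@dataclass
class ServiceRequestNotes:
    fault_message: str           # FMA Fault Message from step 1
    fmdump_output: str           # "fmdump -v -u" output from step 2
    fru_recently_serviced: bool  # first FRU recently serviced, replaced, or errored
    contact: str                 # how the support engineer can reach you
    doc_id: str = "1002526.1"    # this document


def format_notes(notes):
    """Return the collected items as plain text for the Service Request."""
    history = "yes" if notes.fru_recently_serviced else "no"
    return "\n".join([
        "FMA Fault Message:",
        notes.fault_message.strip(),
        "",
        "fmdump -v -u output:",
        notes.fmdump_output.strip(),
        "",
        f"First FRU recently serviced, replaced, or errored: {history}",
        f"Document referenced: Doc ID {notes.doc_id}",
        f"Contact: {notes.contact}",
    ])
```

Keeping the raw command output unedited in these notes lets the support engineer read the same text that the XSCF produced.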

4. Contact Oracle Support Services or your local service representative and open a "Service Request".

5. Review FRU Replacement Methods information to prepare your configuration for the FRU replacement.

Components can be replaced using three different FRU Replacement Methods depending on which platform is involved, the specific FRU in question, and whether it is
redundantly configured.
Review Document 1003993.1, Sun SPARC Enterprise M3000/M4000/M5000/M8000/M9000: Field Replaceable Unit (FRU) Replacement Methods, to become
familiar with these replacement methods and prepare for the service.

6. An Oracle Support Services engineer may need additional data to be collected. If so, they will specify the data to collect.

Please assist in capturing requested data so Oracle can resolve your issue with as little delay as possible. The most likely data requested will be:

XSCF Snapshot data.

Domain Explorer

Reference Document: 1533993.1 Collect XSCF snapshot(s) by running STB7.3 (or newer) domain Explorer on SPARC Enterprise M3000/M4000/M5000/M8000/M9000
(OPL) Servers

FMA data is maintained on the XSCFU to provide error history and to enhance troubleshooting of current issues. There are no regular mode utilities provided to
clear the information.

REFERENCES

NOTE:1533993.1 - Collect XSCF snapshot(s) by Running STB7.3 (or Newer) Domain Explorer on SPARC Enterprise M3000/M4000/M5000/M8000/M9000 (OPL) Servers
NOTE:1332409.1 - How to Repair FMA Module Errors Seen in 'fmadm faulty'
NOTE:1008229.1 - Gathering Diagnostic Data for SPARC Enterprise M3000/M4000/M5000/M8000/M9000 (OPL) Servers
NOTE:1012954.1 - Sun SPARC Enterprise[TM] M3000/M4000/M5000/M8000/M9000: Information & Troubleshooting FMSP Faults
NOTE:1007101.1 - Sun SPARC(R) Enterprise M3000/M4000/M5000/M8000/M9000 (OPL) Servers: Fault Clearing and LEDs Behavior
NOTE:1003993.1 - Sun SPARC Enterprise[TM] M3000/M4000/M5000/M8000/M9000: Field Replaceable Unit (FRU) and Customer Replaceable Unit (CRU) Replacement Methods
