
Troubleshooting RAID Multiple Drive Failures

When addressing a multiple drive failure, several key pieces of information need to be determined before performing any state modifications.
RAID Level
o Is it a RAID 6?
RAID 6 volume group failures occur after three drives have failed in the volume group.
o Is it a RAID 3/5 or RAID 1?
RAID 3/5 volume group failures occur after two drives have failed in the volume group.
o RAID 1 volume group failures occur when enough drives fail to cause an incomplete mirror.
This could be as few as two drives (both drives of one mirrored pair) or as many as half the drives + 1.
o RAID 0 volume groups are dead upon the first drive failure.
Despite the drive failures, is each individual volume group's configuration complete?
i.e., are all drives accounted for, whether failed or optimal?
How many drives have failed, and which volume group does each drive belong to?
In what order did the drives fail in each individual volume group?
Are there any global hot spares?
o Are any of the hot spares in use?
o Are there any hot spares not in use and, if so, are they in an optimal condition?
Are there any backend errors that led to the initial drive failures?
o This is the most common cause of multiple drive failures. All backend issues must be fixed or isolated before continuing any further.
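
The questions about which volume group each failed drive belongs to, and in what order the drives failed, amount to grouping the failure events by volume group and sorting them by time. A minimal Python sketch, assuming the events have already been transcribed by hand from the event log; the record fields (`volume_group`, `drive`, `failed_at`) and all sample values are hypothetical and do not reflect any real array log format:

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical failure records transcribed from an event log.
# Field names and values are illustrative assumptions only.
failures = [
    {"volume_group": "VG1", "drive": "t1s4", "failed_at": "2010-03-02 11:15:00"},
    {"volume_group": "VG1", "drive": "t1s2", "failed_at": "2010-03-02 09:40:00"},
    {"volume_group": "VG2", "drive": "t2s7", "failed_at": "2010-03-02 10:05:00"},
]

def failure_order(records):
    """Return {volume_group: [drive, ...]} with drives in the order they failed."""
    by_group = defaultdict(list)
    for rec in records:
        when = datetime.strptime(rec["failed_at"], "%Y-%m-%d %H:%M:%S")
        by_group[rec["volume_group"]].append((when, rec["drive"]))
    # Sorting the (timestamp, drive) tuples orders each group by failure time.
    return {vg: [drive for _, drive in sorted(events)]
            for vg, events in by_group.items()}

order = failure_order(failures)
for vg, drives in order.items():
    print(vg, "failed in order:", ", ".join(drives))
```

Keeping the per-group order explicit matters because the consistency statements in this document are all made relative to a specific failure (the second or the third) within a volume group.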
Multiple Drive Failures

Why RAID Level is Important

RAID 6 Volume Groups


o RAID 6 volume groups can survive two drive failures due to the P+Q redundancy model. After the third drive failure, the volume group is marked as failed.
o Up until the third drive failure, data in the stripe is consistent across the drives.
RAID 5 and RAID 3 Volume Groups
o After the second drive failure, the volume group and its associated volumes are marked as failed, and no I/Os have been accepted since the second drive failed.
o Up until the second drive failure, data in the stripe is consistent across the drives.
RAID 1 Volume Groups
o RAID 1 volume groups can survive multiple drive failures as long as one side of each mirror is still optimal.
o RAID 1 volume groups can fail after only two drive failures if both the data drive and its mirror drive fail.
o Until the mirror becomes incomplete, the RAID 1 pairs will function normally.
RAID 0
o As there is no redundancy, these arrays generally cannot be recovered. However, the drives can be revived and checked; no guarantees can be made that the data will be recovered.
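
The failure thresholds described above can be summarized in a short function. This is a sketch of the rules as stated in this document, not the controller firmware's actual logic; the function name, the drive labels, and the `mirror_pairs` parameter are all hypothetical:

```python
def volume_group_failed(raid_level, failed_drives, mirror_pairs=None):
    """Return True if the volume group would be marked failed.

    raid_level:    0, 1, 3, 5 or 6
    failed_drives: iterable of failed drive labels (hypothetical identifiers)
    mirror_pairs:  RAID 1 only, list of (data_drive, mirror_drive) tuples
    """
    failed = set(failed_drives)
    if raid_level == 1:
        # RAID 1 fails only when both halves of some mirror pair have failed.
        if mirror_pairs is None:
            raise ValueError("mirror_pairs is required for RAID 1")
        return any(a in failed and b in failed for a, b in mirror_pairs)
    # Number of drive failures each level tolerates, per the text above.
    tolerated = {0: 0, 3: 1, 5: 1, 6: 2}
    if raid_level not in tolerated:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    return len(failed) > tolerated[raid_level]

pairs = [("d1", "m1"), ("d2", "m2")]
print(volume_group_failed(6, {"a", "b"}))             # two failures on RAID 6
print(volume_group_failed(5, {"a", "b"}))             # two failures on RAID 5
print(volume_group_failed(1, {"d1", "m2"}, pairs))    # one drive from each pair
print(volume_group_failed(1, {"d1", "m1"}, pairs))    # both drives of one pair
```

The RAID 1 case is handled separately because the number of failed drives alone does not decide the outcome; what matters is whether any single mirrored pair has lost both drives.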

Configuration Considerations

Although there are several mechanisms to ensure configuration integrity, there are failure scenarios that may result in configuration corruption.
If the failed volume group's configuration is incomplete, reviving and reconstructing drives could permanently corrupt user data.
If any of the drives have an "offline" status (06.xx), reviving drives could revert them to an unassigned state.

How can this be avoided?


o Check to see if the customer has an old profile that shows the appropriate configuration for the failed volume group(s).
o If the volume group configuration appears to be incomplete or corrupted, or if there is any doubt, escalate immediately.
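
Comparing an old profile against the current state is essentially a set comparison of drive membership. A minimal Python sketch, assuming the drive lists have been read manually from the old profile and from the current configuration; identifying drives as `(tray, slot)` tuples and the function name are assumptions made for illustration:

```python
def compare_to_profile(profile_drives, present_drives):
    """Compare the drives listed in an old profile with those currently present.

    Both arguments are iterables of hypothetical (tray, slot) tuples.
    """
    profile, present = set(profile_drives), set(present_drives)
    return {
        "missing": sorted(profile - present),      # in the profile, not seen now
        "unexpected": sorted(present - profile),   # seen now, not in the profile
        "complete": profile <= present,            # every profiled drive accounted for
    }

# Illustrative values only: one drive from the old profile is unaccounted for.
old_profile = [(1, 1), (1, 2), (1, 3), (1, 4)]
current = [(1, 1), (1, 2), (1, 4)]

result = compare_to_profile(old_profile, current)
print(result)
if not result["complete"]:
    print("Configuration incomplete: do not revive drives; escalate.")
```

A drive counts as accounted for here whether it is failed or optimal; the check only asks whether it is still a visible member of the volume group.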
