Red Hat Customer Portal (https://access.redhat.com/)
Environment
VMware ESX/ESXi hypervisor environment
Red Hat Enterprise Linux (RHEL) 4, 5, 6, and 7 virtual guests in a VMware environment
Issue
File systems became read-only on virtual guests on VMware.
The read-only filesystem was caused by temporary disconnections from SAN disks, after which the VM crashed. How can we avoid it?
A read-only filesystem is occurring fairly regularly.
The root file system went to read-only mode in a RHEL 4 VM running on VMware and the server panicked. We need to reboot the server to recover from the situation; how can we avoid this behaviour?
The filesystem went to read-only mode, and we had the following SCSI I/O error events logged in the syslog messages:
Red Hat Enterprise Linux VMware ESX guests located on SAN going read-only.
Resolution
Increase the SCSI timeout of each disk presented from VMware.
Create a rule in /etc/rc.d/rc.local that changes the timeout value of all system disks
when the system boots:
# cat /etc/rc.d/rc.local
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.
touch /var/lock/subsys/local
# Increase SCSI timeout on all SCSI disks attached to the system at boot time.
# A glob cannot be used as a redirection target, so loop over each disk:
for disk in /sys/block/sd*/device/timeout; do
    echo 180 > "$disk"
done
Create your own udev rule that changes the timeout whenever a disk is attached to the system.
This includes during boot and while the system is running.
# cat /etc/udev/rules.d/99-scsi-timeout.rules
KERNEL=="sd*", SYSFS{vendor}=="VMware", SYSFS{model}=="Virtual Disk", NAME="%k",
PROGRAM="/bin/sh -c 'echo 180 > /sys/block/%k/device/timeout'"
Note that udev match keys take ==; a single = is an assignment, so the rule as originally written would never match.
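To confirm that the setting took effect, the current timeout of every SCSI disk can be listed. A minimal sketch (device names vary by system):

```shell
#!/bin/sh
# Print the current I/O timeout (in seconds) for every SCSI disk.
for t in /sys/block/sd*/device/timeout; do
    # Skip the literal glob if no sd* devices exist on this system.
    [ -e "$t" ] || continue
    printf '%s: %s\n' "$t" "$(cat "$t")"
done
```

After the rc.local change or the udev rule has run, each line should report 180.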
Another option is to use device-mapper multipath (even with just one path) with the
queue_if_no_path option set. This causes device-mapper to retry failed I/Os
indefinitely until they succeed, preventing errors from propagating back to the filesystem.
The caveat with this approach is that if the storage never comes back, I/O
requests remain trapped in the multipath layer and the filesystem cannot be unmounted
without rebooting the OS. It is also possible that more data will be lost, because more data
may be queued before any application becomes aware of a problem.
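A minimal sketch of such a configuration in /etc/multipath.conf (illustrative only; real configurations also need device and blacklist sections matching your storage):

```
defaults {
    # Queue I/O indefinitely when all paths are lost instead of failing it
    # back to the filesystem.
    features "1 queue_if_no_path"
}
```

On newer multipath versions, `no_path_retry queue` in the appropriate section achieves the same behaviour.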
The 180-second value is a suggested starting point. Larger values may be needed depending on the storage
type available, the number of guests, the priority of the guests, the total I/O burst load across all guests, and
other factors.
The maximum timeout value within the kernel is 0x7FFFFFFF, but the internal kernel variable is in
milliseconds.
0x7FFFFFFF = 2147483647 ms; dividing by 1000 gives 2147483 seconds, or about 25 days.
# echo 2147483 > /sys/block/sdc/device/timeout
# cat /sys/block/sdc/device/timeout
2147483
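The arithmetic above can be checked directly with shell integer arithmetic:

```shell
#!/bin/sh
# Maximum internal timeout in milliseconds:
max_ms=$(( 0x7FFFFFFF ))          # 2147483647
# Convert to seconds (the unit used by /sys/block/*/device/timeout):
max_s=$(( max_ms / 1000 ))        # 2147483
# And to days, to see why nobody would ever need a value this large:
max_days=$(( max_s / 86400 ))     # 24 (just under 25 days)
echo "$max_ms ms = $max_s s = about $max_days days"
```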
You would never want to set it that high, of course, but it is not unusual in VMware environments for the
timeout to be set to 180, 360, 600, or even slightly higher. Again, which value is needed depends
upon the factors cited: storage type, number of guests, I/O load, guest priority, etc.
While a process is waiting for I/O to complete it remains in 'D' (uninterruptible sleep) state. If a process
stays in D state for more than hung_task_timeout_secs seconds, a warning message and kernel
stack dump are output to the messages log. Typically, with the default I/O timeout of 60 seconds,
hung_task_timeout_secs is set to 120 seconds. If the I/O timeout value is increased, then
hung_task_timeout_secs should also be increased, to prevent hung-task output while the I/O
timeout has not yet expired. Setting hung_task_timeout_secs to at least the I/O timeout plus 60
seconds, and up to twice the timeout value, is suggested.
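For example, with the 180-second SCSI timeout used above, the suggested range works out as follows (a sketch; the write into /proc requires root and a kernel built with hung-task detection):

```shell
#!/bin/sh
timeout=180                       # SCSI timeout set on the disks
hung_min=$(( timeout + 60 ))      # suggested minimum: 240
hung_max=$(( timeout * 2 ))       # suggested maximum: 360
echo "hung_task_timeout_secs between $hung_min and $hung_max"
# Apply at runtime (as root):
#   echo "$hung_min" > /proc/sys/kernel/hung_task_timeout_secs
# Or persist across reboots via /etc/sysctl.conf:
#   kernel.hung_task_timeout_secs = 240
```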
Root Cause
When a storage failover occurs at the VMware hypervisor level, the failover can sometimes take longer
than the SCSI timeout to complete.
If the SCSI timeout expires before the storage failover on the hypervisor has completed, the
SCSI layer will abort the write, which might make the filesystem go read-only.
Increasing the timeout allows more time for the failover to occur; if the failover completes
before the timeout, everything continues as normal.
Diagnostic Steps
Here we can see an example of a timeout causing a filesystem to go read-only.
kernel: sd 0:0:0:0: timing out command, waited 1080s
kernel: sd 0:0:0:0: Unhandled error code
kernel: sd 0:0:0:0: SCSI error: return code = 0x06000008
kernel: Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT,SUGGEST_OK
Here the driver informs the SCSI layer that the device is okay and this is just a genuine timeout, so there is no need
to reset the controller.
There are some errors, and then corruption in the free-space bitmap is detected for block 89637.
kernel: EXT3-fs error (device dm-3): ext3_free_blocks_sb: bit already cleared for
block 89637
kernel: Aborting journal on device dm-3.
kernel: EXT3-fs error (device dm-3) in ext3_reserve_inode_write: Journal has
aborted
kernel: EXT3-fs error (device dm-3) in ext3_truncate: Journal has aborted
kernel: EXT3-fs error (device dm-3) in ext3_reserve_inode_write: Journal has
aborted
kernel: EXT3-fs error (device dm-3) in ext3_orphan_del: Journal has aborted
kernel: EXT3-fs error (device dm-3) in ext3_reserve_inode_write: Journal has
aborted
kernel: EXT3-fs error (device dm-3) in ext3_delete_inode: Journal has aborted
kernel: ext3_abort called.
kernel: EXT3-fs error (device dm-3): ext3_journal_start_sb: Detected aborted
journal
kernel: Remounting filesystem read-only
7 Comments

jon_a_ (29 May 2014):
Is there any way to recover the filesystem once it has gone read-only without rebooting the guest? (Assuming the underlying issues have been resolved.)

Harold Miller (2 June 2014):
Jon, https://access.redhat.com/site/solutions/144003 discusses one way, remounting the file system R/W.

Reply:
There were a couple of VMs that had to fsck the root partition after reboot. I'm assuming those would require a reboot anyway, but the VMs that didn't fsck root on reboot I would think could be recovered?