Okay...
I just had an error on a production fileserver. If I'm reading
the messages correctly, there was an I/O error on the log
device, which caused XFS to shut down. The I/O error appears
to have been the result of XFS sending multiple requests for
the same block to the RAID subsystem (whatever that means).
I killed all daemons accessing the device, unmounted the array,
ran xfs_repair, and put the array back into service.
Did I do the right thing?
Is my diagnosis correct?
Here's the relevant entry from my /etc/fstab:
/dev/md2 /n/bubba1 xfs rw,defaults,logbufs=4,logdev=/dev/md3 0 0
...and here are the error messages:
Oct 31 20:29:15 bubba kernel: raid5: multiple 1 requests for sector 65277048
Oct 31 21:24:00 bubba kernel: I/O error in filesystem ("md(9,2)") meta-data dev 0x903 block 0x17f07
Oct 31 21:24:00 bubba kernel: ("<NULL>") error -1070893103 buf count 5
Oct 31 21:24:00 bubba kernel: xfs_force_shutdown(md(9,2),0x2) called from line 940 of file xfs_log.c. Return address = 0xc01c329c
Oct 31 21:24:00 bubba kernel: Log I/O Error Detected. Shutting down filesystem: md(9,2)
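As a sanity check on which device "meta-data dev 0x903" refers to, here is a minimal sketch that splits the number into major and minor. It assumes the older 16-bit Linux device-number layout (high byte major, low byte minor); the helper name decode_dev is my own, not anything from XFS or the kernel.

```python
# Hypothetical helper: assumes the legacy 16-bit device-number layout,
# where the high byte is the major number and the low byte is the minor.
def decode_dev(dev):
    return dev >> 8, dev & 0xFF

MD_MAJOR = 9  # md driver major, matching the "md(9,2)" in the log above

major, minor = decode_dev(0x903)
if major == MD_MAJOR:
    print(f"/dev/md{minor}")
else:
    print(f"major {major}, minor {minor}")
```

If that assumption holds, 0x903 is major 9, minor 3, i.e. /dev/md3, which is the logdev in my fstab entry rather than the data device /dev/md2.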
Hmmm.... I notice there's a long delay (about 55 minutes)
between the "multiple 1 requests..." message and the filesystem
shutdown. Perhaps they have nothing to do with each other - and
yet that's the only "multiple 1 requests..." error in the past
week of logs.
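To put a number on that delay, here is a small sketch that subtracts the two syslog timestamps. Syslog lines carry no year, so the year below is a placeholder of my own (both messages are on the same day, so it does not affect the result), and the month abbreviation is assumed to parse under an English/C locale.

```python
from datetime import datetime

ASSUMED_YEAR = 2000  # placeholder only: syslog timestamps omit the year
FMT = "%Y %b %d %H:%M:%S"

def parse_syslog_time(stamp):
    """Parse a syslog-style 'Oct 31 20:29:15' stamp using the placeholder year."""
    return datetime.strptime(f"{ASSUMED_YEAR} {stamp}", FMT)

raid_warning = parse_syslog_time("Oct 31 20:29:15")  # raid5: multiple 1 requests
xfs_shutdown = parse_syslog_time("Oct 31 21:24:00")  # Log I/O Error Detected

gap = xfs_shutdown - raid_warning
print(gap)  # 0:54:45
```

That is 54 minutes 45 seconds between the raid5 message and the shutdown.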
Andrew Klaassen