On Wed, Dec 11, 2013 at 12:27:25PM -0500, Dave Jones wrote:
> Powered up my desktop this morning and noticed I couldn't cd into ~/Mail
> dmesg didn't look good. "XFS: Internal error XFS_WANT_CORRUPTED_RETURN"
They came from xfs_dir3_block_verify() on read IO completion, which
indicates that the corruption was on disk and in the directory
structure. Yeah, definitely a verifier error:
XFS (sda3): metadata I/O error: block 0x2e790 ("xfs_trans_read_buf_map") error
117 numblks 8
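
For anyone reading that message for the first time, here is a rough sketch of how it can be picked apart. It assumes that the block address and numblks are both counted in 512-byte basic blocks, and that error 117 is EUCLEAN on Linux, which XFS uses to report corruption. The helper name decode_xfs_io_error is made up for illustration and is not part of any XFS tool.

```python
import re

# Assumption: XFS reports disk addresses and lengths in 512-byte basic blocks.
BASIC_BLOCK_BYTES = 512

# Matches the "metadata I/O error" line once the wrapped halves are joined.
PATTERN = re.compile(
    r'block (0x[0-9a-fA-F]+) \("([^"]+)"\) error (\d+) numblks (\d+)'
)

def decode_xfs_io_error(line):
    """Hypothetical helper: pull the fields out of one dmesg line."""
    m = PATTERN.search(line)
    if m is None:
        return None
    daddr = int(m.group(1), 16)
    numblks = int(m.group(4))
    return {
        "caller": m.group(2),
        "errno": int(m.group(3)),  # 117 is EUCLEAN on Linux
        "byte_offset": daddr * BASIC_BLOCK_BYTES,
        "byte_length": numblks * BASIC_BLOCK_BYTES,
    }

line = ('XFS (sda3): metadata I/O error: block 0x2e790 '
        '("xfs_trans_read_buf_map") error 117 numblks 8')
info = decode_xfs_io_error(line)
print(info)
```

Under those assumptions the failing buffer is a single 4096-byte block, which is consistent with one directory block failing verification.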
Are you running a CRC enabled filesystem? (i.e. mkfs.xfs -m crc=1)
Is there any evidence that this verifier has fired in the past on
write? If not, then there's a good chance that it's a media error
causing this, because the same verifier runs when the metadata is
written to ensure we are not writing bad stuff to disk.
> I rebooted into single user mode, and ran xfs_repair on /dev/sda3 (/home).
> It fixed up a bunch of stuff, but ended up eating ~/.procmailrc entirely
> (no sign of it in lost & found), and a bunch of filenames got garbled
> 'december' became 'decemcer' for eg. Looks like a couple kernel trees ended
> up in lost & found.
Single bit errors in directory names? That really does point towards
media errors, not a filesystem error being the cause.
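
To make the single bit claim concrete, here is a minimal sketch that compares the two names byte by byte and counts how many bits differ. The function name bit_differences is purely illustrative, and it assumes both names are the same length and plain ASCII.

```python
def bit_differences(a, b):
    """Return (index, char_a, char_b, flipped_bits) for each differing byte."""
    ea, eb = a.encode("ascii"), b.encode("ascii")
    if len(ea) != len(eb):
        raise ValueError("names must be the same length to compare")
    return [
        (i, chr(x), chr(y), bin(x ^ y).count("1"))
        for i, (x, y) in enumerate(zip(ea, eb))
        if x != y
    ]

diffs = bit_differences("december", "decemcer")
print(diffs)  # one entry: 'b' (0x62) vs 'c' (0x63), which differ in one bit
```

'b' is 0x62 and 'c' is 0x63, so the garbled name is exactly one flipped bit away from the original, which is what you'd expect from a media error rather than a filesystem logic bug.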
> After rebooting back into multi-user mode, I looked in dmesg again to be sure
> and this time sda2 was complaining..
Exactly the same - directory blocks failing read verification.
> Same drill, reboot, xfs_repair. Looks like a bunch of man pages ended up in
> lost & found.
> Thoughts ? Could sda be dying ? (It is a fairly old crappy ssd)
I'd seriously be considering replacing the SSD as the first step.
If you then see failures on a known good drive, we'll need to dig