On 7/17/15 9:46 PM, Rhorer, Leslie wrote:
> I have a 24T XFS file system that is very sick, and seemingly getting
> sicker. I believe it to be the file system itself. I have replaced
> the RAID chassis, the OS, the cables, the drive controller, and most
> of the drives. Re-syncing the RAID array completes in a reasonable
> time, given the size of the array, and reports no mismatches.
> Xfs_repair completes, usually with no errors found, or sometimes one
> or two errors. Some commands, like a df, are now hanging. Writes are
> often failing with I/O errors. I haven't found any obvious file
> corruption, but computing checksums with md5sum, md6sum, sha256sum,
> etc., produces different values on every run for many large files.
> What can I do to try to rectify this?
If writes fail with I/O errors, that should show up in dmesg, but I don't
see any such messages.
What did repair find?
Not a lot to go on from the above narrative, I'm afraid. What large
files are those? I presume that you are sure they should not be changing?
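One way to separate "the file is being modified" from "the same bytes read back differently" is to hash the file several times and record its size and mtime around each pass. The following is only a rough sketch in Python, not something from the original report: the function names (sha256_of, check_stability, summarize) and the choice of three passes are illustrative assumptions. Note also that repeated reads may be served from the page cache rather than from disk, so identical digests here do not by themselves clear the storage.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def check_stability(path, passes=3):
    """Hash the file `passes` times, noting size/mtime before and after each pass.

    Returns a list of (digest, size, mtime_ns, modified_during_pass) tuples.
    """
    results = []
    for _ in range(passes):
        before = os.stat(path)
        digest = sha256_of(path)
        after = os.stat(path)
        modified = (before.st_size, before.st_mtime_ns) != (
            after.st_size,
            after.st_mtime_ns,
        )
        results.append((digest, after.st_size, after.st_mtime_ns, modified))
    return results


def summarize(results):
    """Report whether digests disagree and whether size/mtime ever changed."""
    digests = {r[0] for r in results}
    stats = {(r[1], r[2]) for r in results}
    return {
        "digests_differ": len(digests) > 1,
        "file_modified": any(r[3] for r in results) or len(stats) > 1,
    }
```

Calling summarize(check_stability(path)) on one of the large files gives a quick hint: if digests_differ is true while file_modified is false, nothing visibly wrote to the file between passes, which points away from a legitimate writer and toward the read path (network, storage, or memory).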
Thanks for all the info below...
From the dmesg, every stuck process is stuck on nfs - doesn't look xfs
related at all.
Doesn't seem like an xfs problem, TBH, but maybe you can provide xfs_repair
output and/or dmesg when writes fail, that might offer a clue.
-Eric