On Wed, Aug 13, 2014 at 11:42:26AM +0200, Richard Neuboeck wrote:
> for some time now our storage machine using XFS stops the file
> system due to some reason I don't seem to have found so far. In this
> process the file system gets corrupted and the attached trace log is
What's the workload the VM runs?
> After xfs_repair is run it's running again for an always
> changing amount of time.
What errors does xfs_repair correct? Can you post the output of a
repair run that corrects the issue?
> In general it fails within a few hours or
> days. There are no relevant log messages before the entries shown
> below and no immediate actions that lead to this condition. So far
> my experiments (Ubuntu upgrade from 10.04 to 14.04, different kernel
> versions, changes to the hypervisor) didn't show any lasting effects
> (positive or negative). If any one could shed some light on what XFS
> is trying to tell me it would be highly appreciated.
The directory code is trying to read a block of data that does not
contain directory data, i.e. the directory has somehow been
corrupted. The block contains file data, but that's about all
I can tell you right now.
> I've found the mention of 'xfs_dir3_data_reada_verify' in the
> mailing list but didn't find a solution that was applicable.
It's just checking the block read from disk.
However, that's not the only error that is occurring:
> [ 5247.327164] XFS (vdb): metadata I/O error: block 0x160003e488
> ("xfs_trans_read_buf_map") error 117 numblks 8
> [ 5252.482540] XFS: Internal error XFS_WANT_CORRUPTED_GOTO at line 1602 of
> file /build/buildd/linux-3.13.0/fs/xfs/xfs_alloc.c. Caller 0xffffffffa0088485
There are corrupted free space btrees. In this case, the by-bno tree
has been found to be inconsistent. So there's something corrupting
more than just the directory.
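
For reference, a rough sketch of how the first message above can be
picked apart. It assumes the block number and numblks are counted in
512-byte basic blocks and that 117 is the Linux value of EUCLEAN, which
XFS uses as EFSCORRUPTED. The function and field names are made up for
illustration and are not part of any XFS tool.

```python
import re

# Linux errno value; XFS defines EFSCORRUPTED as EUCLEAN (assumption: Linux guest).
ERRNO_NAMES = {117: "EUCLEAN (EFSCORRUPTED, 'Structure needs cleaning')"}

# Assumption: disk addresses in these messages count 512-byte basic blocks.
BASIC_BLOCK_BYTES = 512

LINE_RE = re.compile(
    r"XFS \((?P<dev>[^)]+)\): metadata I/O error: "
    r"block (?P<block>0x[0-9a-fA-F]+)\s+"
    r'\("(?P<func>[^"]+)"\) error (?P<err>\d+) numblks (?P<numblks>\d+)'
)


def parse_metadata_io_error(line):
    """Return the fields of a 'metadata I/O error' line, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    block = int(m.group("block"), 16)
    err = int(m.group("err"))
    numblks = int(m.group("numblks"))
    return {
        "device": m.group("dev"),
        "function": m.group("func"),
        "block": block,
        # Byte offset of the failing buffer on the block device.
        "byte_offset": block * BASIC_BLOCK_BYTES,
        # Size of the buffer that failed verification.
        "length_bytes": numblks * BASIC_BLOCK_BYTES,
        "errno": err,
        "errno_name": ERRNO_NAMES.get(err, "unknown"),
    }


if __name__ == "__main__":
    # The wrapped log line from the report, joined back into one line.
    line = ('[ 5247.327164] XFS (vdb): metadata I/O error: block 0x160003e488 '
            '("xfs_trans_read_buf_map") error 117 numblks 8')
    print(parse_metadata_io_error(line))
```

Under those assumptions the failing buffer is 8 basic blocks, i.e. a
single 4k metadata block, and the error is a corruption report from the
read verifier rather than an I/O failure from the device.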
So, more information is needed. Let's start with:
and the output of xfs_repair. Also, a metadump image of the
filesystem before you run repair would be helpful. And finally, the
configuration of the block devices the VM is using (i.e. virtio,
cache=?, etc). Describing the physical storage the VM is using might
also be helpful - it could be host based corruption, not guest based
corruption that is occurring...
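
A minimal sketch of the order in which the metadump and repair output
could be collected, assuming the filesystem is unmounted first. The
device /dev/vdb is a guess based on the "XFS (vdb)" prefix in the log,
and the output path and the helper name are hypothetical. The script
only prints the commands; it does not run them.

```python
import shlex


def collection_commands(device, metadump_path):
    """Return the commands to run, in order, as argv lists.

    Assumes the filesystem on `device` is unmounted.
    """
    return [
        # Metadata-only image, taken BEFORE anything modifies the filesystem.
        ["xfs_metadump", device, metadump_path],
        # -n: no-modify mode, only reports what repair would change.
        ["xfs_repair", "-n", device],
        # The actual repair; its output is what should be posted.
        ["xfs_repair", device],
    ]


if __name__ == "__main__":
    # Hypothetical device and output path, for illustration only.
    for argv in collection_commands("/dev/vdb", "/tmp/vdb.metadump"):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

The point of the ordering is that the metadump captures the corrupted
state; once repair has run, that state is gone.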