Re: File System Corruption - Internal error xfs_dir3_data_reada_verify

To: Richard Neuboeck <hawk@xxxxxxxxxxxxxxxx>
Subject: Re: File System Corruption - Internal error xfs_dir3_data_reada_verify
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Wed, 13 Aug 2014 20:42:55 +1000
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <53EB3302.1090000@xxxxxxxxxxxxxxxx>
References: <53EB3302.1090000@xxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, Aug 13, 2014 at 11:42:26AM +0200, Richard Neuboeck wrote:
> Hi,
> 
> for some time now our storage machine using XFS stops the file
> system due to some reason I don't seem to have found so far. In this
> process the file system gets corrupted and the attached trace log is
> shown.

What's the workload the VM runs?

> After xfs_repair is run it's running again for an always
> changing amount of time. 

What errors does xfs_repair correct? Can you post the output of a
repair run that corrects the issue?

> In general it fails within a few hours or
> days. There are no relevant log messages before the entries shown
> below and no immediate actions that lead to this condition. So far
> my experiments (Ubuntu upgrade from 10.04 to 14.04, different kernel
> versions, changes to the hypervisor) didn't show any lasting effects
> (positive or negative). If any one could shed some light on what XFS
> is trying to tell me it would be highly appreciated.

The directory code is trying to read a block of data that does not
contain directory data, i.e. the directory has somehow been
corrupted. The block contains file data, but that's about all
I can tell you right now.

> I've found the mention of 'xfs_dir3_data_reada_verify' in the
> mailing list but didn't find a solution that was applicable.

It's just checking the block read from disk.

However, that's not the only error that is occurring:

> [ 5247.327164] XFS (vdb): metadata I/O error: block 0x160003e488 
> ("xfs_trans_read_buf_map") error 117 numblks 8
> [ 5252.482540] XFS: Internal error XFS_WANT_CORRUPTED_GOTO at line 1602 of 
> file /build/buildd/linux-3.13.0/fs/xfs/xfs_alloc.c.  Caller 0xffffffffa0088485

There are corrupted free space btrees. In this case, the by-bno tree
has been found to be inconsistent. So there's something corrupting
more than just the directory.
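
For anyone reading along, here is a rough sketch of how the first log line
can be picked apart. It is only an illustration, not a diagnostic tool. On
Linux, errno 117 is EUCLEAN ("Structure needs cleaning"), which XFS uses as
its EFSCORRUPTED code. The sketch assumes that both the block number and
numblks are counted in 512-byte basic blocks; the function name
parse_metadata_io_error and the regular expression are made up for this
example.

```python
import errno
import re

# Log line copied from the report above, with the mail line wrap undone.
LOG_LINE = ('[ 5247.327164] XFS (vdb): metadata I/O error: block 0x160003e488 '
            '("xfs_trans_read_buf_map") error 117 numblks 8')

# Hypothetical pattern for this one message format.
PATTERN = re.compile(
    r'XFS \((?P<dev>[^)]+)\): metadata I/O error: '
    r'block (?P<block>0x[0-9a-fA-F]+)\s+'
    r'\("(?P<func>[^"]+)"\) error (?P<err>\d+) numblks (?P<numblks>\d+)'
)

# Assumption: disk addresses in this message are 512-byte basic blocks.
BASIC_BLOCK_SIZE = 512


def parse_metadata_io_error(line):
    """Return the fields of a 'metadata I/O error' line, or None if no match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    block = int(m.group('block'), 16)
    err = int(m.group('err'))
    numblks = int(m.group('numblks'))
    return {
        'device': m.group('dev'),
        'function': m.group('func'),
        'block': block,
        'byte_offset': block * BASIC_BLOCK_SIZE,
        'length_bytes': numblks * BASIC_BLOCK_SIZE,
        'errno': err,
        # Symbolic name as known to the local platform (EUCLEAN on Linux).
        'errno_name': errno.errorcode.get(err, 'unknown'),
    }


info = parse_metadata_io_error(LOG_LINE)
print(info)
```

Under that assumption, numblks 8 is a single 4096-byte buffer, i.e. one
metadata block failed verification when it was read.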

So, more information is needed. Let's start with:

http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

and the output of xfs_repair. Also, a metadump image of the
filesystem before you run repair would be helpful. And finally, the
configuration of the block devices the VM is using (i.e. virtio,
cache=?, etc). Describing the physical storage the VM is using might
also be helpful - it could be host based corruption, not guest based
corruption that is occurring...
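
As a rough sketch of the order of operations, the steps could be laid out
like this. The script only builds and prints the command lines; it does not
run anything. The device /dev/vdb comes from the log above, while the mount
point, the metadump output path and the helper name build_report_commands
are hypothetical placeholders. The selection of commands is an assumption
about what is useful here, so check it against the FAQ entry. The important
part is that xfs_metadump is taken on the unmounted filesystem before
xfs_repair changes anything.

```python
import shlex


def build_report_commands(device, mountpoint, metadump_path):
    """Return the commands to run, in order. Nothing is executed here."""
    return [
        # General system information.
        ['uname', '-a'],
        ['xfs_repair', '-V'],
        ['cat', '/proc/meminfo', '/proc/mounts', '/proc/partitions'],
        # Filesystem geometry, taken while the filesystem is still mounted.
        ['xfs_info', mountpoint],
        # Kernel log containing the XFS error messages.
        ['dmesg'],
        # Unmount, capture the metadata image, and only then repair.
        ['umount', mountpoint],
        ['xfs_metadump', device, metadump_path],
        ['xfs_repair', device],
    ]


# Hypothetical paths, for illustration only.
commands = build_report_commands('/dev/vdb', '/srv/storage', '/tmp/vdb.metadump')
for cmd in commands:
    print(' '.join(shlex.quote(arg) for arg in cmd))
```

Save the complete output of the xfs_repair run so it can be posted along
with the rest.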

Cheers,

Dave.


-- 
Dave Chinner
david@xxxxxxxxxxxxx