On Fri, Jul 22, 2016 at 06:19:25PM +0000, Stockley, Jonathan wrote:
> Hi,
> I just ran into this error while testing an OpenStack SWIFT deployment.
>
> [130004.933449] XFS (loop1): Metadata corruption detected at
> xfs_attr3_leaf_write_verify+0xe5/0x100 [xfs], block 0x468d0c8
> [130004.936209] XFS (loop1): Unmount and run xfs_repair
> [130004.937477] XFS (loop1): First 64 bytes of corrupted metadata buffer:
> [130004.939113] ffff880111ddd000: 00 00 00 00 00 00 00 00 fb ee 00 00 00 00
> 00 00 ................
> [130004.941242] ffff880111ddd010: 10 00 00 00 00 20 0f e0 00 00 00 00 00 00
> 00 00 ..... ..........
That's an empty attribute leaf block. It probably should be stale
and hence never written to disk. The verifier caught it before it
could be written, so it probably just saved you from on-disk
data/filesystem corruption.
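For anyone wanting to check that reading of the dump, here is a rough
sketch that decodes the first 32 bytes quoted above. The field layout is
an assumption based on my reading of the non-CRC (v4) attribute leaf
header in the kernel's xfs_da_format.h; the function name
decode_attr_leaf_hdr is hypothetical and only for illustration.

```python
import struct

# First 32 bytes of the buffer, copied from the two dmesg hex lines above.
RAW = bytes.fromhex(
    "00 00 00 00 00 00 00 00 fb ee 00 00 00 00 00 00"
    " 10 00 00 00 00 20 0f e0 00 00 00 00 00 00 00 00"
)

# Assumed magic number for a v4 (non-CRC) attribute leaf block.
XFS_ATTR_LEAF_MAGIC = 0xFBEE


def decode_attr_leaf_hdr(buf):
    """Decode an assumed v4 attr leaf header (all fields big-endian)."""
    # da_blkinfo: forw, back, magic, pad; then count, usedbytes,
    # firstused, holes, pad1. That is 20 bytes in total.
    (forw, back, magic, _pad,
     count, usedbytes, firstused, holes, _pad1) = struct.unpack_from(
        ">IIHHHHHBB", buf, 0)
    # Three (base, size) freemap entries follow at offset 20.
    freemap = [struct.unpack_from(">HH", buf, 20 + 4 * i) for i in range(3)]
    return {
        "forw": forw,
        "back": back,
        "magic": magic,
        "count": count,
        "usedbytes": usedbytes,
        "firstused": firstused,
        "holes": holes,
        "freemap": freemap,
    }


if __name__ == "__main__":
    hdr = decode_attr_leaf_hdr(RAW)
    for key, value in hdr.items():
        print(key, value)
    if hdr["magic"] == XFS_ATTR_LEAF_MAGIC and hdr["count"] == 0:
        print("attribute leaf block with no entries")
```

Under that assumed layout the count and usedbytes fields are both zero,
which is what "empty" means here.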
> [130004.943327] ffff880111ddd020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 ................
> [130004.945393] ffff880111ddd030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 ................
> [130004.947565] XFS (loop1): xfs_do_force_shutdown(0x8) called from line 1249
> of file
> /build/linux-lts-vivid-vt3Z1H/linux-lts-vivid-3.19.0/fs/xfs/xfs_buf.c.
> Return address = 0xffffffffc0752c92
> [130004.951692] XFS (loop1): Corruption of in-memory data detected. Shutting
> down filesystem
Loop devices are an interesting choice for a production workload
like this. Why?
> Environment information:
> Ubuntu Server 14.04 LTS
> $ uname -a
> Linux 3e2116e0-b4e8-4666-be70-5ddf9c9d9d2b 3.19.0-49-generic
> #55~14.04.1hf1533043v20160201b1-Ubuntu SMP Mon Feb 1 20:41:00 UT x86_64
> x86_64 x86_64 GNU/Linux
A vendor kernel of some kind - have you reported the problem to
Ubuntu, to see if they've already backported a fix?
> I am able to reproduce the problem as follows:
>
> * created a VM based SWIFT cluster
Not something I can do to reproduce here.
....
> In my two test runs the XFS failure occurred around 9 hours after the test
> was started.
>
> It looks like I can reproduce the problem, albeit over an extended period of
> time.
> What can I do to gather more info? Any debug options I can enable that might
> help?
First of all, add all the stuff missing from here:
http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
You could also probably run an XFS build that contains all the debug
warnings (CONFIG_XFS_WARN=y) and see if that triggers something.
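To see whether the kernel you are running already has that option set,
something like the following sketch may help. It assumes the distro
installs the build config as /boot/config-<release>, which is the usual
Ubuntu convention but not guaranteed; the helper name xfs_options is
hypothetical.

```python
import platform
from pathlib import Path


def xfs_options(config_text):
    """Return a dict of CONFIG_XFS* options found in kernel config text."""
    unset_suffix = " is not set"
    opts = {}
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_XFS") and "=" in line:
            name, _, value = line.partition("=")
            opts[name] = value
        elif line.startswith("# CONFIG_XFS") and line.endswith(unset_suffix):
            # Disabled options appear as "# CONFIG_FOO is not set".
            opts[line[2:-len(unset_suffix)]] = "n"
    return opts


if __name__ == "__main__":
    # Assumed location of the running kernel's build config.
    path = Path("/boot/config-" + platform.release())
    if path.exists():
        opts = xfs_options(path.read_text())
        print("CONFIG_XFS_WARN =", opts.get("CONFIG_XFS_WARN", "absent"))
    else:
        print("no config found at", path)
```

If CONFIG_XFS_WARN shows up as "n" or absent, the kernel would need to
be rebuilt with it enabled.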
You could also try a more recent kernel (e.g. 4.6) and see if that
has the same problem.
If it still occurs, then you are probably going to need to narrow
this down to a much simpler and more targeted reproducer for
Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx