Re: PROBLEM: XFS on ARM corruption 'Structure needs cleaning'

To: Eric Sandeen <sandeen@xxxxxxxxxxx>
Subject: Re: PROBLEM: XFS on ARM corruption 'Structure needs cleaning'
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Fri, 12 Jun 2015 08:53:08 +1000
Cc: Török Edwin <edwin@xxxxxxxxxxxx>, Brian Foster <bfoster@xxxxxxxxxx>, Christopher Squires <christopher.squires@xxxxxxxx>, Wayne Burri <wayne.burri@xxxxxxxx>, Luca Gibelli <luca@xxxxxxxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <5579EA91.3090707@xxxxxxxxxxx>
References: <5579296A.8010208@xxxxxxxxxxxx> <20150611151620.GB59168@xxxxxxxxxxxxxxx> <5579A904.3020204@xxxxxxxxxxxx> <5579AE85.5080203@xxxxxxxxxxx> <5579B034.4070503@xxxxxxxxxxx> <5579B804.9050707@xxxxxxxxxxxx> <5579EA91.3090707@xxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Thu, Jun 11, 2015 at 03:07:45PM -0500, Eric Sandeen wrote:
> On 6/11/15 11:32 AM, Török Edwin wrote:
> > [4745016.650000] XFS (loop0): metadata I/O error: block 0xa000 
> > ("xfs_trans_read_buf_map") error 117 numblks 8
> 
> ok, block 0xA000 (in sectors) is sector 40960...
> 
> xfs_db> daddr 40960
> xfs_db> fsblock 
> current fsblock is 8192
> xfs_db> type text
> xfs_db> p
> 000:  58 46 53 42 00 00 10 00 00 00 00 00 00 00 28 00  XFSB............
> 010:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 020:  64 23 d2 06 32 2e 4c 20 82 6e f0 36 a7 d9 54 f9  d...2.L..n.6..T.
> 
> ...
> 
> Right, so it's reading the 2nd superblock in xfs_dir3_data_read_verify.  Huh?
> (I could have imagined some weird scenario where we read block 0, but 8192?
> Very strange).
> 
> Hm, I don't think this can be readahead, it'd not get to this verifier AFAICT.
> 
> Given that the image is enough to reproduce via just mount; ls - we should be
> able to reproduce this, given the right hardware, and get to the bottom of it.

OK, so I've had a look at the on-disk layout via xfs_db and there is
no corruption present on disk. I've confirmed that by mounting the
image on x86-64 (on 4.1-rc6), which issued no errors, and xfs_repair
(from the for-next branch) doesn't see anything wrong, either.
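
For anyone following along, the numbers in the quoted xfs_db output
can be cross-checked with a few lines of Python. This is only a
sketch: the 16 bytes are copied from the first line of the dump
above, the field layout (magic, block size, data block count, all
big-endian) is the start of the XFS superblock, and the AG geometry
is an assumption inferred from Eric's "2nd superblock" remark (AG 1
starting at this block), not something read from the image.

```python
import struct

SECTOR_SIZE = 512  # xfs_db "daddr" values are in 512-byte sectors

# First 16 bytes of the block, copied from the quoted xfs_db dump.
raw = bytes.fromhex("58465342000010000000000000002800")

# Start of the XFS superblock, big-endian on disk:
# 4-byte magic, 32-bit block size, 64-bit data block count.
magic, blocksize, dblocks = struct.unpack(">4sIQ", raw)

daddr = 0xa000                      # block number from the kernel error
byte_offset = daddr * SECTOR_SIZE
linear_block = byte_offset // blocksize

# ASSUMPTION: this block is the start of AG 1, so the AG size equals
# linear_block. An fsblock number packs (agno << agblklog) | agbno,
# where agblklog is log2 of the AG size, rounded up.
agno, agbno = 1, 0
agblocks = linear_block
agblklog = (agblocks - 1).bit_length()
fsblock = (agno << agblklog) | agbno

print(magic, blocksize, dblocks)    # b'XFSB' 4096 10240
print(daddr, linear_block, fsblock) # 40960 5120 8192
```

Under that assumption the result agrees with "current fsblock is
8192" in the xfs_db session, i.e. the buffer really is a secondary
superblock and not a directory block.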

So this is looking like either: a) a kernel bug we've subsequently
fixed; or b) an ARM-specific compiler or kernel bug.

I'd like to see a) ruled out as quickly as possible. Török, can you
build a more recent kernel and see if the problem persists?

Can you also let us know what compiler version you are using and how
you are compiling (e.g. cross compile)? ISTR a previous incarnation
of this problem showed up when a specific compiler version was used
to cross-compile the armv7 kernel from x86-64, and it went away with
a compiler update. So it would be good to rule out the
compiler/build environment as the cause, too.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
