Re: Read corruption on ARM

To: Eric Sandeen <sandeen@xxxxxxxxxxx>
Subject: Re: Read corruption on ARM
From: Jason Detring <detringj@xxxxxxxxx>
Date: Wed, 27 Feb 2013 12:15:21 -0600
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <512E3BB2.6060407@xxxxxxxxxxx>
References: <CA+AKrqBQ=VG0oVsai+agywDKRgO9cG9AvT6mCTSZxKO3Si5Aiw@xxxxxxxxxxxxxx> <512D3856.5050305@xxxxxxxxxxx> <CA+AKrqC+6nXuCxdY08MBLsjv1fOPJ6=1ruTHsfGqxosQmCi_jQ@xxxxxxxxxxxxxx> <512D49E2.40003@xxxxxxxxxxx> <CA+AKrqCrphO-eKy0n=70O9hmB3mXttOsKmTdfRnPxgJM3_PAkQ@xxxxxxxxxxxxxx> <512E3BB2.6060407@xxxxxxxxxxx>
On 2/27/13, Eric Sandeen <sandeen@xxxxxxxxxxx> wrote:
> On 2/27/13 10:28 AM, Jason Detring wrote:
>>             find-502   [000]   207.983594: xfs_da_btree_corrupt: dev 7:0
>> bno 0x5a4f8 nblks 0x8 hold 1 pincount 0 lock 0 flags DONE|PAGES caller
>> xfs_dir2_leaf_readbuf
>
> Was this on the same image as you sent earlier?

Yes, sorry, I should have said that.  I'm now using the demo image
with the RasPi exclusively for testing.


> Ok, so this tells us that it was trying to read sector nr. 0x5a4f8 (369912),
> or fsblock 46239
>
> What's really on disk there?
>
> $ xfs_db problemimage.xfs
> xfs_db> blockget -n
> xfs_db> daddr 369912
> xfs_db> blockuse
> block 49152 (3/0) type sb
> xfs_db> type text
> xfs_db> p
> 000:  58 46 53 42 00 00 10 00 00 00 00 00 00 00 f0 d3  XFSB............
> ...
>
> So it really did have a superblock location that it was reading
> at that point - the backup SB in the 3rd allocation group, to be exact.
> But it shouldn't have been trying to read a superblock at this point
> in the code...
>
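
The numbers quoted above can be cross-checked with a little shell arithmetic. This is only a sketch: the helper names be_hex_to_dec and daddr_to_block are made up for illustration, and it assumes 512-byte sectors and the standard superblock layout (4-byte magic, then a 4-byte block size, then an 8-byte data block count, all big-endian). The hex bytes are copied from the xfs_db dump.

```shell
# Join space-separated big-endian hex bytes and print the decimal value.
be_hex_to_dec() {
    printf '%d\n' "0x$(printf '%s' "$*" | tr -d ' ')"
}

# Convert a 512-byte-sector address (daddr) to a linear filesystem block.
# $1 = sector address (hex or decimal), $2 = filesystem block size in bytes.
daddr_to_block() {
    echo $(( $1 / ($2 / 512) ))
}

# Bytes 4..7 of the dump are the block size, bytes 8..15 the data block count.
blocksize=$(be_hex_to_dec 00 00 10 00)
dblocks=$(be_hex_to_dec 00 00 00 00 00 00 f0 d3)

echo "blocksize=$blocksize dblocks=$dblocks"
echo "daddr 0x5a4f8 = $((0x5a4f8)) -> block $(daddr_to_block 0x5a4f8 "$blocksize")"
```

With a 4096-byte block size this should reproduce the sector 369912 and block 46239 figures from the message.
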
> Hm, maybe I should have had you enable all xfs tracepoints to get
> more info about where we thought we were on disk when we were doing this.
> If you used trace-cmd you can do "trace-cmd record -e xfs*" IIRC.
> You can do similar echo 1 > /<blah>/xfs*/enable I think for the sysfs
> route.
>
> Can you identify which directory it was that tripped the above error?

# modprobe xfs-O1-g
# mount -o loop,ro /xfsdebug/problemimage.xfs /loop
# find /loop -type d -print0 > list.txt
# umount /loop
# rmmod xfs
# modprobe xfs-O2-g
# mount -o loop,ro /xfsdebug/problemimage.xfs /loop
# cat list.txt | xargs -0 -P1 -n1 -I{} sh -c '(dir="{}" ; ls "${dir}" >/dev/null ; sleep 0.1 ; dmesg | tail -n1 | grep Corruption && echo "${dir} is causing problems")'
ls: reading directory /loop/ruby/1.9.1: Structure needs cleaning
[35689.975822] XFS (loop0): Corruption detected. Unmount and run xfs_repair
/loop/ruby/1.9.1 is causing problems
...
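
For anyone repeating this scan, a slightly more defensive variant might look like the sketch below. It is not the command that was run above: the function name report_unreadable_dirs is hypothetical, and it assumes the failure shows up as a non-zero exit status from ls instead of checking dmesg for the Corruption line. Each directory name is passed as a positional argument, so quotes or "$" in a name are not re-interpreted by the inner shell.

```shell
# Print every directory from a NUL-delimited list that `ls` cannot read.
# $1 = file produced by something like: find /loop -type d -print0 > list.txt
report_unreadable_dirs() {
    # The name arrives as "$1" inside the inner shell, never spliced into
    # the script text itself.
    xargs -0 -n1 sh -c \
        'ls -- "$1" >/dev/null 2>&1 || printf "%s is causing problems\n" "$1"' sh \
        < "$1"
}
```

Usage would be "report_unreadable_dirs list.txt" after mounting with the suspect module loaded.
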


OK, I now have a name.  Rebooting to get a clean slate.

# modprobe xfs-O2-g
# trace-cmd record -e xfs\* &
# mount -o loop,ro /xfsdebug/problemimage.xfs /loop
# ls /loop/ruby/1.9.1
ls: reading directory /loop/ruby/1.9.1: Structure needs cleaning
# umount /loop
# (trace-cmd) ^C
# trace-cmd report > tracecmd-report.txt

Reboot.

# modprobe xfs-O2-g
# echo 1 > /sys/kernel/debug/tracing/tracing_enabled
# for knob in /sys/kernel/debug/tracing/events/xfs/*/enable; do echo 1 > $knob; done
# mount -o loop,ro /xfsdebug/problemimage.xfs /loop/
# ls /loop/ruby/1.9.1
ls: reading directory /loop/ruby/1.9.1: Structure needs cleaning
# umount /loop
# cat /sys/kernel/debug/tracing/trace > sysfstrace-report.txt
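
Either report is large, so one way to narrow it down is to pull out only the lines that mention the buffer address from the earlier xfs_da_btree_corrupt event. The sketch below is an illustration, not part of the session above; the function name trace_lines_for_bno is hypothetical, and it assumes the report keeps the "bno 0x... nblks" field layout seen in that event.

```shell
# Print trace lines that mention a given buffer address.
# $1 = bno as printed in the trace (e.g. 0x5a4f8), $2 = report file.
trace_lines_for_bno() {
    # The trailing space keeps 0x5a4f8 from also matching 0x5a4f80.
    grep -F -- "bno $1 " "$2"
}
```

For example: trace_lines_for_bno 0x5a4f8 tracecmd-report.txt
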


> (I still think it sounds like a miscompile, but trying to get more clues)
>
> -Eric

Attachment: tracecmd-dmesg.txt
Description: Text document

Attachment: trace.dat.gz
Description: GNU Zip compressed data

Attachment: tracecmd-report.txt
Description: Text document

Attachment: sysfstrace-dmesg.txt
Description: Text document

Attachment: sysfstrace-report.txt
Description: Text document
