
Re: Xfs_repair segfaults.

To: Eric Sandeen <sandeen@xxxxxxxxxxx>
Subject: Re: Xfs_repair segfaults.
From: Filippo Stenico <filippo.stenico@xxxxxxxxx>
Date: Tue, 7 May 2013 20:20:49 +0200
Cc: xfs@xxxxxxxxxxx
xfs_repair -L -vv -P /dev/mapper/vg0-lv0 triggers the same kernel panic as in my first report. No point in duplicating that information.
I'll try xfs_repair -L -vv -P -m 2000 to keep memory consumption within a limit.

On Tue, May 7, 2013 at 3:36 PM, Filippo Stenico <filippo.stenico@xxxxxxxxx> wrote:

On Tue, May 7, 2013 at 3:20 PM, Eric Sandeen <sandeen@xxxxxxxxxxx> wrote:
On 5/7/13 4:27 AM, Filippo Stenico wrote:
> Hello,
> this is a fresh attempt to recover some more data from my toasted raid5 - lvm - xfs volume.
> My goal is both to do my best to get some more data out of the volume and to see if I can reproduce the segfault.
> I compiled xfsprogs 3.1.9 from deb-source. I ran xfs_mdrestore to put the original metadata back on the cloned raid volume, whose log I had previously zeroed via xfs_repair -L (I figured none of the actual data had been modified, as I am only working on metadata... right?).
> Then I ran a mount, checked a dir that I knew was corrupted, unmounted, and tried an xfs_repair (commands.txt attached for details).
> I went home to sleep, but in the morning I found that the kernel had panicked due to "out of memory and no killable process".
> I ran repair without -P... Should I now try disabling inode prefetch?
> Attached are also the output of "free" and "top" at the time of the panic, as well as the output of xfs_repair and of strace attached to it. I don't think gdb symbols would help here...


Ho hum, well, no segfault this time, just an out of memory error?
That's right....
No real way to know where it went from the available data I think.

A few things:

> root@ws1000:~# mount /dev/mapper/vg0-lv0 /raid0/data/
> mount: Structure needs cleaning

mount failed?  Now's the time to look at dmesg to see why.
From the attached logs it seems to be:

> XFS internal error xlog_valid_rec_header(1) at line 3466 of file [...2.6.32...]/fs/xfs/xfs_log_recover.c
> XFS: log mount/recovery failed: error 117
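
For anyone digging through a long dmesg dump for the same thing, here is a minimal sketch of pulling out the XFS lines and the error number. It assumes the messages are worded exactly as in the two lines quoted above (other kernel versions may phrase them differently); the function names and the unrelated third sample line are made up for illustration.

```python
import re

# Matches lines such as "XFS: log mount/recovery failed: error 117".
# The wording is copied from the log excerpt quoted in this thread and is
# an assumption for any other kernel version.
FAILED_RE = re.compile(r"XFS: log mount/recovery failed: error (\d+)")


def xfs_lines(dmesg_text):
    """Return the kernel log lines that mention XFS, in their original order."""
    return [line for line in dmesg_text.splitlines() if "XFS" in line]


def recovery_error_codes(dmesg_text):
    """Return the numeric error codes of failed log mount/recovery messages."""
    return [int(m.group(1)) for m in FAILED_RE.finditer(dmesg_text)]


# First two lines are from the thread (path left elided as posted);
# the last line is a made-up unrelated message.
sample = """\
XFS internal error xlog_valid_rec_header(1) at line 3466 of file [...2.6.32...]/fs/xfs/xfs_log_recover.c
XFS: log mount/recovery failed: error 117
some unrelated kernel message
"""

for line in xfs_lines(sample):
    print(line)
print(recovery_error_codes(sample))
```

On Linux, errno 117 is EUCLEAN, whose message text is "Structure needs cleaning", which matches what mount printed on the failed attempt.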

> root@ws1000:~# mount

<no raid0 mounted>

> root@ws1000:~# mount /dev/mapper/vg0-lv0 /raid0/data/
> root@ws1000:~# mount | grep raid0
> /dev/mapper/vg0-lv0 on /raid0/data type xfs (rw,relatime,attr2,noquota)

Uh, now it worked, with no other steps in between?  That's a little odd.
Looks odd to me too. But I just copied the commands as they were issued on my console... so yes, nothing in between.
It found a clean log this time:

> XFS mounting filesystem dm-1
> Ending clean XFS mount for filesystem: dm-1

which is unexpected.

So the memory consumption might be a bug but there's not enough info to go on here.

> PS. Let me know if you wish reports like this one on list.

worth reporting, but I'm not sure what we can do with it.
Your storage is in pretty bad shape, and xfs_repair can't make something out
of nothing.

I still got back around 6 TB out of 7.2 TB of total data stored, which suggests XFS is reliable even when major faults occur...

Thanks anyway. I am now trying a "-L" repair; at this step I expect another failure (due to out of memory or something similar, as happened last time). Then I will try "xfs_repair -L -vv -P", and I expect to see that segfault again.

I will report the next steps; maybe something interesting for you will pop up. For me it is not a waste of time, since this last attempt is worth making.

