Re: 3.9.0: XFS rootfs corruption

To: Eric Sandeen <sandeen@xxxxxxxxxxx>
Subject: Re: 3.9.0: XFS rootfs corruption
From: CAI Qian <caiqian@xxxxxxxxxx>
Date: Wed, 22 May 2013 00:10:07 -0400 (EDT)
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <51895115.90108@xxxxxxxxxxx>
References: <1871204531.7584919.1367826613792.JavaMail.root@xxxxxxxxxx> <5187BEA5.4040107@xxxxxxxxxxx> <647316680.8155487.1367913231441.JavaMail.root@xxxxxxxxxx> <51895115.90108@xxxxxxxxxxx>
Thread-index: IdWnK93hlt2QsMnf3gyQGdpEp61MFA==
Thread-topic: 3.9.0: XFS rootfs corruption
OK, this has never been reproduced in 3.9-rc1 so far. It may be because the
rootfs became full after crash dump testing, though.
CAI Qian

----- Original Message -----
> From: "Eric Sandeen" <sandeen@xxxxxxxxxxx>
> To: "CAI Qian" <caiqian@xxxxxxxxxx>
> Cc: xfs@xxxxxxxxxxx
> Sent: Wednesday, May 8, 2013 3:08:05 AM
> Subject: Re: 3.9.0: XFS rootfs corruption
> 
> On 5/7/13 2:53 AM, CAI Qian wrote:
> > 
> > 
> > ----- Original Message -----
> >> From: "Eric Sandeen" <sandeen@xxxxxxxxxxx>
> >> To: "CAI Qian" <caiqian@xxxxxxxxxx>
> >> Cc: xfs@xxxxxxxxxxx
> >> Sent: Monday, May 6, 2013 10:31:01 PM
> >> Subject: Re: 3.9.0: XFS rootfs corruption
> >>
> >> On 5/6/13 2:50 AM, CAI Qian wrote:
> >>> Saw this on several different Power7 systems after kdump reboot. It has
> >>> xfsprogs-3.1.10
> >>> and rootfs is on LVM. Never saw one of those in any of the RC releases.
> >>>
> >>> ] Reached target Basic System.
> >>> [    4.919316] bio: create slab <bio-1> at 1
> >>> [    5.078616] SGI XFS with ACLs, security attributes, large block/inode
> >>> numbers, no debug enabled
> >>> [    5.081925] XFS (dm-1): Mounting Filesystem
> >>> [    5.168530] XFS (dm-1): Starting recovery (logdev: internal)
> >>> [    5.333575] XFS: Internal error XFS_WANT_CORRUPTED_RETURN at line 176
> >>> of
> >>> file fs/xfs/xfs_dir2_data.c.  Caller 0xd000000002396fdc
> >>
> >> here:
> >>
> >>         /*
> >>          * Need to have seen all the entries and all the bestfree slots.
> >>          */
> >>         XFS_WANT_CORRUPTED_RETURN(freeseen == 7);
> >>
> >> I hope Dave knows offhand what this might mean.  :)
> >>
> >> Could you get a metadump of the filesystem in question?
> > Err, less familiar here. May I ask how can I do that?
> 
> since it's the root fs, you might need to do it from some sort of rescue
> shell, then just do xfs_metadump /dev/<device> <metadump filename>
> 
> the resulting file should compress further with something like bzip2.
> 
> ...
> 
> >>> Also, never saw any of those in other architectures like x64, but started
> >>> to get those there in 3.9.0.
> >>> Unsure if those are related.
> >>>
> >>> [ 3224.369782]
> >>> =============================================================================
> >>> [ 3224.370017] BUG xfs_efi_item (Tainted: GF   B       ): Poison
> >>> overwritten
> >>> [ 3224.370017]
> >>> -----------------------------------------------------------------------------
> >>
> >>   2: 'F' if any module was force loaded by "insmod -f", ' ' if all
> >>      modules were loaded normally.
> >>
> >> Force loaded modules, what's that from?
> > This could have happened just after booting was done, or while we were running a
> > stress test later
> > that does load (modprobe *) and unload (modprobe -r *) every module. Again,
> > those warnings
> > could be totally unrelated to the above rootfs corruption.
> > CAI Qian
> 
> hmmm :)  So any one of those modules could have caused memory corruption I
> guess.
> 
> If you can hit it reliably you might try to narrow it down to whether it
> is a particular module causing it.
> 
> -Eric
> 
> 
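
The metadump procedure Eric describes above (rescue shell, xfs_metadump, then bzip2) could look roughly like the sketch below. The device path and output file are hypothetical placeholders, not taken from the thread; the log only shows the root filesystem as dm-1. It assumes the filesystem is unmounted or mounted read-only, and that the LVM volume may first need activating with vgchange -ay. The function only prints the commands so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch only: DEV and OUT are assumed names, adjust them to the real system.
DEV="${DEV:-/dev/mapper/vg-root}"     # hypothetical LVM root volume (dm-1 in the log)
OUT="${OUT:-/tmp/rootfs.metadump}"    # hypothetical output file on a different filesystem

# Print the steps instead of executing them.
metadump_steps() {
    echo "vgchange -ay"                  # activate LVM volumes in the rescue shell
    echo "xfs_metadump -g $DEV $OUT"     # -g shows dump progress
    echo "bzip2 $OUT"                    # compress the dump before sending it
}

metadump_steps
```

Once the printed commands look right for the system, they can be run by hand from the rescue shell, or all at once with metadump_steps | sh.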
