To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH] Re: XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length), file: fs/xfs/xfs_trans_buf.c, line: 568
From: "Michael L. Semon" <mlsemon35@xxxxxxxxx>
Date: Mon, 26 Aug 2013 16:26:29 -0400
Cc: Brian Foster <bfoster@xxxxxxxxxx>, xfs@xxxxxxxxxxx
In-reply-to: <20130826041330.GU6023@dastard>
References: <52165830.8050006@xxxxxxxxxx> <20130826041330.GU6023@dastard>
User-agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130801 Thunderbird/17.0.8

On 08/26/2013 12:13 AM, Dave Chinner wrote:
> On Thu, Aug 22, 2013 at 02:28:00PM -0400, Brian Foster wrote:
>> Hi all,
>>
>> I hit an assert on a debug kernel while beating on some finobt work and
>> eventually reproduced it on unmodified/TOT xfs/xfsprogs as of today. I
>> hit it through a couple different paths, first while running fsstress on
>> a CRC enabled filesystem (with otherwise default mkfs options):
>>
>> (These tests are running on a 4p, 4GB VM against a 100GB virtio disk,
>> hosted on a single spindle desktop box).
>>
>> crc=1
>> fsstress -z -fsymlink=1 -n99999999 -p4 -d /mnt/test
>>
>> XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length),
> 
> Directory buffer overrun.
> 
>>  [<ffffffffa031d549>] xfs_trans_log_buf+0x89/0x1b0 [xfs]
>>  [<ffffffffa02e7c1c>] xfs_da3_node_add+0x11c/0x210 [xfs]
>>  [<ffffffffa02ea703>] xfs_da3_node_split+0xc3/0x230 [xfs]
>>  [<ffffffffa02eaa18>] xfs_da3_split+0x1a8/0x410 [xfs]
>>  [<ffffffffa02f743f>] xfs_dir2_node_addname+0x47f/0xde0 [xfs]
> 
> During a split.
> 
> Easily reproduced with "seq 200000 | xargs touch" as Michael Semon
> reported last week.
> 
> The fix demonstrates my concerns about modifying directory code -
> the CRC changes missed a *fundamental* directory format definition,
> and we've only just tripped over it....

Don't fret too much over it.  This test was part of coreutils, which 
is something that I rebuild after a glibc upgrade.  Had glibc-2.18 
been released six weeks ago, then I would have stumbled onto this 
XFS issue six weeks ago.

>> rm -rf /mnt/test
>>
>> XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length),
> 
> Directory buffer overrun.
> 
>>  [<ffffffffa032b549>] xfs_trans_log_buf+0x89/0x1b0 [xfs]
>>  [<ffffffffa02f61ff>] xfs_da3_node_unbalance+0xef/0x1d0 [xfs]
>>  [<ffffffffa02f98b0>] xfs_da3_join+0x240/0x290 [xfs]
>>  [<ffffffffa030659b>] xfs_dir2_node_removename+0x69b/0x8b0 [xfs]
> 
> During a merge. Not sure why that is happening on a v4 filesystem.
> V5 filesystem, yes, due to the above bug but v4 should not be
> affected.
> 
> Cheers,
> 
> Dave.

Your patch looks good, and I even applied it to vanilla 3.10.9, 
along with Jeff Liu's MAX_LFS_FILESIZE patch.  [Murphy's Law states 
that if I didn't use Jeff's patch, then I would run xfstests 
generic/308 by accident, leading to a hung umount.  Happens every 
single time.]  Both patches applied cleanly to kernels on a 2.8 GHz 
i686 Pentium 4 PC that was running Slackware 14.0 Linux.

Naturally, `seq 200000 | xargs touch` was run for v5 and v4 XFS 
file systems.  All was okay.  The removal of the populated directory 
went fine as well.
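
For anyone repeating the check, here is a minimal sketch of the two steps above (populate, then remove).  The function names populate_dir and remove_dir are made up for illustration, and the directory and file count are passed in as arguments rather than hard-coded; this is not part of xfstests or of Dave's patch.

```shell
#!/bin/sh
# Hypothetical helpers (illustrative names, not from xfstests).

# Create COUNT empty files named 1..COUNT in DIR, then print how many
# regular files ended up there.  Growing one directory this way forces
# the directory btree to split as entries are added.
populate_dir() {
    dir=$1
    count=$2
    mkdir -p "$dir" || return 1
    # Same command as in the report, run from inside the target directory.
    (cd "$dir" && seq "$count" | xargs touch) || return 1
    find "$dir" -type f | wc -l | tr -d ' '
}

# Remove the populated directory, which exercises the merge (join) path.
remove_dir() {
    [ -n "$1" ] || return 1
    rm -rf -- "$1"
}
```

Assuming a scratch XFS file system mounted at /mnt/test (the mount point from Brian's report), something like `populate_dir /mnt/test/many 200000 && remove_dir /mnt/test/many` followed by a look at dmesg should show whether the assertion still fires on a debug kernel.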

The v5 file systems were tested using a 3.11-rc7+ git kernel.  
xfstests was run from the start of generic/ through generic/127, 
and that went fine.  Some of the xfs/* series was run, but the 
results were merely scanned because the v5-output-cleanup patches 
were not readily available.

The v4 file systems were tested with a patched vanilla 3.10.9 kernel, 
and some of generic/ was run.  Patched and unpatched kernels showed 
the same good results, with very little difference in timing overall.

Thanks!

Michael
