[PATCH] Re: XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length), file: fs/xfs/xfs_trans_buf.c, line: 568

Michael L. Semon mlsemon35 at gmail.com
Mon Aug 26 15:26:29 CDT 2013


On 08/26/2013 12:13 AM, Dave Chinner wrote:
> On Thu, Aug 22, 2013 at 02:28:00PM -0400, Brian Foster wrote:
>> Hi all,
>>
>> I hit an assert on a debug kernel while beating on some finobt work and
>> eventually reproduced it on unmodified/TOT xfs/xfsprogs as of today. I
>> hit it through a couple different paths, first while running fsstress on
>> a CRC enabled filesystem (with otherwise default mkfs options):
>>
>> (These tests are running on a 4p, 4GB VM against a 100GB virtio disk,
>> hosted on a single spindle desktop box).
>>
>> crc=1
>> fsstress -z -fsymlink=1 -n99999999 -p4 -d /mnt/test
>>
>> XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length),
> 
> Directory buffer overrun.
> 
>>  [<ffffffffa031d549>] xfs_trans_log_buf+0x89/0x1b0 [xfs]
>>  [<ffffffffa02e7c1c>] xfs_da3_node_add+0x11c/0x210 [xfs]
>>  [<ffffffffa02ea703>] xfs_da3_node_split+0xc3/0x230 [xfs]
>>  [<ffffffffa02eaa18>] xfs_da3_split+0x1a8/0x410 [xfs]
>>  [<ffffffffa02f743f>] xfs_dir2_node_addname+0x47f/0xde0 [xfs]
> 
> During a split.
> 
> Easily reproduced with "seq 200000 | xargs touch" as Michael Semon
> reported last week.
> 
> The fix demonstrates my concerns about modifying directory code -
> the CRC changes missed a *fundamental* directory format definition,
> and we've only just tripped over it....

Don't fret too much over it.  This test was part of coreutils, which 
is something that I rebuild after a glibc upgrade.  Had glibc-2.18 
been released six weeks ago, then I would have stumbled onto this 
XFS issue six weeks ago.

>> rm -rf /mnt/test
>>
>> XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length),
> 
> Directory buffer overrun.
> 
>>  [<ffffffffa032b549>] xfs_trans_log_buf+0x89/0x1b0 [xfs]
>>  [<ffffffffa02f61ff>] xfs_da3_node_unbalance+0xef/0x1d0 [xfs]
>>  [<ffffffffa02f98b0>] xfs_da3_join+0x240/0x290 [xfs]
>>  [<ffffffffa030659b>] xfs_dir2_node_removename+0x69b/0x8b0 [xfs]
> 
> During a merge. Not sure why that is happening on a v4 filesystem.
> V5 filesystem, yes, due to the above bug but v4 should not be
> affected.
> 
> Cheers,
> 
> Dave.

Your patch looks good, and I even applied it to vanilla 3.10.9,
along with Jeff Liu's MAX_LFS_FILESIZE patch.  [Murphy's Law states
that if I didn't use Jeff's patch, then I would run xfstests
generic/308 by accident, leading to a hung umount.  Happens every
single time.]  Both patches applied cleanly.  The test machine is a
2.8 GHz i686 Pentium 4 PC running Slackware 14.0 Linux.

Naturally, `seq 200000 | xargs touch` was run for v5 and v4 XFS 
file systems.  All was okay.  The removal of the populated directory 
went fine as well.
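
For anyone reading along later, the assertion in the subject line is
a bounds check on the byte range being logged in a buffer.  Below is
a minimal userspace sketch of that condition, not kernel code.  It
assumes 512-byte basic blocks (BBSHIFT of 9); the BBTOB macro here is
my own stand-in and log_range_ok() is a made-up name for
illustration.

```c
#include <stdbool.h>

/* Assumption: a basic block is 512 bytes, so the shift is 9. */
#define BBSHIFT 9

/* Stand-in for the kernel macro: basic blocks to bytes. */
#define BBTOB(bbs) ((unsigned long)(bbs) << BBSHIFT)

/*
 * Hypothetical helper mirroring the asserted condition:
 * [first, last] is an inclusive byte range that must be ordered
 * and must end inside a buffer of b_length basic blocks.
 */
static bool log_range_ok(unsigned int first, unsigned int last,
                         unsigned int b_length)
{
    return first <= last && last < BBTOB(b_length);
}
```

With a 4096-byte buffer (b_length of 8), a range ending at byte 4095
passes and one ending at byte 4096 fails, which is the sort of
overrun the assert is there to catch.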

The v5 file systems were tested using a 3.11-rc7+ git kernel.
xfstests was run from the start of generic/ through generic/127,
and that went fine.  Some of the xfs/* series was run, but the
results were only skimmed because the v5-output-cleanup patches were
not readily available.

The v4 file systems were tested with a patched vanilla 3.10.9 kernel.
Some of generic/ was run there; patched and unpatched kernels showed
the same good results, with very little difference in timing overall.

Thanks!

Michael


