To: Eryu Guan <eguan@xxxxxxxxxx>
Subject: Re: BUG: Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 30 Aug 2016 12:39:05 +1000
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20160829103754.GH27776@xxxxxxxxxxxxxxxxxxxxxxxx>
References: <20160829103754.GH27776@xxxxxxxxxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)

On Mon, Aug 29, 2016 at 06:37:54PM +0800, Eryu Guan wrote:
> Hi,
> 
> I've hit an XFS internal error then filesystem shutdown with 4.8-rc3
> kernel but not with 4.8-rc2
.....
> I attached a script too to reproduce it. Please note that the XFS
> partition needs about 40G free space, and it may take hours to finish
> based on your memory setup on your host.

Ugh. Can you try to narrow down the cause so it takes less time to
reproduce? This is almost certainly one of two things:

        1) an ENOSPC issue where an AG is almost-but-not-quite full,
        but fixing up the freelist results in there being not enough
        blocks left to allocate the data extent; or

        2) we've split a delalloc extent so many times that we've
        run out of indirect block reservation and we hit ENOSPC as a
        result.

For the latter, I suspect a test case where we take a large delalloc
range and use sync_file_range to do single page writeback to "binary
split" the delalloc range. i.e. start with a 128MB delalloc, then
sync a 4k block at offset 64MB, then 4k at 32MB, then 16MB, then
8MB, ... all the way down to writing the first block in the file,
and also all the way up to the final block in the file.

Then write every second 4k block to cause worst-case growth of the
bmbt and hopefully then exhaust the indirect block reservation for
that delalloc region...
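
Something like the following is roughly what I have in mind. It is an
untested sketch, not a known reproducer: the helper names are made up,
the "all the way up" half is my reading of the pattern (96MB, 112MB,
... up to the last block), and it assumes Linux with glibc's
sync_file_range() reachable through ctypes and a file size that is a
power-of-two multiple of the block size.

```python
import ctypes
import os

BLOCK = 4096                  # filesystem block size (bsize=4096 above)
SIZE = 128 * 1024 * 1024      # size of the initial delalloc range

# Flag values from <linux/fs.h>.
SYNC_FILE_RANGE_WAIT_BEFORE = 1
SYNC_FILE_RANGE_WRITE = 2
SYNC_FILE_RANGE_WAIT_AFTER = 4


def binary_split_offsets(size=SIZE, block=BLOCK):
    """Offsets of the single blocks to write back, in split order."""
    nblocks, rem = divmod(size, block)
    if rem or nblocks < 2 or nblocks & (nblocks - 1):
        raise ValueError("size must be a power-of-two multiple of block")
    offsets = []
    off = size // 2
    while off >= block:           # 64MB, 32MB, 16MB, ... down to one block
        offsets.append(off)
        off //= 2
    offsets.append(0)             # the first block in the file
    gap = size // 4
    while gap >= block:           # assumed upward half: 96MB, 112MB, ...
        offsets.append(size - gap)
        gap //= 2                 # ends at the final block in the file
    return offsets


def sync_block(libc, fd, offset, block=BLOCK):
    """Write back one block and wait for it to complete."""
    flags = (SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE |
             SYNC_FILE_RANGE_WAIT_AFTER)
    if libc.sync_file_range(fd, offset, block, flags) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def run(path, size=SIZE, block=BLOCK):
    """Dirty `size` bytes of `path`, then split the delalloc range."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.sync_file_range.argtypes = [ctypes.c_int, ctypes.c_int64,
                                     ctypes.c_int64, ctypes.c_uint]
    libc.sync_file_range.restype = ctypes.c_int
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        chunk = b"\xab" * (1 << 20)
        written = 0
        while written < size:     # buffered writes only, no fsync
            written += os.write(fd, chunk[:size - written])
        for off in binary_split_offsets(size, block):
            sync_block(libc, fd, off, block)
        # Then every second block, to fragment what is left.
        for off in range(0, size, 2 * block):
            sync_block(libc, fd, off, block)
    finally:
        os.close(fd)
```

Calling run() with a path on the test filesystem (for example a
hypothetical /mnt/test/splitfile) does the buffered write and then the
single-block writebacks. Whether the buffered write really ends up as
one delalloc extent depends on memory pressure and background
writeback, so it may need a few attempts.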

> [root@hp-dl360g9-15 ~]# xfs_info /
> meta-data=/dev/mapper/systemvg-root isize=256    agcount=16, agsize=2927744 blks
>          =                       sectsz=512   attr=2, projid32bit=1
>          =                       crc=0        finobt=0 spinodes=0
> data     =                       bsize=4096   blocks=46843904, imaxpct=25
>          =                       sunit=64     swidth=192 blks
> naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
> log      =internal               bsize=4096   blocks=22912, version=2
>          =                       sectsz=512   sunit=64 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0

Does it reproduce on a CRC enabled filesystem?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
