
Re: kernel BUG at fs/xfs/xfs_message.c:113!

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: kernel BUG at fs/xfs/xfs_message.c:113!
From: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
Date: Wed, 21 Sep 2016 09:38:44 -0600
Cc: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx, linux-fsdevel@xxxxxxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20160920230613.GL340@dastard>
Mail-followup-to: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>, Dave Chinner <david@xxxxxxxxxxxxx>, xfs@xxxxxxxxxxx, linux-fsdevel@xxxxxxxxxxxxxxx
References: <20160920163304.GA8999@xxxxxxxxxxxxxxx> <20160920201453.GH340@dastard> <20160920230613.GL340@dastard>
User-agent: Mutt/1.7.0 (2016-08-17)
On Wed, Sep 21, 2016 at 09:06:13AM +1000, Dave Chinner wrote:
> On Wed, Sep 21, 2016 at 06:14:53AM +1000, Dave Chinner wrote:
> > On Tue, Sep 20, 2016 at 10:33:04AM -0600, Ross Zwisler wrote:
> > > I'm consistently able to generate this kernel BUG with both v4.7 and 
> > > v4.8-rc7.
> > > This bug reproduces both with and without DAX.
> > > Here is the BUG with v4.8-rc7, passed through kasan_symbolize.py:
> > > 
> > >   run fstests generic/026 at 2016-09-20 10:22:58
> > >   XFS (pmem0p2): Unmounting Filesystem
> > >   XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, file: 
> > > fs/xfs/xfs_trans.c, line: 309
> > 
> > It overran the block allocation reservation for the transaction.
> 
> Can you try the patch I've attached below, Ross? it solves the
> problem for me....
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> 
> xfs: remote attribute blocks aren't really userdata
> 
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> When adding a new remote attribute, we write the attribute to the
> new extent before the allocation transaction is committed. This
> means we cannot reuse busy extents as that violates crash
> consistency semantics. Hence we currently treat remote attribute
> extent allocation like userdata because it has the same overwrite
> ordering constraints as userdata.
> 
> Unfortunately, this also allows the allocator to incorrectly apply
> extent size hints to the remote attribute extent allocation. This
> results in interesting failures, such as transaction block
> reservation overruns and in-memory inode attribute fork corruption.
> 
> To fix this, we need to separate the busy extent reuse configuration
> from the userdata configuration. This changes the definition of
> XFS_BMAPI_METADATA slightly - it now means that the allocation is
> metadata and reuse of busy extents is acceptable due to the metadata
> ordering semantics of the journal. If this flag is not set, it
> means the allocation has unordered data writeback, and hence
> busy extent reuse is not allowed. It no longer implies the
> allocation is for user data, just that the data write will not be
> strictly ordered. This matches the semantics for both user data
> and remote attribute block allocation.
> 
> As such, this patch changes the "userdata" field to a "datatype"
> field, and adds a "no busy reuse" flag to the field.
> When we detect an unordered data extent allocation, we immediately set
> the no reuse flag. We then set the "user data" flags based on the
> inode fork we are allocating the extent to. Hence we only set
> userdata flags on data fork allocations now and consider attribute
> fork remote extents to be an unordered metadata extent.
> 
> The result is that remote attribute extents now have the expected
> allocation semantics, and the data fork allocation behaviour is
> completely unchanged.
> 
> It should be noted that there may be other ways to fix this (e.g.
> use ordered metadata buffers for the remote attribute extent data
> write) but they are more invasive and difficult to validate both
> from a design and implementation POV. Hence this patch takes the
> simple, obvious route to fixing the problem...
> 
> Reported-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>

Yep, this solves it for me as well.

Tested-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
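For readers following the commit message above, the core idea - splitting the old boolean "userdata" into a "datatype" flags field so that "this is user data" and "busy extent reuse is forbidden" become independent bits - can be sketched roughly as below. The identifier names and the helper are illustrative only, not the exact kernel code from the patch:

```c
/* Illustrative sketch of the datatype-flags split described in the
 * patch; names are hypothetical stand-ins, not the kernel's own. */
#include <stdbool.h>

#define ALLOC_USERDATA  (1 << 0)  /* allocation is for user data     */
#define ALLOC_NOBUSY    (1 << 1)  /* busy extent reuse not allowed   */

/* Hypothetical helper: derive the datatype flags for an allocation.
 * Any unordered data write (i.e. not journal-ordered metadata) must
 * not reuse busy extents, because the data is written before the
 * allocation transaction commits.  Only data fork allocations are
 * tagged as user data, so extent size hints no longer apply to
 * remote attribute extents. */
static int alloc_datatype(bool attr_fork, bool bmapi_metadata)
{
    int datatype = 0;

    if (!bmapi_metadata) {
        /* unordered data writeback: never reuse busy extents */
        datatype |= ALLOC_NOBUSY;
        if (!attr_fork)
            /* data fork only: this really is user data */
            datatype |= ALLOC_USERDATA;
    }
    return datatype;
}
```

Under this sketch a remote attribute extent (attr fork, not XFS_BMAPI_METADATA) gets ALLOC_NOBUSY but not ALLOC_USERDATA, which matches the semantics the commit message describes: crash-safe with respect to busy extents, but invisible to extent size hints.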
