
Re: [PATCH] mm: Fix XFS oops due to dirty pages without buffers on s390

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH] mm: Fix XFS oops due to dirty pages without buffers on s390
From: Martin Schwidefsky <schwidefsky@xxxxxxxxxx>
Date: Thu, 11 Oct 2012 09:44:05 +0200
Cc: Hugh Dickins <hughd@xxxxxxxxxx>, Jan Kara <jack@xxxxxxx>, linux-mm@xxxxxxxxx, LKML <linux-kernel@xxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx, Mel Gorman <mgorman@xxxxxxx>, linux-s390@xxxxxxxxxxxxxxx
In-reply-to: <20121010215600.GX23644@dastard>
Organization: IBM Corporation
References: <1349108796-32161-1-git-send-email-jack@xxxxxxx> <alpine.LSU.2.00.1210082029190.2237@xxxxxxxxxxxx> <20121009162107.GE15790@xxxxxxxxxxxxx> <alpine.LSU.2.00.1210091824390.30802@xxxxxxxxxxxx> <20121010215600.GX23644@dastard>
On Thu, 11 Oct 2012 08:56:00 +1100
Dave Chinner <david@xxxxxxxxxxxxx> wrote:

> On Tue, Oct 09, 2012 at 07:19:09PM -0700, Hugh Dickins wrote:
> > On Tue, 9 Oct 2012, Jan Kara wrote:
> > > On Mon 08-10-12 21:24:40, Hugh Dickins wrote:
> > > > On Mon, 1 Oct 2012, Jan Kara wrote:
> > > > 
> > > > > On s390 any write to a page (even from kernel itself) sets
> > > > > architecture specific page dirty bit. Thus when a page is written to
> > > > > via standard write, HW dirty bit gets set and when we later map and
> > > > > unmap the page, page_remove_rmap() finds the dirty bit and calls
> > > > > set_page_dirty().
> > > > > 
> > > > > Dirtying of a page which shouldn't be dirty can cause all sorts of
> > > > > problems to filesystems. The bug we observed in practice is that
> > > > > buffers from the page get freed, so when the page gets later marked
> > > > > as dirty and writeback writes it, XFS crashes due to an assertion
> > > > > BUG_ON(!PagePrivate(page)) in page_buffers() called from
> > > > > xfs_count_page_state().
> > > > 
> > > > What changed recently?  Was XFS hardly used on s390 until now?
> > >   The problem was originally hit on SLE11-SP2 which is 3.0 based after
> > > migration of our s390 build machines from SLE11-SP1 (2.6.32 based). I
> > > think XFS just started to be more peevish about what pages it gets
> > > between these two releases ;) (e.g. ext3 or ext4 just says "oh, well"
> > > and fixes things up).
> > 
> > Right, in 2.6.32 xfs_vm_writepage() had a !page_has_buffers(page) case,
> > whereas by 3.0 that had become ASSERT(page_has_buffers(page)), with the
> > ASSERT usually compiled out, stumbling later in page_buffers() as you say.
> 
> What that says is that no-one is running xfstests-based QA on s390
> with CONFIG_XFS_DEBUG enabled, otherwise this would have been found.
> I've never tested XFS on s390 before, and I doubt any of the
> upstream developers have, either, because not many people have s390
> machines in their basement. So this is probably just an oversight
> in the distro QA environment more than anything....

Our internal builds indeed have CONFIG_XFS_DEBUG=n. I'll change that and
watch for the fallout.
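
For anyone following the thread, here is a rough standalone sketch of why
a debug-only assertion hides this kind of bug until a later, always-on
check trips. It is not the XFS source: struct demo_page, DEMO_DEBUG,
DEMO_ASSERT, demo_page_buffers() and demo_writepage() are all made-up
names, and the hard check returns -1 where the kernel would hit BUG_ON().

```c
#include <stdbool.h>

/* Hypothetical stand-in for a page; NOT the kernel's struct page. */
struct demo_page {
    bool dirty;
    bool has_buffers;
};

/* Counts assertion failures seen in debug builds (illustrative only). */
int demo_assert_failures = 0;

/*
 * Debug-only assertion, loosely modelled on an ASSERT() that is
 * compiled out unless a debug option is set. DEMO_DEBUG is an
 * assumed macro name standing in for that option.
 */
#ifdef DEMO_DEBUG
#define DEMO_ASSERT(cond) do { if (!(cond)) demo_assert_failures++; } while (0)
#else
#define DEMO_ASSERT(cond) ((void)0)
#endif

/*
 * Always-on check, playing the role of the BUG_ON() in page_buffers().
 * Returns -1 instead of crashing so the sketch stays testable.
 */
static int demo_page_buffers(const struct demo_page *p)
{
    return p->has_buffers ? 0 : -1;
}

/*
 * Writeback entry point: the early assertion is silent in non-debug
 * builds, so a dirty page without buffers is only caught by the
 * later hard check.
 */
static int demo_writepage(const struct demo_page *p)
{
    DEMO_ASSERT(p->has_buffers);
    return demo_page_buffers(p);
}
```

Built without -DDEMO_DEBUG, demo_writepage() on a dirty page with
has_buffers == false fails only in demo_page_buffers(); built with it,
the earlier assertion is counted first, which is roughly the difference
enabling the debug option would make for QA.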

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
