xfs
[Top] [All Lists]

Re: [PATCH 03/18] xfs: Do background CIL flushes via a workqueue

To: Ben Myers <bpm@xxxxxxx>
Subject: Re: [PATCH 03/18] xfs: Do background CIL flushes via a workqueue
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Mon, 30 Apr 2012 16:09:13 +1000
Cc: xfs@xxxxxxxxxxx
In-reply-to: <20120430012432.GE3283@dastard>
References: <1334319061-12968-1-git-send-email-david@xxxxxxxxxxxxx> <1334319061-12968-4-git-send-email-david@xxxxxxxxxxxxx> <20120417212155.GA16881@xxxxxxx> <20120417214956.GB16881@xxxxxxx> <20120418014706.GM6734@dastard> <20120430012432.GE3283@dastard>
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, Apr 30, 2012 at 11:24:32AM +1000, Dave Chinner wrote:
> On Wed, Apr 18, 2012 at 11:47:06AM +1000, Dave Chinner wrote:
> > On Tue, Apr 17, 2012 at 04:49:56PM -0500, Ben Myers wrote:
> > > On Tue, Apr 17, 2012 at 04:21:55PM -0500, Ben Myers wrote:
> > > > Gah! I just hit this assert.  
> > > > 
> > > > v3.4-rc2-3-g8a00ebe with:
> > > > Christoph's ilock series
> > > > Christoph's xfsbufd series
> > > > Jan's freeze series
> > > > Dave's queue.
> > > > 
> > > > nfs7 login: [ 1175.172406] XFS: Assertion failed: push_seq > 0 && 
> > > > push_seq <= ctx->sequence, file: /root/xfs/fs/xfs/xfs_log_cil.c, line: 
> > > > 406
> > 
> > which has probably resulted in a log corruption.
> > 
> > > [  390.097633] ---[ end trace 33a9795b638157b9 ]---
> > > 
> > > Here it is from the system log
> > > 
> > > Apr 17 16:35:12 linux kernel: [  389.637543] XFS (264=211361P۔300): 
> > > Mounting Filesystem
> > > Apr 17 16:35:12 linux kernel: [  389.912402] XFS (264=211361P۔300): 
> > > Starting recovery (logdev: internal)
> > > Apr 17 16:35:12 linux kernel: [  390.054357] XFS (264=211361P۔300): 
> > > xlog_recover_inode_pass2: Bad inode magic number, dip = 0xf0ffd800, dino 
> > > bp = 0xef89c480, ino = 25541592
> > > Apr 17 16:35:12 linux kernel: [  390.066290] XFS (264=211361P۔300): 
> > > Internal error xlog_recover_inode_pass2(1) at line 2248 of file 
> > > /root/xfs/fs/xfs/xfs_log_recover.c.  Caller 0xfaa08ffa
> > 
> > Because log recovery has read a bad inode from disk, which means an
> > allocation transaction has probably not been replayed.
> 
> Just as a note - I'm getting itest 121 reliably tripping over this
> exact problem as a result of the discontiguous buffer item support
> patch. Basically the problem is that an inode modification
> transaction is being replayed before the inode buffer allocation
> transaction, so the read of the inode buffer returns an
> uninitialised data area, and hence the assert failure.
> 
> At this point I can't see why that patch would cause the problem to
> occur and this tends to imply that it isn't that patch that has
> caused the bug.

The bug was in my code - the calculation for the map size was wrong
so only the first half of the modifications to the buffer was being
replayed. So, this seems unrelated.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>