
Re: XFS hang during xfs_fsr run

To: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Subject: Re: XFS hang during xfs_fsr run
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Fri, 12 Mar 2010 22:56:45 +1100
Cc: Michael Weissenbacher <mw@xxxxxxxxxxxx>, Eric Sandeen <sandeen@xxxxxxxxxxx>, xfs@xxxxxxxxxxx
In-reply-to: <20100312100019.GA13230@xxxxxxxxxxxxx>
References: <4B8FC1B7.3070505@xxxxxxxxxxxx> <20100304222611.GK14317@xxxxxxxxxxxxxxxx> <4B92C71C.5010003@xxxxxxxxxxxx> <20100308000601.GF28189@xxxxxxxxxxxxxxxx> <4B94EADD.2080108@xxxxxxxxxxxx> <4B953D3F.3090002@xxxxxxxxxxx> <4B975C5C.5090806@xxxxxxxxxxxx> <20100311233934.GB4732@dastard> <4B9A0D2F.30506@xxxxxxxxxxxx> <20100312100019.GA13230@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Fri, Mar 12, 2010 at 05:00:19AM -0500, Christoph Hellwig wrote:
> On Fri, Mar 12, 2010 at 10:45:19AM +0100, Michael Weissenbacher wrote:
> > Hi Dave!
> >> Hi Michael - have you got any idea what the files are that are
> >> hitting this? This failure is implying that the inode is still dirty
> >> after syncing all the data. Is something trying to modify it while
> >> XFS is trying to map it?
> > Yes, as far as I can tell it's always a file that some process is
> > currently modifying. It happens often with some file under /var/log
> > which syslog is currently modifying. I tried setting the "no-defrag"
> > flag via xfs_io's chattr on all log files but that didn't seem to help.
> > It seems that cyrus imapd triggers this problem far more often
> > than any other program. Some examples of files where it usually hangs:
> > /var/spool/imap/x/user/xxxx/cyrus.cache (lsof -> cyrus)
> > /var/imap/db/log.xxxxxxx (lsof -> cyrus)
> > /var/log/xxx.log (lsof -> syslog)
> So what's interesting is that cyrus uses mmap access to files, which
> might be an indicator that we have problems with excluding fsr on mmapped
> files.

Ah, yeah.

->page_mkwrite executes without the inode iolock held, so we can't
lock it out from creating new delalloc pages by holding the iolock
like the bmap code does.

I don't think we're allowed to take the iolock in ->page_mkwrite, so
effectively that leaves us with the situation where we can't do an
atomic flush and map in the bmap code.

Christoph, I guess that means we need to make the bmap code
handle/ignore delalloc extents rather than assume they never occur
after the flush.  What do you think?
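
To make "handle/ignore" a bit more concrete, here is a rough user-space
model of the idea. It is not XFS code: the struct, the DELALLOC_BLOCK
sentinel and map_real_extents() are all made up for illustration, and it
assumes that simply skipping a delalloc extent is acceptable to the
caller. The point is only that the mapping loop steps over an extent
with no disk block yet instead of treating it as something that cannot
happen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel: delayed allocation, no disk block assigned yet. */
#define DELALLOC_BLOCK UINT64_MAX

/* Hypothetical, simplified extent record (all units are blocks). */
struct extent {
	uint64_t file_off;	/* offset within the file */
	uint64_t start_block;	/* disk block, or DELALLOC_BLOCK */
	uint64_t len;		/* length of the extent */
};

/*
 * Copy up to max_out extents from in[] to out[], skipping delalloc
 * extents rather than asserting that none exist.  Returns the number
 * of extents written to out[].
 */
static size_t
map_real_extents(const struct extent *in, size_t nr_in,
		 struct extent *out, size_t max_out)
{
	size_t i, n = 0;

	assert(in != NULL || nr_in == 0);
	assert(out != NULL || max_out == 0);

	for (i = 0; i < nr_in && n < max_out; i++) {
		/* A write fault may have created this after the flush. */
		if (in[i].start_block == DELALLOC_BLOCK)
			continue;
		out[n++] = in[i];
	}
	return n;
}
```

With this model the caller sees the skipped range as a gap in the
returned mapping, so it would have to cope with that, e.g. by leaving
the file alone for this pass.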


Dave Chinner
