XFS hang during xfs_fsr run

Dave Chinner david at fromorbit.com
Fri Mar 12 05:56:45 CST 2010


On Fri, Mar 12, 2010 at 05:00:19AM -0500, Christoph Hellwig wrote:
> On Fri, Mar 12, 2010 at 10:45:19AM +0100, Michael Weissenbacher wrote:
> > Hi Dave!
> >> Hi Michael - have you got any idea what the files are that are
> >> hitting this? This failure is implying that the inode is still dirty
> >> after syncing all the data. Is something trying to modify it while
> >> XFS is trying to map it?
> > Yes, as far as I can tell it's always a file that some process is
> > currently modifying. It often happens with some file under /var/log
> > that syslog is currently modifying. I tried setting the "no-defrag"
> > flag via xfs_io's chattr on all log files, but that didn't seem to help.
> > Cyrus imapd seems far more likely to trigger this problem
> > than any other program. Some examples of files where it usually hangs:
> > /var/spool/imap/x/user/xxxx/cyrus.cache (lsof -> cyrus)
> > /var/imap/db/log.xxxxxxx (lsof -> cyrus)
> > /var/log/xxx.log (lsof -> syslog)
> 
> So what's interesting is that cyrus uses mmap access to files, which
> might be an indicator that we have problems with excluding fsr on mmapped
> files.

Ah, yeah.

->page_mkwrite executes without the inode iolock held, so we can't
lock it out from creating new delalloc pages by holding the iolock
like the bmap code does.

I don't think we're allowed to take the iolock in ->page_mkwrite, so
effectively that leaves us with the situation where we can't do an
atomic flush and map in the bmap code.

Christoph, I guess that means we need to make the bmap code
handle/ignore delalloc extents rather than assume they never occur
after the flush.  What do you think?
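
To illustrate the "handle/ignore" option, here is a minimal userspace
sketch. It is not the XFS bmap code: the struct, the sentinel value and
the function name are all hypothetical and exist only for this example.
It assumes a delalloc extent can be recognised by a reserved start block
value, and it skips such extents while reporting the mapping instead of
treating them as an error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel: extent has no disk block yet (delayed allocation). */
#define SKETCH_DELALLOC_BLOCK UINT64_MAX

/* Hypothetical, simplified extent record for this sketch only. */
struct sketch_extent {
    uint64_t file_offset;  /* logical offset in the file, in blocks */
    uint64_t start_block;  /* physical block, or SKETCH_DELALLOC_BLOCK */
    uint64_t length;       /* extent length, in blocks */
};

/*
 * Copy the allocated extents from in[] to out[], skipping delalloc
 * extents rather than assuming the earlier flush removed them all.
 * Returns the number of extents copied; if skipped is non-NULL it
 * receives the number of delalloc extents that were ignored.
 */
size_t sketch_report_extents(const struct sketch_extent *in, size_t n,
                             struct sketch_extent *out, size_t out_cap,
                             size_t *skipped)
{
    size_t copied = 0;
    size_t seen_delalloc = 0;

    assert(in != NULL || n == 0);
    assert(out != NULL || out_cap == 0);

    for (size_t i = 0; i < n; i++) {
        if (in[i].start_block == SKETCH_DELALLOC_BLOCK) {
            /* Created after the flush, e.g. by an mmap write. Ignore it. */
            seen_delalloc++;
            continue;
        }
        if (copied < out_cap)
            out[copied++] = in[i];
    }

    if (skipped != NULL)
        *skipped = seen_delalloc;
    return copied;
}
```

A caller in this sketch passes the extent list and an output array, then
checks the skipped count to see whether the mapping raced with a writer.
Whether ignoring is right, or whether such extents should be reported in
some form, is the open question above.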

Cheers,

Dave.
-- 
Dave Chinner
david at fromorbit.com



