
Re: [PATCH 4/6] libxfs: reused invalidated buffers leak state and data

To: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Subject: Re: [PATCH 4/6] libxfs: reused invalidated buffers leak state and data
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Mon, 7 Jul 2014 10:09:29 +1000
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140706235444.GP9508@dastard>
References: <1404453435-1915-1-git-send-email-david@xxxxxxxxxxxxx> <1404453435-1915-5-git-send-email-david@xxxxxxxxxxxxx> <20140704141509.GB29520@xxxxxxxxxxxxx> <20140704222210.GM9508@dastard> <20140705094807.GB18130@xxxxxxxxxxxxx> <20140706235444.GP9508@dastard>
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, Jul 07, 2014 at 09:54:44AM +1000, Dave Chinner wrote:
> On Sat, Jul 05, 2014 at 02:48:07AM -0700, Christoph Hellwig wrote:
> > On Sat, Jul 05, 2014 at 08:22:10AM +1000, Dave Chinner wrote:
> > > I'm open to other ways of fixing this, but right now we've got to
> > > fix xfs_repair because it's currently breaking filesystems worse
> > > than before xfs_repair was run...
> > 
> > Ok, so clearly mark this as difference from kernel code in a long
> > comment explaining the situation similar to wrote you above. 
> Will do.

Ok, I added this to the top of the libxfs/rdwr.c file:

 * Important design/architecture note:
 *
 * The userspace code that uses the buffer cache is much less constrained than
 * the kernel code. The userspace code is pretty nasty in places, especially
 * when it comes to buffer error handling.  Very little of the userspace code
 * outside libxfs clears bp->b_error - very little code even checks it - so the
 * libxfs code is tripping on stale errors left by the userspace code.
 *
 * We can't clear errors or zero buffer contents in libxfs_getbuf-* like we do
 * in the kernel, because those functions are used by the libxfs_readbuf_*
 * functions and hence need to leave the buffers unchanged on cache hits. This
 * is actually the only way to gather a write error from a libxfs_writebuf()
 * call - you need to get the buffer again so you can check the bp->b_error
 * field - assuming that the buffer is still in the cache when you check, that
 * is.
 *
 * This is very different to the kernel code which does not release buffers on a
 * write so we can wait on IO and check errors. The kernel buffer cache also
 * guarantees a buffer of a known initial state from xfs_buf_get() even on a
 * cache hit.
 *
 * IOWs, userspace is behaving quite differently to the kernel and as a result
 * it leaks errors from reads, invalidations and writes through
 * libxfs_getbuf/libxfs_readbuf.
 *
 * The result of this is that until the userspace code outside libxfs is cleaned
 * up, functions that release buffers from userspace control (i.e.
 * libxfs_writebuf/libxfs_putbuf) need to zero bp->b_error to prevent
 * propagation of stale errors into future buffer operations.

Is that sufficient for the moment?
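
For illustration only, here is a minimal standalone sketch of the behaviour
that comment describes. It does not use the real libxfs API: toy_buf,
toy_getbuf(), toy_putbuf() and toy_error_after_release() are made-up names,
and the "cache" is a trivial fixed array. All it tries to show is that an
error left in b_error survives a release and comes back on the next cache
hit, unless the release path zeroes it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a cached buffer; NOT the real struct xfs_buf. */
struct toy_buf {
	long	blkno;		/* block this buffer caches */
	int	b_error;	/* last error recorded against the buffer */
	bool	cached;		/* slot holds a valid buffer */
};

#define TOY_CACHE_SIZE	4
static struct toy_buf toy_cache[TOY_CACHE_SIZE];

/*
 * Hypothetical lookup. On a cache hit the buffer is handed back untouched,
 * mirroring the "leave the buffers unchanged on cache hits" constraint.
 */
static struct toy_buf *toy_getbuf(long blkno)
{
	struct toy_buf *free_slot = NULL;

	for (int i = 0; i < TOY_CACHE_SIZE; i++) {
		if (toy_cache[i].cached && toy_cache[i].blkno == blkno)
			return &toy_cache[i];		/* hit: state preserved */
		if (!toy_cache[i].cached && !free_slot)
			free_slot = &toy_cache[i];
	}
	if (!free_slot)
		return NULL;				/* toy cache is full */

	free_slot->blkno = blkno;
	free_slot->b_error = 0;				/* miss: clean state */
	free_slot->cached = true;
	return free_slot;
}

/*
 * Hypothetical release. The buffer stays in the cache; clear_error selects
 * whether the release path zeroes b_error before giving up the buffer.
 */
static void toy_putbuf(struct toy_buf *bp, bool clear_error)
{
	if (clear_error)
		bp->b_error = 0;
}

/*
 * Record an error on a buffer, release it, look it up again and return
 * whatever error the second lookup sees.
 */
int toy_error_after_release(long blkno, bool clear_error)
{
	struct toy_buf *bp = toy_getbuf(blkno);

	if (!bp)
		return -ENOMEM;

	bp->b_error = -EIO;		/* caller hits an error, never clears it */
	toy_putbuf(bp, clear_error);

	bp = toy_getbuf(blkno);		/* same block again: cache hit */
	assert(bp != NULL);
	return bp->b_error;
}
```

With clear_error set to false the second lookup still sees -EIO, which is
the stale error leak; with it set to true the second lookup sees 0.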

Dave Chinner
