On Mon, Jul 07, 2014 at 09:54:44AM +1000, Dave Chinner wrote:
> On Sat, Jul 05, 2014 at 02:48:07AM -0700, Christoph Hellwig wrote:
> > On Sat, Jul 05, 2014 at 08:22:10AM +1000, Dave Chinner wrote:
> > > I'm open to other ways of fixing this, but right now we've got to
> > > fix xfs_repair because it's currently breaking filesystems worse
> > > than before xfs_repair was run...
> > Ok, so clearly mark this as a difference from the kernel code in a long
> > comment explaining the situation, similar to what you wrote above.
> Will do.
Ok, I added this to the top of the libxfs/rdwr.c file:
* Important design/architecture note:
*
* The userspace code that uses the buffer cache is much less constrained than
* the kernel code. The userspace code is pretty nasty in places, especially
* when it comes to buffer error handling. Very little of the userspace code
* outside libxfs clears bp->b_error - very little code even checks it - so the
* libxfs code is tripping on stale errors left by the userspace code.
*
* We can't clear errors or zero buffer contents in libxfs_getbuf-* like we do
* in the kernel, because those functions are used by the libxfs_readbuf_*
* functions and hence need to leave the buffers unchanged on cache hits. This
* is actually the only way to gather a write error from a libxfs_writebuf()
* call - you need to get the buffer again so you can check the bp->b_error field -
* assuming that the buffer is still in the cache when you check, that is.
*
* This is very different to the kernel code, which does not release buffers on
* a write, so we can wait on IO and check errors. The kernel buffer cache also
* guarantees a buffer of a known initial state from xfs_buf_get() even on a
* cache hit.
*
* IOWs, userspace is behaving quite differently to the kernel and as a result
* it leaks errors from reads, invalidations and writes through
*
* The result of this is that until the userspace code outside libxfs is cleaned
* up, functions that release buffers from userspace control (i.e.
* libxfs_writebuf/libxfs_putbuf) need to zero bp->b_error to prevent
* propagation of stale errors into future buffer operations.
Is that sufficient for the moment?
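For anyone following along, here is a minimal standalone sketch of the
behaviour that comment describes. It is a toy model, not the libxfs code:
toy_buf, toy_getbuf(), toy_putbuf_leaky(), toy_putbuf() and
toy_error_after_release() are all hypothetical names invented for
illustration, and the "cache" is just a four-slot array. It only shows how an
error left in b_error survives a cache hit unless the release path zeroes it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for a cached buffer; only the fields this sketch needs. */
struct toy_buf {
	unsigned long	blkno;		/* cache key */
	int		b_error;	/* last error recorded on the buffer */
	int		in_cache;	/* slot has been populated */
};

#define TOY_CACHE_SIZE	4
static struct toy_buf toy_cache[TOY_CACHE_SIZE];

/*
 * Models a getbuf-style lookup: on a cache hit the buffer is returned
 * untouched, so whatever b_error held at release time is still there.
 */
struct toy_buf *
toy_getbuf(unsigned long blkno)
{
	struct toy_buf	*bp = &toy_cache[blkno % TOY_CACHE_SIZE];

	if (!bp->in_cache || bp->blkno != blkno) {
		/* cache miss: hand out a freshly initialised buffer */
		bp->blkno = blkno;
		bp->b_error = 0;
		bp->in_cache = 1;
	}
	return bp;
}

/* Release that leaves b_error alone: stale errors stay in the cache. */
void
toy_putbuf_leaky(struct toy_buf *bp)
{
	assert(bp != NULL);
}

/* Release that zeroes b_error, as the comment says the release path must. */
void
toy_putbuf(struct toy_buf *bp)
{
	assert(bp != NULL);
	bp->b_error = 0;
}

/*
 * Record an error on a buffer, release it with the given function, then
 * look the same block up again and report what the next user would see.
 */
int
toy_error_after_release(void (*put)(struct toy_buf *))
{
	struct toy_buf	*bp = toy_getbuf(7);

	bp->b_error = EIO;	/* e.g. a failed read nobody cleared */
	put(bp);

	bp = toy_getbuf(7);	/* cache hit: buffer state is unchanged */
	return bp->b_error;
}
```

Calling toy_error_after_release(toy_putbuf_leaky) hands back the stale EIO on
the second lookup, while toy_error_after_release(toy_putbuf) hands back 0,
which is the effect that zeroing bp->b_error in the release path is meant to
have.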