To: Brian Foster <bfoster@xxxxxxxxxx>
Subject: Re: fs corruption exposed by "xfs: increase prealloc size to double that of the previous extent"
From: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Date: Sun, 16 Mar 2014 02:39:31 +0000
Cc: xfs@xxxxxxxxxxx, Dave Chinner <dchinner@xxxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140316022105.GQ18016@xxxxxxxxxxxxxxxxxx>
References: <20140315210216.GP18016@xxxxxxxxxxxxxxxxxx> <20140316022105.GQ18016@xxxxxxxxxxxxxxxxxx>
Sender: Al Viro <viro@xxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Sun, Mar 16, 2014 at 02:21:05AM +0000, Al Viro wrote:
> On Sat, Mar 15, 2014 at 09:02:16PM +0000, Al Viro wrote:
> > And that's essentially what makes generic/263 complain.  Note, BTW, that
> > fallocate and hole-punching is irrelevant - test in generic/263 steps into
> > those, but the same thing happens with these operations disabled (by -F -H).
> > 
> > I've found the thread from last June where you've mentioned generic/263
> > regression; AFAICS, Dave's comments there had been wrong...
> BTW, experimenting with that thing shows that junk in the tail of the page
> actually comes from some unused sectors on the same device.  So it's an
> information leak at the very least - I have seen it pick bits and pieces of
> previously removed files that way.

Hrm...  s/unused/not zeroed out/, actually - block size is 4K.  So we have
an empty file extended by ftruncate(), then mmap+msync+munmap in its tail,
then an O_DIRECT write starting a couple of blocks prior to EOF and
extending the file by ~15 blocks.  The new EOF is 2.5KB past the beginning
of the (new) last block.  Then the file is closed.  The remaining 1.5KB of
that last block is _not_ zeroed out; moreover, a page fault on that page
ends up reading the entire block, and the junk in the tail does not get
zeroed out in the in-core copy either.  Interesting...
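
For anyone wanting to poke at this by hand, here is a rough sketch of the
sequence described above, in Python for brevity.  It is not what fsx in
generic/263 actually runs: the function names and every default offset
(10-block file, write starting 2 blocks before EOF, and so on) are made up
for illustration, and writing through the mapping before msync is an
assumption.  It also assumes Linux, a 4K page size, and a device that
accepts 512-byte-aligned O_DIRECT I/O.  It only replays the operations and
computes how much of the last block lies past EOF; it does not inspect the
stale tail itself.

```python
import mmap
import os

BLOCK = 4096  # block size given in the message


def tail_past_eof(eof, block=BLOCK):
    """Bytes of the last block that lie beyond EOF (0 if EOF is block-aligned)."""
    return (block - eof % block) % block


def replay(path, old_blocks=10, back=2, grow=15, into_last=2560):
    """Replay the described sequence; all default offsets are hypothetical."""
    old_eof = old_blocks * BLOCK
    start = old_eof - back * BLOCK               # a couple of blocks before EOF
    new_eof = old_eof + grow * BLOCK + into_last  # 2.5K into the new last block
    length = new_eof - start

    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.ftruncate(fd, old_eof)                # empty file extended by ftruncate()
        with mmap.mmap(fd, BLOCK, offset=old_eof - BLOCK) as m:
            m[:] = b"M" * BLOCK                  # assumed: dirty the tail page
            m.flush()                            # msync()
        # leaving the with-block unmaps the page (munmap())
    finally:
        os.close(fd)

    # O_DIRECT wants an aligned buffer; an anonymous mmap is page-aligned.
    buf = mmap.mmap(-1, length)
    buf[:] = b"D" * length
    dfd = os.open(path, os.O_RDWR | os.O_DIRECT)
    try:
        written = os.pwrite(dfd, buf, start)     # O_DIRECT write past old EOF
    finally:
        os.close(dfd)                            # "then the file is closed"
        buf.close()
    return written, os.path.getsize(path)
```

With these defaults the new EOF is 25 blocks plus 2560 bytes, so
tail_past_eof() gives 1536, i.e. the 1.5KB that should read back as zeroes.
Calling replay() on a file on the filesystem under test and then examining
the last block (e.g. with xfs_io or a small C program that maps the page)
would be the next step.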

