On Mon, Nov 08, 2010 at 07:26:00PM -0500, Christoph Hellwig wrote:
> Since the move to the new truncate sequence we call xfs_setattr to
> truncate down excessively instantiated blocks. As shown by the testcase
> in kernel.org BZ #22452 that doesn't work too well. Due to the confusion
> between the internal inode size and the VFS inode i_size, it zeroes data
> that it shouldn't.
> But full-blown truncate seems like overkill here. We only instantiate
> delayed allocations in the write path, and given that we never released
> the iolock we can't have converted them to real allocations yet either.
> The only nasty case is pre-existing preallocation which we need to skip.
> The patch below does that by borrowing code from xfs_aops_discard_page.
> It does pass xfstests for 4k block filesystems and fixes the original
> bug. I'm not quite sure if we could hit a corner case with smaller
> block sizes when parts of a page are preallocated and some not.
Seems likely - preallocated blocks past EOF are not unusual.
> That could be handled by looping around bmapi as long as we find extents
> for our range. The path could probably also be refactored to share
> code with xfs_aops_discard_page. And we probably need the ilock
> just as in that path, but I only got to that when almost through
> xfstests, and the day is over for me today, so let's just get the
> patch out for now.
Yes, definitely need the ilock - that's the only lock that provides
protection for the extent tree.
It looks to me that we need a general "discard delalloc blocks from
range" function - I'll write one (basically the guts of
xfs_aops_discard_page) and convert xfs_aops_discard_page() and this
code to use it....
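
For illustration only, here is a rough userspace model of what a
"discard delalloc blocks from range" helper might look like. It is a
sketch under stated assumptions, not XFS code: the per-block state
array, struct toy_inode, toy_lookup_extent() (standing in for the
extent lookup) and toy_discard_delalloc_range() are all hypothetical
names. It shows the three points raised above: loop for as long as
extents are found in the range, skip pre-existing preallocation, and
require the extent-tree lock to be held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: one state per filesystem block. Not XFS data structures. */
enum blk_state { BLK_HOLE, BLK_DELALLOC, BLK_PREALLOC, BLK_REAL };

struct toy_inode {
	bool		ilock_held;	/* models the lock protecting the extent map */
	size_t		nblocks;	/* number of entries in map[] */
	enum blk_state	*map;		/* per-block allocation state */
};

/*
 * Hypothetical stand-in for an extent lookup: report the state of the
 * block at 'start' and return the length of the run of blocks sharing
 * that state, clipped to 'end' (exclusive).
 */
static size_t toy_lookup_extent(const struct toy_inode *ip, size_t start,
				size_t end, enum blk_state *state)
{
	size_t len = 1;

	*state = ip->map[start];
	while (start + len < end && ip->map[start + len] == *state)
		len++;
	return len;
}

/*
 * Discard every delalloc block in [start, end) and return how many
 * blocks were discarded. Preallocated and real blocks are left alone.
 * The caller must hold the lock, since the extent map is being changed.
 */
static size_t toy_discard_delalloc_range(struct toy_inode *ip, size_t start,
					 size_t end)
{
	size_t discarded = 0;

	assert(ip->ilock_held);
	assert(start <= end && end <= ip->nblocks);

	/* Keep looking up extents until the whole range is covered. */
	while (start < end) {
		enum blk_state state;
		size_t len = toy_lookup_extent(ip, start, end, &state);

		if (state == BLK_DELALLOC) {
			for (size_t i = 0; i < len; i++)
				ip->map[start + i] = BLK_HOLE;
			discarded += len;
		}
		/* Anything else (hole, prealloc, real) is skipped. */
		start += len;
	}
	return discarded;
}
```

In this model a range that mixes delalloc and preallocated blocks, as
could happen with block sizes smaller than the page size, is handled
by the loop instead of by a single lookup.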