fallocate bug?

Zhu Han schumi.han at gmail.com
Tue May 8 00:10:55 CDT 2012


On Tue, May 8, 2012 at 12:40 PM, Dave Chinner <david at fromorbit.com> wrote:

> On Tue, May 08, 2012 at 11:24:52AM +0800, Zhu Han wrote:
> > On Tue, May 8, 2012 at 7:59 AM, Dave Chinner <david at fromorbit.com> wrote:
> >
> > > On Mon, May 07, 2012 at 08:44:17PM +0800, Zhu Han wrote:
> > > > Seems like xfs of CentOS 6.X occupies much more storage space than desired
> > > > if fallocate is used against the file. Here are the steps to reproduce it:
> > >
> > > Your test case is not doing what you think it is doing.
> >
> > Thanks for pointing it out.
> >
> > > > By the way, is it normal when the file is moved around after the
> > > > preallocated region is filled with data?
> > > >
> > > > $ uname -r
> > > > 2.6.32-220.7.1.el6.x86_64
> > > >
> > > > $ fallocate -n --offset 0 -l 1G file    ---->Write a little more data than the preallocated size
> > > >
> > > > $ xfs_bmap -p -vv file
> > > > file:
> > > >  EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET              TOTAL FLAGS
> > > >    0: [0..2097151]:    2593408088..2595505239 21 (29420144..31517295) 2097152 10000
> > > >
> > > > $ dd if=/dev/zero of=/tmp/file bs=1M count=1026 conv=fsync
> > >
> > > That does a truncate first, removing all the preallocated space. Use
> > > conv=notrunc to avoid this. Hence the space allocated by this
> > > new write is different to the space allocated by the above
> > > preallocation. The file has not been moved, the filesystem just did
> > > what you asked it to do.
> > >
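
For reference, the corrected write step could look roughly like the sketch below. It assumes GNU dd; overwrite_in_place is a hypothetical helper name, not something from the original test. Whether the written extents then match the preallocated ones still has to be checked with xfs_bmap.

```shell
#!/bin/sh
# Sketch only: overwrite_in_place is a hypothetical helper name.
# Assumes GNU dd. $1 = target file, $2 = number of 1 MiB blocks to write.
overwrite_in_place() {
    # conv=notrunc stops dd from truncating the output file before writing;
    # conv=fsync flushes data and metadata before dd exits.
    dd if=/dev/zero of="$1" bs=1M count="$2" conv=notrunc,fsync 2>/dev/null
}

# Intended sequence on the XFS file from the report (not run here):
#   fallocate -n --offset 0 -l 1G file
#   overwrite_in_place file 1026
#   xfs_bmap -p -vv file
```

Without conv=notrunc, dd opens the output with truncation, which is what discarded the preallocated extent in the original run.
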
> > > >
> > > > $ xfs_bmap -p -vv file
> > > > file:
> > > >  EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET              TOTAL FLAGS
> > > >    0: [0..4194303]:    2709184016..2713378319 22 (23101408..27295711) 4194304 00000
> > >
> > > And so now you've triggered the speculative delayed allocation
> > > beyond EOF, which is normal behaviour. Hence there are currently
> > > unused blocks beyond EOF which will get removed either when the next
> > > close(fd) occurs on the file or the inode is removed from the cache.
> > >
> >
> > close(fd) should have been called before dd exits. So why are the extra
> > blocks beyond EOF not freed?
>
> The removal is conditional on how many times the fd has been closed
> with dirty data on the inode.
>
> > The only way I found to remove the extra blocks is to truncate the file to
> > its real size.
>
> If the close() didn't remove them, they will be removed when the
> inode ages out of the cache. Why do you even care about them?
>

Our distributed system relies on the real length of each file to account for
space usage. This behavior makes the accounting inaccurate.
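
To make the mismatch concrete, here is a minimal sketch of how the two numbers can be compared, and of the truncate workaround mentioned earlier in this thread. It assumes GNU coreutils (stat -c, truncate -s); usage_report and trim_to_eof are hypothetical helper names used only for illustration.

```shell
#!/bin/sh
# Sketch only: usage_report and trim_to_eof are hypothetical helper names.
# Assumes GNU coreutils (stat -c, truncate -s).

# Print the apparent size (file length) and the space actually allocated.
usage_report() {
    size=$(stat -c %s "$1")      # file length in bytes
    blocks=$(stat -c %b "$1")    # number of allocated blocks
    unit=$(stat -c %B "$1")      # bytes per block as counted by %b
    echo "apparent=$size allocated=$((blocks * unit))"
}

# Truncate the file to its current length. Per the report earlier in this
# thread, this released the blocks beyond EOF; the length itself is unchanged.
trim_to_eof() {
    truncate -s "$(stat -c %s "$1")" "$1"
}
```

Running usage_report on the file before and after trim_to_eof shows whether the allocated figure drops back toward the apparent size.
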


>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david at fromorbit.com
>