Re: [FAQ] XFS speculative preallocation

To: Arkadiusz Miśkiewicz <arekm@xxxxxxxx>
Subject: Re: [FAQ] XFS speculative preallocation
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Sat, 22 Mar 2014 10:16:17 +1100
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <201403211809.03683.arekm@xxxxxxxx>
References: <20140321162920.GA3087@xxxxxxxxxxxxxx> <201403211809.03683.arekm@xxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Fri, Mar 21, 2014 at 06:09:03PM +0100, Arkadiusz Miśkiewicz wrote:
> On Friday 21 of March 2014, Brian Foster wrote:
> > Hi all,
> > 
> > Eric had suggested we add an FAQ entry for speculative preallocation
> > since it seems to be a common question, so I offered to write something
> > up. I started with a single entry but split it into a couple Q's when it
> > turned into TL;DR fodder. ;)
> > 
> > The text is embedded below for review. Thoughts on the questions or
> > content are appreciated. Also, once folks are Ok with this... how does
> > one gain edit access to the wiki?
> More questions or topics that can be converted to questions from me:
> 1) Before speculative preallocation the kernel did things differently.
> AFAIK it wasn't the same as allocsize=64k, was it? Is there a way to get
> the old behaviour or something similar to it?

The old behaviour is exactly that of allocsize=64k.

> > modified to not interfere with ongoing
> > writes.
> In the case of some app that constantly writes to files (apache web server
> writing to its logs, for example) the background trimming will never do
> anything for these files, right?

If the inode is being constantly dirtied, then the speculative
prealloc will not be removed by the background scanner. It only
removes prealloc from clean inodes.
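
One rough way to see whether a given file is currently holding space
beyond its size is to compare its allocated blocks against its apparent
size. A minimal Python sketch, assuming st_blocks is reported in 512-byte
units (as on Linux); the helper name post_eof_bytes is made up for
illustration and is not part of any XFS tooling:

```python
import os


def post_eof_bytes(path):
    """Return how many allocated bytes exceed the file's apparent size.

    Returns 0 for sparse files, where less space is allocated than the
    size suggests.
    """
    st = os.stat(path)
    allocated = st.st_blocks * 512  # st_blocks counts 512-byte units
    return max(0, allocated - st.st_size)


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, post_eof_bytes(name))
```

A result smaller than one filesystem block is just rounding. A large
value only says that extra space is allocated; it cannot tell speculative
prealloc apart from space reserved explicitly with fallocate.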

> > A 5 minute scan interval is used by default and can be adjusted
> > via the following file (value in seconds):
> > 
> >     /proc/sys/fs/xfs/speculative_prealloc_lifetime
> > 
> > Although speculative preallocation can lead to reports of excess space
> > usage, the preallocated space is not permanent unless explicitly made so
> > via fallocate or a similar interface. Preallocated space can also be
> > encoded permanently in situations where file size is extended beyond a
> > range of post-EOF blocks (i.e., via truncate). Otherwise, preallocated
> > blocks are reclaimed on file close, inode reclaim, unmount or in the
> > background once file write activity subsides.
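
For reference, the scan interval mentioned above can be read with a few
lines of Python. This is only a sketch: the path is the one quoted in the
FAQ text, the function name read_prealloc_lifetime is hypothetical, and
the file is assumed to hold a single integer number of seconds.

```python
from pathlib import Path

# Path as quoted in the FAQ text; present only where the tunable exists.
LIFETIME_PATH = Path("/proc/sys/fs/xfs/speculative_prealloc_lifetime")


def read_prealloc_lifetime(path=LIFETIME_PATH):
    """Return the scan interval in seconds, or None if the file is absent."""
    try:
        return int(path.read_text().strip())
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    print(read_prealloc_lifetime())
```

Changing the interval means writing a new number of seconds to the same
file, which normally requires root.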
> So there is no mechanism that would shrink preallocations when free
> space is (almost or) gone on a fs?

The background space trimmer takes care of that. We could probably also
trigger it on ENOSPC, but once you are already at ENOSPC it's too late.

> Case: apache causes xfs to preallocate several GB for its
> /var/..../{access,error}_log (common problem here) and then free space
> runs out on that fs, causing problems for every app that writes to /var.

Your log files would already have to be GBs in size for your apache
logs to preallocate that much. If your log files are that big, then
/var needs to be much, much larger than what the speculative prealloc
for a handful of files could easily exhaust.


Dave Chinner
