Re: xfs: add FITRIM support

To: Michael Monnerie <michael.monnerie@xxxxxxxxxxxxxxxxxxx>
Subject: Re: xfs: add FITRIM support
From: Lukas Czerner <lczerner@xxxxxxxxxx>
Date: Thu, 6 Jan 2011 09:33:54 +0100 (CET)
Cc: xfs@xxxxxxxxxxx, Dave Chinner <david@xxxxxxxxxxxxx>, Christoph Hellwig <hch@xxxxxxxxxxxxx>, Lukas Czerner <lczerner@xxxxxxxxxx>
In-reply-to: <201101060910.34534@xxxxxx>
References: <20101125112304.GA4195@xxxxxxxxxxxxx> <201101052307.38379@xxxxxx> <20110105225039.GD8322@dastard> <201101060910.34534@xxxxxx>
User-agent: Alpine 2.00 (LFD 1167 2008-08-23)
On Thu, 6 Jan 2011, Michael Monnerie wrote:

> On Wednesday, 5 January 2011 Dave Chinner wrote:
> > No state or additional on-disk
> > structures are needed for xfs_fsr to do its work....
> 
> That's not exactly the same - once you have defragmented a file, you
> know it's done, and can skip it next time. But you don't know whether
> the (free) space between block 0 and 20 on disk has been rewritten
> since the last trim run or not used at all, so you'd have to do it all
> again.
>  
> > The background trim is intended to enable even the slowest of
> > devices to be trimmed over time, while introducing as little runtime
> > overhead and complexity as possible. Hence adding complexity and
> > runtime overhead to optimise background trimming tends to defeat the
> > primary design goal....
> 
> It would be interesting to have real world numbers to see what's "best".
> I'd imagine a normal file or web server to store tons of files that are
> mostly read-only, while 5% of them are used a lot, as well as lots of
> temp files. For this, knowing what's been used would be great.
> 
> Also, I'm thinking of a NetApp storage that has been set up to run
> deduplication on Sunday. It's best to run trim on Saturday, and it
> should be finished before Sunday. For big storages that might not be
> easy to finish, if all disk space has to be freed explicitly.
> 
> And wouldn't it still be cheaper to keep a "written bmap" than to run
> over the full space of a (big) disk? I'd say it depends on the workload.
> 

I have already investigated the approach of storing information about
blocks freed since the last trim. However, I found it not that useful,
for several reasons.

1. Bitmaps are big. Especially on the huge filesystems you are talking
about, a bitmap would significantly increase memory utilization.

2. An rbtree might be better, but there is a threshold we need to
watch: when the tracked space gets really fragmented, the tree can
become bigger than the bitmap. Moreover, it adds significant complexity
and, of course, CPU utilization.

3. As I have said several times, we do not need to trim when there have
not been enough writes since the last trim. When the SSD still has
enough spare space, for example for wear leveling, we do not need to
reclaim more, OR we can do it really slowly as a precautionary measure.
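
To put rough numbers on points 1 and 2, here is a back-of-the-envelope
sketch in Python. The 4 KiB block size and the 48 bytes per tracked
extent are assumptions picked for illustration, not measurements from
XFS or from any patch in this thread.

```python
# Illustrative arithmetic only; both constants below are assumptions.
BLOCK_SIZE = 4096        # assumed filesystem block size, in bytes
EXTENT_NODE_BYTES = 48   # assumed in-memory size of one tracked extent
TIB = 1 << 40


def bitmap_bytes(fs_bytes, block_size=BLOCK_SIZE):
    """Memory needed for one bit per filesystem block."""
    blocks = fs_bytes // block_size
    return (blocks + 7) // 8  # round up to whole bytes


def extent_tree_bytes(num_extents, node_bytes=EXTENT_NODE_BYTES):
    """Memory needed to track num_extents freed extents in a tree."""
    return num_extents * node_bytes


def break_even_extents(fs_bytes, block_size=BLOCK_SIZE,
                       node_bytes=EXTENT_NODE_BYTES):
    """Extent count beyond which the tree is bigger than the bitmap."""
    return bitmap_bytes(fs_bytes, block_size) // node_bytes


if __name__ == "__main__":
    fs = 16 * TIB
    print("bitmap:", bitmap_bytes(fs) // (1 << 20), "MiB")
    print("tree outgrows bitmap after", break_even_extents(fs), "extents")
```

Under these assumptions a 16 TiB filesystem needs a 512 MiB bitmap, and
the tree only stays smaller while it tracks fewer than roughly 11
million separate freed extents.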

All that said, we have much more flexibility in user space, and we can
think of lots of different heuristics to determine whether or not to do
the trim, and how.
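
One such user-space heuristic could look roughly like the sketch below.
It is only an illustration: the 5% threshold and the helper names are
made up, the write counter is taken from the "sectors written" field of
/sys/block/<dev>/stat (counted in 512-byte units), and the FITRIM
request number is the value the ioctl encodes to on x86, so treat it as
an assumption on other architectures.

```python
import fcntl
import os
import struct

SECTOR_BYTES = 512    # /sys/block/<dev>/stat counts 512-byte sectors
FITRIM = 0xC0185879   # assumed: _IOWR('X', 121, struct fstrim_range) on x86


def written_bytes(stat_line):
    """Total bytes written, parsed from one line of /sys/block/<dev>/stat."""
    fields = stat_line.split()
    return int(fields[6]) * SECTOR_BYTES  # 7th field: sectors written


def should_trim(written_since_last, device_bytes, threshold=0.05):
    """Hypothetical policy: trim once writes exceed a share of the device."""
    return written_since_last >= threshold * device_bytes


def fitrim(mountpoint, start=0, length=2**64 - 1, minlen=0):
    """Issue FITRIM on a mounted filesystem (needs root).

    Returns the number of bytes the kernel reports as trimmed.
    """
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        # struct fstrim_range: three unsigned 64-bit fields
        arg = struct.pack("=QQQ", start, length, minlen)
        out = fcntl.ioctl(fd, FITRIM, arg)
    finally:
        os.close(fd)
    return struct.unpack("=QQQ", out)[1]
```

A periodic job could remember the counter from its previous run, pass
the difference to should_trim(), and call fitrim() only when that
returns true.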

Thanks!
-Lukas
