
Re: defrag xfs

To: Steve Lord <lord@xxxxxxx>
Subject: Re: defrag xfs
From: Sonny Rao <sonny@xxxxxxxxxxx>
Date: Fri, 21 Jan 2005 14:05:21 -0500
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: <41F11609.4020907@xfs.org>
References: <F62740B0EFCFC74AA6DCF52CD746242D010337FA@iu-mssg-mbx05.exchange.iu.edu> <41F07494.1060501@xfs.org> <20050121043237.GA28699@kevlar.burdell.org> <1106286413.8580.66.camel@kennedy> <20050121054830.GA29637@kevlar.burdell.org> <m18y6n1ede.fsf@muc.de> <20050121070221.GA30287@kevlar.burdell.org> <41F11609.4020907@xfs.org>
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.4.2.1i
On Fri, Jan 21, 2005 at 08:47:37AM -0600, Steve Lord wrote:
> Sonny Rao wrote:
<snip> 
> I did describe how to do this once, but I no longer have that email, so
> I have to recreate.
> 
> 1. Add to the kernel the ability to turn off new allocations to an
>    allocation group. Some special under-the-covers allocations into
>    the group still need to work, though: freeing up the space in the
>    allocation group may require btree splits for the free space
>    btrees, and these must keep working in the interim.
> 
> 2. Find all the directories with inodes or blocks in the allocation
>    group - this requires walking all the extents of all the directory
>    inodes..... so not fast. Note that just because an inode is not in
>    the last allocation group does not mean it has no disk blocks there.
> 
> 3. Recreate these directories from user space with a temp name, link all
>    their contents over to the new directory, switch the names of the
>    two inodes atomically inside the kernel, remove the old links and
>    directory. There needs to be logic to detect new files appearing in
>    the old directory, these need to be renamed to the new parent.
> 
>    There are now only file blocks and inodes in the allocation group.
> 
> 4. Repeat the above process with files whose inode or extents are in
>    the allocation group. If just the inode is there (unlikely), then
>    no need to move blocks. xfs_fsr contains most of the logic for this.
> 
> 5. Fix up the superblock counters so that the allocation group count
>    shrinks. Note this could be applied to several allocation
>    groups at once.
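
The user-space half of step 3 can be sketched roughly as follows. This is only an illustration under assumptions, not Steve's implementation: the function name relink_directory and the ".reloc." suffix are made up here, subdirectories are merely reported (they cannot be hard-linked and would have to be renamed into the new parent instead), and both the atomic name switch and the detection of newly appearing files are left to the kernel support described above.

```python
import os
import tempfile


def relink_directory(old_dir, suffix=".reloc."):
    """Create a sibling directory and hard-link old_dir's files into it.

    Hypothetical helper for illustration only. Returns (new_dir, skipped),
    where skipped lists the subdirectory names that were not linked.
    """
    old_dir = os.path.abspath(old_dir)
    parent = os.path.dirname(old_dir)
    # The new directory gets a fresh inode; with allocations to the target
    # allocation group disabled (step 1), it would land in another group.
    new_dir = tempfile.mkdtemp(prefix=os.path.basename(old_dir) + suffix,
                               dir=parent)
    skipped = []
    with os.scandir(old_dir) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                # Directories cannot be hard-linked; a real tool would
                # rename them into new_dir instead.
                skipped.append(entry.name)
                continue
            # A hard link adds a second name for the same inode, so no
            # file data is copied at this stage.
            os.link(entry.path, os.path.join(new_dir, entry.name))
    return new_dir, skipped
```

After this runs, every file in the old directory has a second name under the new one, so the old links can be removed once the names have been switched.
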
> 
> As Andi pointed out, this results in the inode numbers changing, so
> there is no way to do this while the filesystem is exported. It also
> probably messes with backups - they would need redoing afterwards.
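
The inode numbers have to change because an XFS inode number encodes its allocation group in the high bits. A minimal sketch of that layout, assuming the shift widths come from the superblock fields sb_agblklog and sb_inopblog (xfs_db can print them); the function names and the example values below are illustrative, not taken from a real filesystem:

```python
def make_ino(agno, agbno, offset, agblklog, inopblog):
    """Compose an inode number from AG number, block within the AG,
    and inode index within that block."""
    return (agno << (agblklog + inopblog)) | (agbno << inopblog) | offset


def ino_to_agno(ino, agblklog, inopblog):
    """Recover the allocation group number from an inode number."""
    return ino >> (agblklog + inopblog)


# Illustrative values only: agblklog=20 and inopblog=4 (16 inodes per block).
ino = make_ino(agno=3, agbno=100, offset=5, agblklog=20, inopblog=4)
print(ino_to_agno(ino, agblklog=20, inopblog=4))
```

The same shift is the test step 2 would apply to decide whether an inode lives in the group being emptied. An inode relocated to another group necessarily gets a different number.
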
> 
> There are several months of effort in this to get it all right and
> working robustly.
> 
> Given the low price of storage nowadays, it is a lot cheaper to buy
> another disk than to pay someone to do this. At current rates for
> an experienced xfs developer, you are talking about 120 Gbytes/hour
> at current prices ;-)

I guess I won't be paying for it anytime soon :)
 
> Now, what would be really neat is for a layer underneath the filesystem
> to dynamically detect failing storage (smart?), take some storage from
> a free pool of drives, and remap the filesystem blocks out to the new
> space while it is live.

Hmm, I would think one might be able to do something like this by
writing an EVMS/LVM2 plugin which communicated with smartd and could
begin a migration to another device when an error is detected.  EVMS
already supports dynamic bad-block-relocation.  In reality it's fairly
useless, since modern drives do this for you anyway and won't report
bad writes until they have actually run out of extra space.  But what
you're proposing makes much more sense.
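
As a rough illustration of the idea, and not an actual EVMS/LVM2 plugin, something much cruder could be hung off smartd's "-M exec" directive, which runs a script with SMARTD_DEVICE set in the environment when a problem is reported. The sketch below only builds an LVM2 pvmove command line. It assumes the failing disk is itself the physical volume and that the spare has already been added to the same volume group; the function name and the spare device path are hypothetical.

```python
import os


def build_migration_command(env, spare_pv):
    """Return the pvmove argument list that would move all extents off
    the device smartd reported, onto spare_pv.

    Hypothetical helper: assumes SMARTD_DEVICE names a whole-disk PV and
    that spare_pv is already in the same volume group.
    """
    failing = env.get("SMARTD_DEVICE")
    if not failing:
        raise ValueError("SMARTD_DEVICE is not set; not started by smartd?")
    if failing == spare_pv:
        raise ValueError("spare device is the failing device")
    # pvmove <source PV> <destination PV> relocates extents while the
    # volumes on top stay online.
    return ["pvmove", failing, spare_pv]


if __name__ == "__main__" and "SMARTD_DEVICE" in os.environ:
    # "/dev/sdz" is a placeholder for a spare taken from the free pool.
    print(" ".join(build_migration_command(os.environ, "/dev/sdz")))
```

The script prints the command instead of running it; a real hook would hand the list to subprocess.run after checking that the spare is large enough.
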

Thanks for the explanation.

Sonny

