Re: No local regular files

To: Shailendra Tripathi <stripathi@xxxxxxxxx>
Subject: Re: No local regular files
From: David Chinner <dgc@xxxxxxx>
Date: Mon, 30 Jan 2006 09:06:27 +1100
Cc: Chris Wedgwood <cw@xxxxxxxx>, linux-xfs@xxxxxxxxxxx
In-reply-to: <43D6BE03.1020901@xxxxxxxxx>
References: <Pine.LNX.4.64.0601231054140.11064@xxxxxxxxxxxxxxx> <43D645CD.3030102@xxxxxxxxx> <20060124170154.GA11338@xxxxxxxxxxxxxxxxxxxxx> <43D6BE03.1020901@xxxxxxxxx>
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.4.2.1i
On Wed, Jan 25, 2006 at 05:23:39AM +0530, Shailendra Tripathi wrote:
> Chris Wedgwood wrote:
> 
> >Only for metadata, for regular files I think only reiserfs does this.
> >Doing it probably makes things more complicated, especially for
> >writes/flushes.
> >
> >I think reiserfs unpacks these files when opened for
> >write for this reason (and optionally packs them again when the file
> >is closed).
> >
> It should not be that tricky for XFS though.

I can think of several tricky problems that need to be solved before
this could work.

When do you convert from inline to out-of-line data? During delayed
allocation, when flushing the page? That is, how do you propose to
keep the page cache and the pagebuf coherent, given that we'd now be
caching the same data in two different places, each with different
locking, life cycles and flushing strategies?

How do you handle the data in caches being remapped to different disk
blocks when it gets moved to a different location?

How do you represent the inline data via xfs_bmapi and friends?  Is
the inline data an extent?  How do you represent the data offset on
disk when it's not a multiple of the filesystem block size?
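
To make the offset question concrete, here is a toy sketch in C. It is
not XFS code: the struct and function names are made up for
illustration, and it only assumes what is said above, namely that an
extent addresses disk space in whole filesystem blocks. Under that
assumption, data sitting at some byte offset inside an inode simply has
no valid start block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified extent record; NOT the real XFS structure. */
struct toy_extent {
    uint64_t file_off_blocks;   /* logical offset in the file, in fs blocks */
    uint64_t disk_start_block;  /* physical start on disk, in fs blocks */
    uint64_t block_count;       /* length, in fs blocks */
};

/*
 * Try to describe `len_bytes` of data at `disk_byte_off` as a block-based
 * extent mapping file offset 0. Returns false when the location or length
 * is not a whole number of filesystem blocks, i.e. when this record
 * format cannot express it.
 */
static bool toy_make_extent(uint64_t disk_byte_off, uint64_t len_bytes,
                            uint32_t block_size, struct toy_extent *out)
{
    assert(block_size != 0);

    if (disk_byte_off % block_size != 0 || len_bytes % block_size != 0)
        return false;

    out->file_off_blocks = 0;
    out->disk_start_block = disk_byte_off / block_size;
    out->block_count = len_bytes / block_size;
    return true;
}
```

With a 4096-byte block size, 4096 bytes at byte offset 8192 map to
start block 2, but 100 bytes at byte offset 8192 + 176 (say, somewhere
inside an inode) cannot be described at all without changing what an
extent means.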

You'll need a new transaction for converting inline to out-of-line
format (and vice versa) so that a crash between removing the data
from the inode and writing the new extents won't lose the data in
the inode....
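
The all-or-nothing property this needs can be sketched with a toy
model in C. None of this is XFS code or the XFS transaction API: the
struct, the function and the "crash" flag are hypothetical, and the
single struct assignment stands in for an atomic transaction commit.
The point is only that both changes (new extent populated, inline copy
removed) become visible together or not at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_INLINE_MAX 64

/* Hypothetical inode model: data lives either inline or in one "extent". */
struct toy_inode {
    char   inline_data[TOY_INLINE_MAX];
    size_t inline_len;
    bool   has_extent;
    char   extent_data[TOY_INLINE_MAX];
    size_t extent_len;
};

/*
 * Convert inline data to out-of-line form. All changes are staged on a
 * private copy and only become visible at the final assignment, which
 * models the commit. If we "crash" before that point, the caller's inode
 * is untouched and the inline data is still there.
 */
static bool toy_convert_to_extent(struct toy_inode *ip, bool crash_before_commit)
{
    struct toy_inode staged = *ip;

    assert(ip->inline_len <= TOY_INLINE_MAX);

    /* Step 1: write the data out of line. */
    memcpy(staged.extent_data, staged.inline_data, staged.inline_len);
    staged.extent_len = staged.inline_len;
    staged.has_extent = true;

    /* Step 2: remove the data from the inode. */
    memset(staged.inline_data, 0, sizeof(staged.inline_data));
    staged.inline_len = 0;

    if (crash_before_commit)
        return false;   /* nothing staged ever reaches *ip */

    *ip = staged;       /* commit: both steps land together */
    return true;
}
```

Doing the two steps directly on the live inode, without the staged
copy, is the failure case described above: a crash after step 2 but
before step 1 is durable loses the data.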

Also, consider tail pushing the AIL. If there's data in the inode,
your tail push now has to pull data out of the page cache for each
inode in the cluster so that stale data is not written back. I can
see potential for some nasty inode locking problems there.

That's just off the top of my head - I'm sure there's more..... ;)

Cheers,

Dave.
-- 
Dave Chinner
R&D Software Engineer
SGI Australian Software Group

