|To:||Dave Chinner <david@xxxxxxxxxxxxx>|
|Subject:||Re: I/O hang, possibly XFS, possibly general|
|From:||Phil Karn <karn@xxxxxxxxxxxx>|
|Date:||Thu, 2 Jun 2011 19:11:15 -0700|
|Cc:||Paul Anderson <pha@xxxxxxxxx>, Linux fs XFS <xfs@xxxxxxxxxxx>|
|References:||<BANLkTim_BCiKeqi5gY_gXAcmg7JgrgJCxQ@xxxxxxxxxxxxxx> <19943.56524.969126.59978@xxxxxxxxxxxxxxxxxx> <BANLkTim978GhfamN=TEFULP5GdfMu02-7w@xxxxxxxxxxxxxx> <4DE823DD.7060600@xxxxxxxxxxxx> <20110603003907.GW561@dastard>|
On Thu, Jun 2, 2011 at 5:39 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
Oh, I'm well aware of delayed allocation. I've just noticed that, in my experience, it doesn't seem to work nearly as well as fallocate(). And why should it? If you know in advance how big a file you're writing, how can it hurt to inform your file system? I suppose the FS implementer could always ignore that information if he felt he could somehow do a better job, but it's hard to see how. Isn't it always better to know than to guess?
I'm talking here about the genuine fallocate() system call, not the POSIX hack that falls back to first conventionally writing zeroes over the file. The true fallocate() call seems very fast, and if your file system doesn't support it then it will simply fail without harm. I still can't see any reason not to use it.
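To make the distinction concrete, here is a minimal sketch of calling the native fallocate() directly rather than going through the POSIX wrapper. It assumes Linux, a libc that exports fallocate(), and a 64-bit off_t; the helper name try_fallocate is made up for illustration. If the file system does not support the call, the helper just reports that and nothing has been written.

```python
import ctypes
import errno
import os

# Assumption: Linux, libc exports fallocate(2), off_t is 64 bits wide.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                            ctypes.c_int64, ctypes.c_int64]
_libc.fallocate.restype = ctypes.c_int


def try_fallocate(fd: int, length: int) -> bool:
    """Ask the file system to reserve `length` bytes starting at offset 0.

    Returns True on success and False if the file system does not
    support the call. Hypothetical helper, not part of any library.
    """
    if _libc.fallocate(fd, 0, 0, length) == 0:  # mode 0: plain preallocation
        return True
    err = ctypes.get_errno()
    if err in (errno.EOPNOTSUPP, errno.ENOSYS):
        return False  # unsupported: no data was written, carry on as usual
    raise OSError(err, os.strerror(err))
```

A caller would open the output file, call try_fallocate(fd, expected_size), and then write normally whether or not the reservation succeeded.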
I did know that xfs can avoid the disk allocation and writes entirely when the files are short-lived, but Paul was talking about writing large, long-lived files so that's what I had in mind. And when I use fallocate(), my files are not likely to be short-lived either. Like most people I write the vast majority of my short-lived files to /tmp, which is tmpfs, not xfs.
But you do raise an interesting point -- is there any serious performance degradation from using fallocate() on a short-lived file? The written data still lives in the buffer cache for a while, so if you delete the file before it gets flushed the disk writes will still be avoided. The file system may have a little extra work to undo the unnecessary allocation but that doesn't seem to be a big deal.
> Basically you are removing one of the major IO optimisation
"Remove" it? How is giving it the correct answer worse than letting it guess -- even if it usually guesses correctly?
I still rely on preallocation to keep log files and mailboxes from getting too badly fragmented.
> So you don't have any idea of how well XFS minimises fragmentation
> without needing to use preallocation? Sounds like you have a classic
As I said, I've tried it both ways. I found that the simple act of adding fallocate() to rsync (which I use for practically all copying) vastly reduces xfs fragmentation. Just as I expected it would.
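The idea is simple enough to sketch. The following is not rsync's code, only an illustration of the same approach in a plain file copy: reserve the destination's full size before writing the first byte. It assumes Linux, a libc that exports fallocate(), and a 64-bit off_t; copy_preallocated is a hypothetical name.

```python
import ctypes
import os
import shutil

# Assumption: Linux, libc exports fallocate(2), off_t is 64 bits wide.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                            ctypes.c_int64, ctypes.c_int64]
_libc.fallocate.restype = ctypes.c_int


def copy_preallocated(src: str, dst: str) -> bool:
    """Copy src to dst, reserving dst's full size before writing.

    Returns True if the reservation succeeded. Any fallocate() failure
    is ignored and the copy proceeds as an ordinary one.
    """
    size = os.path.getsize(src)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # Tell the file system the final size up front (best effort).
        reserved = size > 0 and _libc.fallocate(fout.fileno(), 0, 0, size) == 0
        shutil.copyfileobj(fin, fout, 1 << 20)  # copy in 1 MiB chunks
    return reserved
```

Because the size is known before any data is written, the allocator can look for one contiguous region instead of growing the file piece by piece.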
Maybe I'm a little more sensitive to fragmentation than most because I've been experimenting with storing SHA1 hashes of all my files in extended attributes. This grew out of a data deduplication tool; at first I simply cached the hashes so I wouldn't have to recompute them on another run, but then I just added them to every file. This lets me get a warm and fuzzy feeling by periodically verifying that my files haven't been corrupted, especially since I began using SSDs with trim tools.
XFS stores both attributes and extent lists directly in the inode when there's room, and it turns out that a default-sized xfs inode can store my hashes provided that the extent list is small. So now, when I walk through my file system statting everything, I can read the hashes too at absolutely no extra cost. This makes deduplication really fast.
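A rough sketch of the caching scheme, for anyone who wants to try it. The attribute name user.sha1 and the choice to store the raw 20-byte digest are assumptions for illustration, not necessarily what my tool does, and the function names are made up. It assumes Linux, where Python exposes os.getxattr and os.setxattr.

```python
import hashlib
import os

ATTR = "user.sha1"  # hypothetical attribute name, chosen for this sketch


def sha1_of(path: str) -> bytes:
    """Compute the SHA-1 digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.digest()


def cached_sha1(path: str) -> bytes:
    """Return the file's SHA-1, using the extended attribute as a cache."""
    try:
        return os.getxattr(path, ATTR)  # cheap if the attribute is in the inode
    except OSError:
        pass  # no cached hash yet, or no xattr support on this file system
    digest = sha1_of(path)
    try:
        os.setxattr(path, ATTR, digest)  # raw 20 bytes keeps the attribute small
    except OSError:
        pass  # could not store it; the computed digest is still returned
    return digest
```

This sketch does not notice when a file changes after its hash was cached; a real tool would also need some way to detect stale entries. Verification is then just comparing sha1_of(path) against the stored value.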
I haven't experimented to see how many extents a file can have before the attributes get pushed out of the inode, but by keeping most everything contiguous I simply avoid the problem.