On Tue, Aug 09, 2011 at 12:10:48PM +0200, Michael Monnerie
> First of all, please calm down. Getting personal is not bringing us
Well, it's not me who's getting personal, so...?
> > Logic error - if I can corrupt an XFS without special privileges then
> > this is not a problem with xfs_fsr, but simply a kernel bug in the
> > xfs code. And a rather big one, one step below a remote exploit.
> No, it's not a kernel bug because as long as you don't use xfs_fsr,
> nothing will ever happen.
"As long as you don't boot, it will not crash".
xfs_fsr uses syscalls, just like other applications. According to your
(wrong) logic, if an application uses chown and this causes a kernel oops,
this is also not a kernel bug.
That's of course wrong - it's the kernel that crashes when an application
performs certain access patterns.
> and sometimes also
As has been reported on this list, this option is really harmful on
current xfs - in my case, it led to xfs causing ENOSPC even when the disk
was 40% empty (~188gb).
> and I can't find evidence for fragmentation that would be harmful. Yes
Well, define "harmful" - slow logfile reads aren't what I consider
"harmful" either. It's just very very slow.
> The allocsize option helps a lot there. I looked at one webserver access
> log, it has 640MB with 99 fragments, but that's not a lot. On our
> Spamgate I see 250MB logs with 374 fragments.
Well, if it were one fragment, you could read that in 4-5 seconds; at 374
fragments, it's probably around 6-7 seconds. That's not harmful, but if you
extrapolate this to a few gigabytes and a lot of files, it becomes quite
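
As a rough back-of-the-envelope sketch of the estimate above for the 250MB
log: the model is sequential transfer time plus one seek per additional
fragment. The throughput (55 MB/s) and seek time (6 ms) are assumed values
picked to match the 4-5 second figure, not measurements, and the function
name is made up for illustration.

```python
def estimated_read_seconds(size_mb, fragments,
                           throughput_mb_s=55.0, seek_s=0.006):
    """Estimate the time to read a file front to back.

    Model: sequential transfer time plus one seek for every fragment
    after the first. Both default parameters are assumptions.
    """
    if fragments < 1:
        raise ValueError("a non-empty file has at least one fragment")
    transfer = size_mb / throughput_mb_s
    seeks = (fragments - 1) * seek_s
    return transfer + seeks


if __name__ == "__main__":
    # Compare the unfragmented case with the 374-fragment case.
    for frags in (1, 374):
        secs = estimated_read_seconds(250, frags)
        print(f"250 MB in {frags} fragment(s): about {secs:.1f} s")
```

In this model the seek term grows linearly with the fragment count, so the
penalty scales with the number of fragments across all files, not with
their size.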
> don't use the allocsize option there, which I changed now that I looked
That allocsize option is no longer reasonable with newer kernels, as the
kernel will reserve 64m diskspace even for 1kb files indefinitely.
> > If XFS is bad at append-only workloads, which is the most common type
> > of workload, then XFS fails to be very relevant for the real world.
> may be valid for your world, not mine. We have webservers, fileservers
> and database servers, all of which are not really append style, but more
If you find a way of recreating files without appending to them, let me
The problem with fragmentation is that it happens even for a few writers
for "create file" workloads (which do append...).
You probably make a distinction between "writing a file fast" and "writing
a file slow", but the distinction is not a qualitative difference. On busy
servers that create a lot of files, you get fragmentation the same way
as on less busy servers that write files slower. There is little to no
difference in the resulting patterns.
> Well, db-servers are rather exceptional here.
Yes, append style is what makes up the vast majority of disk writes on
a normal system, db-servers excepted indeed.
> But if the numbers for fragmentation on your servers are true, you must
> have a very good test case for fragmentation prevention. Therefore it
> could be really interesting if you could grab what Dave Chinner asked
I'll keep it in mind.
> And maybe he could use it for optimizations. Is there any tool on Linux
> to record such I/O patterns?
I presume strace would do, but that's where the "lot of work" comes in. If
there is a ready-to-use tool, that would of course make it easy.
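
To give an idea of the post-processing involved, here is a minimal sketch
that summarizes write calls per file descriptor from an strace log. It
assumes the log was captured with something like
"strace -f -e trace=write,pwrite64 -o trace.log <command>", and the sample
lines are hand-written for illustration, not captured output. Lines split
by strace into "unfinished"/"resumed" pairs are simply skipped, so this is
not a complete parser.

```python
import re
from collections import defaultdict

# Optional leading pid (present with -f -o), syscall name, fd, return value.
WRITE_RE = re.compile(
    r"^(?:\d+\s+)?(write|pwrite64)\((\d+),.*\)\s+=\s+(-?\d+)"
)

# Hand-written example lines in strace's usual output shape.
SAMPLE = [
    '1234  write(3, "hello\\n", 6) = 6',
    '1234  write(3, "abcd", 4) = 4',
    '1235  pwrite64(4, "\\0\\0"..., 4096, 8192) = 4096',
    '1234  write(3, "x", 1) = -1 ENOSPC (No space left on device)',
]


def summarize_writes(lines):
    """Return {fd: [call_count, bytes_written]} for successful writes."""
    stats = defaultdict(lambda: [0, 0])
    for line in lines:
        m = WRITE_RE.match(line)
        if not m:
            continue  # other syscalls, signals, unfinished/resumed lines
        fd, result = int(m.group(2)), int(m.group(3))
        if result < 0:
            continue  # failed call, nothing was written
        stats[fd][0] += 1
        stats[fd][1] += result
    return dict(stats)


if __name__ == "__main__":
    for fd, (calls, nbytes) in sorted(summarize_writes(SAMPLE).items()):
        print(f"fd {fd}: {calls} write call(s), {nbytes} bytes")
```

Mapping the descriptors back to file names and keeping the timing between
writers is the part that still needs real work.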
The choice of a Deliantra, the free code+content MORPG
-----==- _GNU_ http://www.deliantra.net
----==-- _ generation
---==---(_)__ __ ____ __ Marc Lehmann
--==---/ / _ \/ // /\ \/ / schmorp@xxxxxxxxxx