Re: raid10n2/xfs setup guidance on write-cache/barrier

To: Linux RAID <linux-raid@xxxxxxxxxxxxxxx>, Linux fs XFS <xfs@xxxxxxxxxxx>
Subject: Re: raid10n2/xfs setup guidance on write-cache/barrier
From: pg@xxxxxxxxxxxxxxxxxxx (Peter Grandi)
Date: Sun, 18 Mar 2012 19:17:16 +0000
In-reply-to: <20120318140051.GA2594@xxxxxxxxxxxxx>
References: <CAA8mOyDKrWg0QUEHxcD4ocXXD42nJu0TG+sXjC4j2RsigHTcmw@xxxxxxxxxxxxxx> <4F61803A.60009@xxxxxxxxxxxxxxxxx> <CAA8mOyCzs36YD_QUMq25HQf8zuq1=tmSTPjYdoFJwy2Oq9sLmw@xxxxxxxxxxxxxx> <20321.63389.586851.689070@xxxxxxxxxxxxxxxxxx> <4F64115D.50208@xxxxxxxxxxxxxxxxx> <20120317223454.GQ5091@dastard> <20325.17387.216150.45013@xxxxxxxxxxxxxxxxxx> <20325.50714.237894.328039@xxxxxxxxxxxxxxxxxx> <20120318140051.GA2594@xxxxxxxxxxxxx>
[ ... ]

>>>  «There have been decent but no major improvements in XFS
>>>  metadata *performance*, but weaker implicit *semantics*
>>>  have been made an option, and these have a different
>>>  safety/performance tradeoff (less implicit safety, somewhat
>>>  more performance), not "just" better performance.»

>> I have left implicit a point that perhaps should be explicit: I
>> think that XFS metadata performance before 'delaylog' was pretty
>> good, and that it has remained pretty good with 'delaylog'.

> For many workloads it absolutely wasn't.

My self-importance is not quite so huge that I feel able to just
say «absolutely wasn't» to settle points once and for all.

So I would rather argue (and I did in a different form) that for
some workloads 'nobarrier'+'hdparm -W1' or 'eatmydata' have the
most desirable tradeoffs, and for many others the safety/speed
tradeoff of 'delaylog' is more appropriate (so for example I
think that making it the default is reasonable if a bit edgy).

But I would also argue that, as the already quoted document
makes very clear, 'delaylog' overall increases unsafety, and it
is only thanks to this that latency and time to completion are
better:

http://lwn.net/Articles/476267/
http://www.mjmwired.net/kernel/Documentation/filesystems/xfs-delayed-logging-design.txt
    [ ... ] In other
    words, instead of there only being a maximum of 2MB of transaction changes not
    written to the log at any point in time, there may be a much greater amount
    being accumulated in memory. Hence the potential for loss of metadata on a
    crash is much greater than for the existing logging mechanism.

That's why my argument was that performance without 'delaylog'
was good: given the safer semantics, it was quite good.

It was just perhaps not the semantics tradeoff that some people
wanted in some cases, and I think that it is cheeky marketing to
describe something involving a much greater «potential for loss
of metadata» baldly as better performance, with no qualification,
as then one could argue that 'eatmydata' gives the best
"performance".
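
The quoted passage can be made concrete with a toy model. This is
only a sketch and not how XFS is implemented: the function name,
the 64 KB change size and the 32 MB window are hypothetical, and
only the 2MB bound comes from the quoted design document. It
tracks the peak amount of metadata changes that exist only in
memory, which is what a crash would lose.

```python
def max_unlogged_bytes(change_sizes, flush_threshold):
    """Return the peak number of bytes held only in memory.

    Toy model: changes accumulate in memory and are all written to
    the log as soon as the pending total reaches flush_threshold.
    """
    pending = 0
    peak = 0
    for size in change_sizes:
        pending += size
        peak = max(peak, pending)
        if pending >= flush_threshold:
            pending = 0  # everything pending is now in the on-disk log
    return peak


MB = 1024 * 1024
# Hypothetical workload: 1024 metadata changes of 64 KB each (64 MB total).
changes = [64 * 1024] * 1024

# Bounded case: at most 2 MB unwritten, as in the quoted document.
bounded = max_unlogged_bytes(changes, 2 * MB)
# Accumulating case: a hypothetical, much larger in-memory window.
accumulating = max_unlogged_bytes(changes, 32 * MB)

print(bounded // MB, "MB at risk vs", accumulating // MB, "MB at risk")
```

The larger window means fewer log writes for the same workload,
and correspondingly more changes exposed to loss on a crash.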

Note: the work on multithreading the journaling path is an
  authentic (and I guess amazingly tricky) performance
  improvement instead, not merely a new safety/latency/speed
  tradeoff similar to 'nobarrier' or 'eatmydata'.

>> People who complained about slow metadata performance with XFS
>> before 'delaylog' were in effect complaining that XFS was
>> implementing overly (in some sense) safe metadata semantics, and
>> in effect were demanding less (implicit) safety, without
>> probably realizing they were asking for that.

> No, they weren't,

Again, my self-importance is not quite so huge that I feel able to
just say «No, they weren't» to settle points once and for all.

Here it is not clear to me what you mean by «they weren't», but
as the quote above shows, even if complainers weren't in effect
«demanding less (implicit) safety», that's what they got anyhow,
because that's the main (unavoidable) way to improve latency
massively given how expensive barriers are (at least on disk
devices). That's how the O_PONIES story goes...
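
The cost of forcing every update to stable storage can be
observed from user space. The sketch below is an application-level
analogue only: it uses fsync(2) rather than the block-layer
barrier mechanism, and the function names, record size and record
count are hypothetical. It writes the same records twice, once
syncing after each record and once syncing only at the end, and
times both.

```python
import os
import tempfile
import time


def write_records(path, records, sync_each):
    """Write records to path; fsync after each one or only once at the end."""
    with open(path, "wb") as f:
        for rec in records:
            f.write(rec)
            if sync_each:
                f.flush()              # push Python's buffer to the kernel
                os.fsync(f.fileno())   # ask the kernel to reach stable storage
        f.flush()
        os.fsync(f.fileno())           # final sync so both modes end durable


def timed_write(path, records, sync_each):
    """Return the wall-clock seconds taken by write_records."""
    start = time.perf_counter()
    write_records(path, records, sync_each)
    return time.perf_counter() - start


records = [b"x" * 512 for _ in range(50)]   # hypothetical small updates
with tempfile.TemporaryDirectory() as d:
    safe = timed_write(os.path.join(d, "safe.bin"), records, sync_each=True)
    fast = timed_write(os.path.join(d, "fast.bin"), records, sync_each=False)
print(f"fsync per record: {safe:.4f}s, single fsync: {fast:.4f}s")
```

Both runs leave identical file contents; what differs is how much
data is exposed to loss at any moment before the final sync. The
timings depend heavily on the device and on its write cache
setting, so no particular ratio should be expected.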

> [ ... personal attacks ... ]

It is noticeable that 90% of your post is pure malicious
offtopic personal attack, and the rest is "from on high",
and the whole is entirely devoid of technical content.
