On Mon, Jul 18, 2011 at 11:11:47PM -0400, Christoph Hellwig wrote:
> On Tue, Jul 19, 2011 at 01:05:51PM +1000, Dave Chinner wrote:
> > On Mon, Jul 18, 2011 at 10:03:17PM -0400, Christoph Hellwig wrote:
> > > Generally looks okay, but doing a context switch in every log
> > > force might bite us. Less the general context switch
> > > overhead, but more the nasty interactions with cfq, which are
> > > causing huge problems for ext3/4,
> > Quite frankly, I don't recommend CFQ unless you need block level
> > throttling or use IO prioritisation seriously. CFQ is way too
> > smart for its own good trying to do everything for everyone,
> > and as such suffers from different regressions every release.
> > It has weird workload specific heuristics in it to try to
> > address issues that don't solve the general class of problem,
> > and so is always being patched to fix the next occurrence of the
> > same problem. e.g. the IO stalls caused by dependent IOs being
> > issued by different threads that ext3/4 fsync hits all the time.
> I don't like CFQ very much either. But it's the default for both
> mainline Linux and all major distros, so screwing it means a major
> support burden
We never tuned for AS or really cared how it performed when it was
the kernel and major distro default, either. The answer was always
"don't use AS if you care about performance". That's the same advice
major distros give to their users of XFS w.r.t. CFQ, anyway...
> as well as losing all kinds of benchmarks.
Do we really care about benchmarketing? I don't really...
> > I'm of the opinion that anyone with a RAID controller with a BBWC
> > doesn't need the smarts in CFQ because the BBWC provides a much
> > larger and smarter IO re-order window than the Linux IO schedulers
> > and hence does a better job of IO scheduling than Linux can ever do.
> > We shouldn't penalise the target market for XFS for having fast
> > storage by catering to deficiencies of IO schedulers that are
> > mostly redundant for the hardware XFS typically runs on....
> What penalty do we get for doing the cil force in line from log
> force and only doing it in the background when it needs to be
> written because of filling up the buffers?
I can make the log force code do the push in line, it just
complicates things a little with the need for wrapper functions to
handle the different calling conventions. The log force has to wait
on the workqueue anyway (and will still have to do so even if it
pushes directly itself), so doing the push work directly won't
change the performance there at all.
It's really the background push that I want out of line, so I'll
rework it such that only the background push uses the workqueue.
That should alleviate most of the concerns with fsync+CFQ.
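
As a rough illustration of that split, here is a userspace toy model in
Python. It is not the XFS code: `ToyCIL`, `commit`, `background_push` and
`force` are hypothetical names made up for this sketch, and a thread plus
a queue stands in for the kernel workqueue. The only point is the control
flow: the background push is handed to the worker, while a force first
waits for any queued push and then does the push work in the caller's own
context.

```python
import queue
import threading


class ToyCIL:
    """Toy model of a committed item list; all names are hypothetical."""

    def __init__(self):
        self.lock = threading.Lock()
        self.pending = []          # items committed but not yet pushed
        self.pushed = []           # (who_pushed, item) pairs, in push order
        self.work = queue.Queue()  # stand-in for the workqueue
        self.worker = threading.Thread(target=self._worker, daemon=True)
        self.worker.start()

    def _push(self, who):
        # Move everything pending to the pushed list, tagged with the pusher.
        with self.lock:
            items, self.pending = self.pending, []
            self.pushed.extend((who, item) for item in items)

    def _worker(self):
        # Worker loop: each queued job triggers one out-of-line push.
        while True:
            self.work.get()
            try:
                self._push("worker")
            finally:
                self.work.task_done()

    def commit(self, item):
        with self.lock:
            self.pending.append(item)

    def background_push(self):
        # Out of line: the caller returns immediately, the worker pushes.
        self.work.put("push")

    def force(self):
        # Wait for any queued background push, then push directly in the
        # caller's context instead of bouncing through the worker.
        self.work.join()
        self._push("caller")
```

In this model a force with no background push outstanding never touches
the worker thread at all, which is the case the fsync+CFQ concern is
about; a force issued after a background push still has to wait for the
worker to finish before it returns.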