

To: Linux fs XFS <xfs@xxxxxxxxxxx>
Subject: Re: XFS: Abysmal write performance because of excessive seeking (allocation groups to blame?)
From: pg_xf2@xxxxxxxxxxxxxxxxxx (Peter Grandi)
Date: Sat, 7 Apr 2012 20:11:36 +0100
In-reply-to: <CAAxjCEwYVB4T2EHU6JWi5Oxf=hepoKjVVj+dVAYWo_t4KNFeyg@xxxxxxxxxxxxxx>
References: <CAAxjCEwBMbd0x7WQmFELM8JyFu6Kv_b+KDe3XFqJE6shfSAfyQ@xxxxxxxxxxxxxx> <20350.9643.379841.771496@xxxxxxxxxxxxxxxxxx> <CAAxjCEwYVB4T2EHU6JWi5Oxf=hepoKjVVj+dVAYWo_t4KNFeyg@xxxxxxxxxxxxxx>
> [ ... ] I also tried different Linux IO elevators, as you
> suggested in your other response, without any measurable
> effect. [ ... ]

That's probably because of the RAID6 host adapter being
uncooperative, but I wondered whether this might apply in some
measure:

 «As of kernel 3.2.12, the default i/o scheduler, CFQ, will
  defeat much of the parallelization in XFS.»

BTW earlier 'cfq' versions have been reported to have huge
problems with workloads mixing reads and writes, and only
'deadline' (which is quite unsuitable for some workloads) seems
to be fairly reliable.
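For reference, on kernels of that era the active elevator can be
inspected and switched per block device through sysfs. A minimal
sketch (the device name 'sda' is a placeholder; substitute the
device behind your array):

```shell
# Hypothetical device; substitute your own (e.g. sdb, md0).
DEV=sda
SCHED_FILE=/sys/block/$DEV/queue/scheduler

# The available schedulers are listed with the active one in
# brackets, e.g. "noop deadline [cfq]".
if [ -r "$SCHED_FILE" ]; then
    cat "$SCHED_FILE"
    # Switch to 'deadline' at runtime (root required); the change
    # is per-device and does not survive a reboot.
    echo deadline > "$SCHED_FILE"
fi
```

To make the choice persistent, booting with 'elevator=deadline'
on the kernel command line sets the default for all devices.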


«Anyway, we have found HUGE problems with CFQ in many different
 scenarios and many different hardware setups. If it was only an
 issue with our configuration I would have foregone posting this
 message and simply informed those kernel developers responsible
 for the fix.

 Two scenarios where CFQ has a severe problem - When you are
 running a single block device (1 drive, or a raid 1 scenario)
 under certain circumstances where heavy sustained writes are
 occurring the CFQ scheduler will behave very strangely. It will
 begin to give all access to reads and limit writes to the
 point of allowing only 0-2 write I/O operations per second vs
 100-180 read operations per second. This condition
 will persist indefinitely until the sustained write process
 completes. This is VERY bad for a shared environment where you
 need reads and writes to complete regardless of increased reads
 or writes. This behavior goes beyond what CFQ says it is
 supposed to do in this situation - meaning this is a bug, and a
 serious one at that. We can reproduce this EVERY TIME.

 The second scenario occurs when you have two or more block
 devices, either single drives, or any type of raid array
 including raid 0,1,0+1,1+0,5 and 6. (We never tested 3 or 4;
 who uses raid 3 or 4 anymore anyway?!) This case is almost the
 exact opposite of what happens with only one block device. In
 this case, if one or more of the drives is blocked with heavy
 writes
 for a sustained period of time CFQ will block reads from the
 other devices or severely limit the reads until the writes have
 completed. We can also reproduce this behavior with test
 software we have written on a 100% consistent basis.»
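The starvation described above can be approximated with two
concurrent 'dd' streams; a rough sketch (all paths are
placeholders on the filesystem under test, not a calibrated
benchmark):

```shell
# Placeholders: both files live on the filesystem being tested.
WRITE_TARGET=/mnt/test/writefile
READ_SOURCE=/mnt/test/readfile

# Sustained background writer (~8 GiB, flushed at the end).
dd if=/dev/zero of="$WRITE_TARGET" bs=1M count=8192 conv=fsync &
WRITER=$!

# Time reads issued while the writer runs; under the behaviour
# reported above, one side's throughput collapses depending on
# whether one or several block devices are involved.
time dd if="$READ_SOURCE" of=/dev/null bs=1M count=256 iflag=direct

wait "$WRITER"
```

Repeating the timed read with the elevator set to 'deadline' and
then 'cfq' gives the comparison.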
