Vanilla 3.0.78

Stefan Priebe - Profihost AG s.priebe at profihost.ag
Mon Jul 29 06:02:45 CDT 2013


On 29.07.2013 12:01, Dave Chinner wrote:
> On Mon, Jul 29, 2013 at 10:31:52AM +0200, Stefan Priebe - Profihost AG wrote:
>> On 29.07.2013 10:22, Dave Chinner wrote:
>>> On Mon, Jul 29, 2013 at 09:39:37AM +0200, Stefan Priebe - Profihost AG wrote:
>>>> Hi,
>>>>
>>>> While running 3.0.78 and doing heavy rsync tasks on a RAID 50, I'm getting
>>>> these call traces:
>>>
>>> Judging by the timestamps, the problem clears and the system keeps
>>> running?
>>
>> Yes.
>>
>>> If so, the problem is likely to be a combination of contention on a
>>> specific AG for allocation and slow IO. Given it is RAID 50, it's
>>> probably really slow IO, and probably lots of threads wanting the
>>> lock and queuing up on it.
>>>
>>> What does 'iostat -m -x -d 5' look like when these messages are dumped
>>> out?
>>
>> I don't have that, only some Nagios stats. There were 1000 IOPS and 8 MB/s.
> 
> Yup, that sounds like it was doing lots of small random IOs and
> hence was IO bound...
> 
>> But I can reduce the number of tasks run in parallel if this is the problem.
> 
> Try and find out what the average IO times were when the messages
> are being emitted. If that's up in the seconds, then there's a good
> chance you are simply throwing too many small IOs at your storage.

Thanks!

Stefan
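
For reference, the average IO time Dave asks about can be approximated
without iostat by sampling /proc/diskstats twice and dividing the time
spent on reads and writes by the number of IOs completed in between.
The following is only a rough sketch, assuming the standard Linux
/proc/diskstats field layout. The function names (parse_diskstats,
average_io_ms, sample_average_io_ms) and the 5 second interval are
illustrative choices, not anything taken from this thread.

```python
import time


def parse_diskstats(text):
    """Map device name -> (IOs completed, ms spent on IO) from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 11:
            # Skip blank lines and short, old-style partition lines.
            continue
        reads, read_ms = int(fields[3]), int(fields[6])
        writes, write_ms = int(fields[7]), int(fields[10])
        stats[fields[2]] = (reads + writes, read_ms + write_ms)
    return stats


def average_io_ms(before, after):
    """Average ms per completed IO between two samples; None if the device was idle."""
    result = {}
    for dev, (ios_after, ms_after) in after.items():
        ios_before, ms_before = before.get(dev, (ios_after, ms_after))
        delta_ios = ios_after - ios_before
        if delta_ios > 0:
            result[dev] = (ms_after - ms_before) / delta_ios
        else:
            result[dev] = None
    return result


def sample_average_io_ms(interval=5, path="/proc/diskstats"):
    """Take two samples `interval` seconds apart and return the per-device averages."""
    with open(path) as fh:
        before = parse_diskstats(fh.read())
    time.sleep(interval)
    with open(path) as fh:
        after = parse_diskstats(fh.read())
    return average_io_ms(before, after)
```

Calling sample_average_io_ms(5) while the call traces are being logged
returns a per-device figure in milliseconds. Values in the thousands
would correspond to the "up in the seconds" case described above.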



More information about the xfs mailing list