Re: I/O hang, possibly XFS, possibly general

To: xfs@xxxxxxxxxxx
Subject: Re: I/O hang, possibly XFS, possibly general
From: Michael Monnerie <michael.monnerie@xxxxxxxxxxxxxxxxxxx>
Date: Wed, 8 Jun 2011 10:32:58 +0200
Cc: Peter Grandi <pg_mh@xxxxxxxxxx>
In-reply-to: <19950.12549.541440.285348@xxxxxxxxxxxxxxxxxx>
Organization: it-management http://it-management.at
References: <BANLkTim_BCiKeqi5gY_gXAcmg7JgrgJCxQ@xxxxxxxxxxxxxx> <201106060929.06814@xxxxxx> <19950.12549.541440.285348@xxxxxxxxxxxxxxxxxx>
User-agent: KMail/1.13.6 (Linux/; KDE/4.6.0; x86_64; ; )
On Tuesday, 7 June 2011 Peter Grandi wrote:
>   * A file that is written out at speed, say 100-500MB/s. 2-4s
>     means that there is an opportunity to allocate 200MB-2GB
>     contiguous extents, and with any luck much larger ones.
>     Conversely any larger intervals means potentially losing
>     200MB-2GB of data. Sure, if they did not want to lose the
>     data the user process should be doing 'fdatasync()', but XFS
>     in particular is sort of pretty good at doing a mild version
>     of 'O_PONIES' where there is a balance between going as fast
>     as possible (buffer a lot in memory) and offering some
>     level of safety (as shown in the tests I did for a fair
>     comparison with 'ext3').

On a PC, that "losing 2GB of data" means losing a single file under 
normal use. People seldom copy large amounts of data around, and even 
when they do and a crash happens, they usually know what they were 
doing and simply restart the copy afterwards.
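
For reference, the 200MB-2GB figure in the quoted text is simply the 
write rate multiplied by the interval between flushes. A minimal sketch 
of that arithmetic (the function name is made up for illustration):

```python
def data_at_risk_mb(write_rate_mb_s, flush_interval_s):
    """Upper bound on unflushed data (in MB) if a crash hits just before a flush."""
    return write_rate_mb_s * flush_interval_s


# The ranges from the quoted text: 100-500MB/s, flushed every 2-4s.
low = data_at_risk_mb(100, 2)
high = data_at_risk_mb(500, 4)
print(f"between {low}MB and {high}MB could still be in memory")
```

This is only an upper bound: it assumes the writer sustains the full 
rate for the whole interval and that nothing was written back early.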

On a server there should normally be a HW RAID card with a good cache 
in it, and then it's true that you should limit the Linux write cache 
and flush early and often: the card has a BBWC (battery-backed write 
cache), so data is protected once it is in the RAID card. People tend 
to forget to lower the writeback settings when using RAID controllers 
+ BBWC, and it's documented almost nowhere. Maybe good for a FAQ entry 
on XFS, even if it's not XFS-specific.
I wonder if there is a good "best practice" document for VMs? I've 
never seen anyone test a VMware/Xen host with 20 Linux VMs and work out 
what the vm.dirty* and net.ipv4.* settings should be. I've seen 
crashes on VM servers after which databases in VMs were broken 
despite the use of a RAID card + BBWC...

>   * A file that is written slowly in small chunks. Well,
>     nothing will help that except preallocate or space
>     reservations.

Now for a common webserver like the ones we use: as a guideline, there 
are about 8 parallel uploads at any time. Most of them are slow, as 
people are on ADSL. If you sync quite often, you're lucky to be using 
XFS, which gives you preallocation and all that. Otherwise, you'd have 
chunks of all the files scattered across the disk.
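
For the slow-upload case, explicit preallocation as mentioned in the 
quoted text could look roughly like this. It is a sketch under the 
assumption that the final size is known before the data arrives (for 
example from the upload's Content-Length); the function name and file 
names are made up.

```python
import os
import tempfile


def preallocate_upload(path, expected_size):
    """Create path and reserve expected_size bytes before any data arrives.

    Returns an open file descriptor; the caller writes the chunks and closes it.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        # Ask the filesystem to allocate the whole range up front, so slow
        # chunks from parallel uploads do not end up interleaved on disk.
        os.posix_fallocate(fd, 0, expected_size)
    except OSError:
        os.close(fd)
        raise
    return fd


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "upload.bin")
        fd = preallocate_upload(target, 1024 * 1024)
        os.write(fd, b"first slow chunk")
        os.fdatasync(fd)  # flush this chunk without waiting for writeback
        os.close(fd)
        print(os.path.getsize(target))
```

Note that the file already has its full size after the call, so an 
interrupted upload has to be detected some other way than by file size.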

Best regards,
Michael Monnerie, Ing. BSc

it-management Internet Services: Protéger
http://proteger.at [pronounced: Prot-e-schee]
Tel: +43 660 / 415 6531

// House for sale: http://zmi.at/langegg/

