
To: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Subject: Re: XFS: Abysmal write performance because of excessive seeking (allocation groups to blame?)
From: Stefan Ring <stefanrin@xxxxxxxxx>
Date: Fri, 6 Apr 2012 10:25:14 +0200
Cc: xfs@xxxxxxxxxxx
In-reply-to: <20120405213740.GA22824@xxxxxxxxxxxxx>
References: <CAAxjCEwBMbd0x7WQmFELM8JyFu6Kv_b+KDe3XFqJE6shfSAfyQ@xxxxxxxxxxxxxx> <20120405213740.GA22824@xxxxxxxxxxxxx>
> thanks for the detailed report.

Thanks for the detailed and kind answer.

> Can you try a few mount options for me, both all together and, if you
> have some time, also individually?
>  -o inode64
>        This allows inodes to be close to data even for >1TB
>        filesystems.  It's something we hope to make the default soon.

The filesystem is not that large. It’s only 400GB. I turned it on
anyway. No difference.

>  -o filestreams
>        This keeps data written in a single directory group together.
>        Not sure your directories are large enough to really benefit
>        from it, but it's worth a try.
>  -o allocsize=4k
>        This disables the aggressive file preallocation we do in XFS,
>        which sounds like it's not useful for your workload.

inode64+filestreams: no difference
inode64+allocsize: no difference
inode64+filestreams+allocsize: no difference :(
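
For anyone who wants to repeat the "all together and individually" test, here is a minimal sketch that lists every combination of the three suggested options as a mount command line. It only prints strings and runs nothing. The device /dev/sdX1 and the mount point /mnt/test are hypothetical placeholders, not values from this thread.

```python
from itertools import combinations

# Option strings exactly as quoted in the thread.
OPTIONS = ["inode64", "filestreams", "allocsize=4k"]


def mount_commands(device, mountpoint, options=OPTIONS):
    """Build one mount command line per non-empty combination of options."""
    cmds = []
    for n in range(1, len(options) + 1):
        for combo in combinations(options, n):
            cmds.append(f"mount -o {','.join(combo)} {device} {mountpoint}")
    return cmds


if __name__ == "__main__":
    # Hypothetical device and mount point, for illustration only.
    for cmd in mount_commands("/dev/sdX1", "/mnt/test"):
        print(cmd)
```

Three options give seven combinations; the three reported above are a subset of those.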

> For metadata intensive workloads like yours you would be much better
> using a non-striping raid, e.g. concatenation and mirroring instead of
> raid 5 or raid 6.  I know this has a cost in terms of "wasted" space,
> but for IOPS-bound workloads the difference is dramatic.

Hmm, I’m sure you’re right, but I’m out of luck here. If I had 24
drives, I could think about a different organization. But with only 6
bays, I cannot give up all that space.

Although *in theory*, it *should* be possible to run fast for
write-only workloads. The stripe size is 64 KB (4x16), and it’s not
like data is written all over the place. So it should very well be
possible to write the data out in some reasonably sized and aligned
chunks. The filesystem partition itself is nicely aligned.
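
To make the alignment argument concrete, here is a rough sketch of the arithmetic. It assumes "4x16" means 4 data disks with a 16 KB chunk each, which is my reading and is not spelled out above. The function name split_write is made up for illustration. It counts how many stripes a single write covers completely and how many it only touches partially; on parity RAID the partially covered ones are generally the expensive case, because old data or parity has to be read before the parity can be updated.

```python
CHUNK_KB = 16    # assumed per-disk chunk size, from "4x16"
DATA_DISKS = 4   # assumed number of data disks per stripe, from "4x16"
STRIPE_BYTES = CHUNK_KB * DATA_DISKS * 1024  # 64 KB full stripe


def split_write(offset, length, stripe=STRIPE_BYTES):
    """Return (full_stripes, partial_stripes) touched by one write.

    offset and length are in bytes, relative to the start of the
    (aligned) partition.
    """
    if length <= 0:
        return (0, 0)
    end = offset + length
    # Every stripe that the byte range [offset, end) overlaps.
    touched = (end - 1) // stripe - offset // stripe + 1
    # Stripes that are covered from their first byte to their last.
    first_full = -(-offset // stripe)  # offset rounded up to a stripe index
    full = max(0, end // stripe - first_full)
    return (full, touched - full)


if __name__ == "__main__":
    print(split_write(0, 64 * 1024))     # aligned, exactly one stripe
    print(split_write(4096, 64 * 1024))  # same size, shifted by 4 KB
```

The same 64 KB write is a single full stripe when aligned, but becomes two partial stripes when shifted by 4 KB, which is why chunked and aligned writeback should matter here.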
