
Seeking XFS tuning advice for PostgreSQL on SATA SSDs/Linux-md

To: xfs@xxxxxxxxxxx
Subject: Seeking XFS tuning advice for PostgreSQL on SATA SSDs/Linux-md
From: Johannes Truschnigg <johannes.truschnigg@xxxxxxxxxxx>
Date: Tue, 15 Apr 2014 14:23:07 +0200
Delivered-to: xfs@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20131103 Icedove/17.0.10

Hi list,

we're building a postgres streaming replication slave that's supposed to pick up work if our primary pg cluster (with an all-flash FC SAN appliance as its backend store) goes down. We'll be using consumer-grade hardware for this setup, which I detail below:

o 2x Intel Xeon E5-2630L (24 threads total)
o Intel C606-based Dual 4-Port SATA/SAS HBA (PCIID 8086:1d68)
o 6x Samsung 830 SSD with 512GB each, 25% reserved for HPA
o Debian GNU/Linux 7.x "Wheezy" + backports kernel (3.13+)
o PostgreSQL 9.0

If there's anything else that is of critical interest that I forgot to mention, hardware- or software-wise, please let me know.

When benchmarking the individual SSDs with fio (using the libaio backend), the IOPS we've seen were in the 30k-35k range overall for 4K block sizes. The host will be on the receiving end of a pg9.0 streaming replication cluster setup where the master handles ~50k IOPS peak, and I'm wondering what a good approach would be to design the local storage stack (with availability in mind) so that it has a chance of keeping up with our flash-based FC SAN.

After digging through the linux-raid archives, I think the most sensible approach is to build two-disk RAID1 pairs that are concatenated via either LVM2 or md (leaning towards the latter, since I'd expect that to have a tad less overhead), with xfs on top of the resulting block device. That should yield roughly 1.2TB of usable space (we need a minimum of 900GB for the DB). With this setup, it should be possible to have up to 3 CPUs busy handling I/O on the block side of things, which raises the question of what a sensible value would be for xfs' Allocation Group Count/agcount.
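
For reference, here is the arithmetic behind the capacity figure as a minimal Python sketch. It assumes decimal gigabytes and that the 25% HPA reservation comes off each drive's nominal 512GB; the function and parameter names are purely illustrative.

```python
def usable_capacity_gb(drives, drive_gb, hpa_fraction, mirror_width=2):
    """Usable size of a concatenation of RAID1 sets built from identical drives."""
    if drives % mirror_width != 0:
        raise ValueError("drive count must be a multiple of the mirror width")
    # Capacity each drive still exposes after the HPA reservation (assumption:
    # the 25% is taken from the nominal 512GB).
    visible_gb = drive_gb * (1 - hpa_fraction)
    # Each RAID1 set contributes one drive's worth of space to the concat.
    pairs = drives // mirror_width
    return pairs, pairs * visible_gb


pairs, total_gb = usable_capacity_gb(drives=6, drive_gb=512, hpa_fraction=0.25)
print(f"{pairs} RAID1 pairs, {total_gb:.0f} GB usable")
```

Under those assumptions this comes out at three pairs and a bit over 1.1TB, which is where the "roughly 1.2TB" estimate comes from, and it leaves some headroom over the 900GB minimum.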

I've been trying to find information on that myself, but what I managed to dig up is, at times, so old that it seems rather outlandish today - some sources on the web (from 2003), for example, say that one AG per 4GB of underlying diskspace makes sense, which seems excessive for a 1200GB volume.
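
To put a number on "excessive", a small Python sketch of what the 4GB-per-AG rule of thumb would imply for this volume. The 1200GB size is the rounded figure from above, and the helper name is made up for illustration.

```python
import math


def agcount_for_ag_size(volume_gb, ag_size_gb):
    """Number of allocation groups needed if each AG covers ag_size_gb."""
    return math.ceil(volume_gb / ag_size_gb)


# Old rule of thumb: one AG per 4GB of underlying disk space.
suggested_agcount = agcount_for_ag_size(volume_gb=1200, ag_size_gb=4)
print(suggested_agcount)
```

That works out to 300 allocation groups for the 1200GB volume.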

I've experimented with mkfs.xfs (on top of LVM only; I don't know whether it takes lower block layers into account) and have seen that it apparently defaults to an agcount of 4, which seems insufficient given the maximum bandwidth our setup should be able to provide.
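
To illustrate why I suspect an agcount of 4 is a poor fit for a three-pair concatenation, here is a rough Python sketch that maps equal-sized AGs onto the members of a linear concat. It assumes each RAID1 pair exposes 384GB (512GB minus the 25% HPA), works at whole-GB granularity, and ignores any metadata overhead; none of this is taken from actual mkfs.xfs output.

```python
def ag_to_members(member_gb, members, agcount):
    """For each AG, list the concat members its address range touches."""
    total_gb = member_gb * members
    layout = {}
    for ag in range(agcount):
        # AGs are treated as equal, contiguous slices of the address space.
        start = ag * total_gb // agcount
        end = (ag + 1) * total_gb // agcount  # exclusive
        first = start // member_gb
        last = (end - 1) // member_gb
        layout[ag] = list(range(first, last + 1))
    return layout


for agcount in (4, 6):
    print(agcount, ag_to_members(member_gb=384, members=3, agcount=agcount))
```

In this simplified model, four AGs leave two of them straddling a boundary between pairs, whereas any multiple of three (six, for instance) places every AG entirely on one pair with the same number of AGs per pair.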

Apart from that, is there any other advice you can share on tuning xfs for running postgres (9.0 initially, but we're planning to upgrade to 9.3 or later eventually) in 2014, especially performance-wise?

Thanks, regards:
- Johannes
