xfs open questions
Michael Monnerie
michael.monnerie at is.it-management.at
Tue Jan 27 02:28:23 CST 2009
Dear list,
I'm new here, an experienced admin trying to understand XFS properly.
I've read
http://xfs.org/index.php/XFS_Status_Updates
http://oss.sgi.com/projects/xfs/training/index.html
http://en.wikipedia.org/wiki/Xfs
and still have some XFS questions, which I guess belong in the FAQ,
since they were the first questions I ran into when trying XFS. I hope
this is the correct list to ask on, and that this very long first mail
isn't too intrusive:
- Stripe Alignment
It's very nice that the FS understands the storage it runs on and can
be tuned for it, but the documentation on how to do that correctly is
incomplete.
http://oss.sgi.com/projects/xfs/training/xfs_slides_04_mkfs.pdf
On page 5 there is an example of an "8+1 RAID". Does that mean "9 disks
in RAID-5", so 8 are data and 1 is parity, and for XFS only the data
disks matter? If so, would an 8-disk RAID 6 (2 parity, 6 data) and an
8-disk RAID-50 (again 2 parity, 6 data) be treated the same?
Let's say I have a 64k stripe size on the RAID controller with the
above 8-disk RAID 6. Would the best performance then come from
mkfs -d su=64k,sw=$((64*6))k
Is that correct? It would be good to have clearer documentation with
more examples.
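If I read the mkfs.xfs man page right, sw is a multiplier of su (the
number of data disks), not a byte value, so my best guess for the above
8-disk RAID 6 would be (/dev/sdX is just a placeholder):

# mkfs.xfs -d su=64k,sw=6 /dev/sdX

or, equivalently, with sunit/swidth given in 512-byte sectors
(sunit = 65536/512 = 128, swidth = 128*6 = 768):

# mkfs.xfs -d sunit=128,swidth=768 /dev/sdX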
- 64-bit Inodes
On the allocator's slides
http://oss.sgi.com/projects/xfs/training/xfs_slides_06_allocators.pdf
it says that if the volume is >1TB, 32-bit inodes make the FS suffer,
and that 64-bit inodes should be used. Is that feature safe to use?
The documentation says some backup tools can't handle 64-bit inodes;
are there problems with other programs as well? Does the rest of the
system fully support 64-bit inodes? I guess a 64-bit Linux kernel is
needed?
And if I have already created a FS >1TB with 32-bit inodes, would it be
better to recreate it with 64-bit inodes and then restore all the data?
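As far as I understand, 64-bit inode numbers are not set at mkfs time
but enabled per mount via the inode64 mount option, something like
(device and mount point are placeholders):

# mount -o inode64 /dev/sdX /data

or persistently via /etc/fstab:

/dev/sdX  /data  xfs  inode64  0 0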
- Allocation Groups
When I create an XFS of 2TB and know it will grow as we expand the RAID
later, how do I optimize the AGs? If I start with agcount=16 now, and
later expand the RAID by 1TB so I have 3TB instead of 2TB, what happens
to the agcount? Is it increased, or are the existing AGs expanded so
there are still 16 AGs? I guess new AGs are created, but that's
documented nowhere.
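If it helps, this is how I would check it, assuming the FS is mounted
on /data (I would expect agsize to stay fixed and new AGs to be
appended at the end):

# xfs_info /data
# xfs_growfs /data
# xfs_info /data

Comparing the agcount values before and after should show what happens.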
- mkfs warnings about stripe width multiples
For a 4-disk RAID 5 of 2.4TB on LVM I did:
# mkfs.xfs -f -L oriondata -b size=4096 \
      -d su=65536,sw=3,agcount=40 \
      -i attr=2 -l lazy-count=1,su=65536 /dev/p3u_data/data1
Warning: AG size is a multiple of stripe width. This can cause
performance problems by aligning all AGs on the same disk. To avoid
this, run mkfs with an AG size that is one stripe unit smaller, for
example 13762544.
meta-data=/dev/p3u_data/data1   isize=256    agcount=40, agsize=13762560 blks
         =                      sectsz=512   attr=2
data     =                      bsize=4096   blocks=550502400, imaxpct=5
         =                      sunit=16     swidth=48 blks
naming   =version 2             bsize=4096   ascii-ci=0
log      =internal log          bsize=4096   blocks=32768, version=2
         =                      sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                  extsz=4096   blocks=0, rtextents=0
and so I did it again with
# mkfs.xfs -f -L oriondata -b size=4096 \
      -d su=65536,sw=3,agsize=13762544b \
      -i attr=2 -l lazy-count=1,su=65536 /dev/p3u_data/data1
meta-data=/dev/p3u_data/data1   isize=256    agcount=40, agsize=13762544 blks
         =                      sectsz=512   attr=2
data     =                      bsize=4096   blocks=550501760, imaxpct=5
         =                      sunit=16     swidth=48 blks
naming   =version 2             bsize=4096   ascii-ci=0
log      =internal log          bsize=4096   blocks=32768, version=2
         =                      sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                  extsz=4096   blocks=0, rtextents=0
It would be good if mkfs correctly said "... run mkfs with an AG size
that is one stripe unit smaller, for example 13762544b". The "b" at the
end is very important; that cost me a lot of searching in the
beginning.
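For anyone else hitting this, the arithmetic as I understand it: the
stripe unit is 65536 bytes / 4096 bytes per block = 16 blocks, so the
suggested AG size in filesystem blocks is

# echo $(( 13762560 - 65536/4096 ))
13762544

which has to be passed as "13762544b" (b = filesystem blocks).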
Is there a limit on the number of AGs, theoretical and practical? Is
there a guideline for how many AGs to use? Should it depend on CPU
cores, the number of parallel users, spindles, or something else? Page
4 of the mkfs docs (link above) says "too few or too many AG's should
be avoided", but which numbers are "few" and "many"?
- PostgreSQL
The PostgreSQL database creates a directory per DB. From the docs I
read that this places all inodes within the same AG. But wouldn't it be
better for performance to have each table in a different AG? This could
be achieved manually (see the sketch below), but I'd like to hear
whether that's actually better or not.
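For illustration, the manual approach I have in mind uses PostgreSQL
tablespaces pointed at separate top-level directories, since as far as
I understand XFS rotates new directories across AGs; all names and
paths below are made up:

# mkdir /data/pg_ts1 /data/pg_ts2
# chown postgres:postgres /data/pg_ts1 /data/pg_ts2
# psql -U postgres -c "CREATE TABLESPACE ts1 LOCATION '/data/pg_ts1'"
# psql -U postgres -c "CREATE TABLE big_table (id integer) TABLESPACE ts1"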
Or are there other tweaks to remember when using PostgreSQL on XFS? This
question was raised on the PostgreSQL admin list, and if there are good
guidelines I'm happy to post them there.
Kind regards, zmi
--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4