
Re: xfs open questions

To: xfs@xxxxxxxxxxx
Subject: Re: xfs open questions
From: Michael Monnerie <michael.monnerie@xxxxxxxxxxxxxxxxxxx>
Date: Wed, 28 Jan 2009 09:37:19 +0100
In-reply-to: <497F130F.4010107@xxxxxxxxxxx>
Organization: it-management http://it-management.at
References: <200901270928.29215@xxxxxx> <497F130F.4010107@xxxxxxxxxxx>
User-agent: KMail/1.10.3 (Linux/; KDE/4.1.3; x86_64; ; )
On Tuesday 27 January 2009 Eric Sandeen wrote:
> I think that's all correct.  It's basically this: stripe unit is
> per-disk, stripe width is unit*data_disks.  And then there's the added
> bonus of the differing units on su/sw vs. sunit/swidth.  :)

Yes, it's always fun to have it complicated.

> I'd love to be able to update these pdf files, but despite asking for
> the source document several times over a couple months, nothing has
> been provided.  Unfortunately 'til then it's up to SGI to update them
> and the community can't help much (SGI: hint, hint).

SGI, do you read that? Your users want to do your work, you should be 
happy 'bout that!

> > Documentation says some backup tools can't handle 64bit Inodes, are
> > there problems with other programs as well?
> Potentially, yes:
> http://sandeen.net/wordpress/?p=9

OK, so is there anything really problematic? I already mounted with
inode64; is there a way back?

> I would not get overly concerned with AG count; newer mkfs.xfs has
> lower defaults (i.e. creates larger AGs, 4 by default, even for a 2T
> filesystem) but to some degree what's "best" depends both on the
> storage underneath and the way the fs will be used.
> But with defaults, your 2T/4AG filesystem case above would grow to
> 3T/6AGs, which is fine for many cases.

I used 40 AG's, so it will be 60 then.
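
The arithmetic behind these numbers can be sketched as follows. This is only an illustration, assuming that growing the filesystem keeps the existing AG size and appends new AGs (which is what the 2T/4AG to 3T/6AG figure implies); the function name is made up, and real tools may treat a very small trailing AG differently.

```python
def ag_count_after_grow(old_size, old_agcount, new_size):
    """Estimate the AG count after a grow, assuming the AG size stays fixed.

    Sizes can be in any unit as long as old_size and new_size use the same one.
    """
    # Ceiling of new_size / (old_size / old_agcount), done in integers:
    # a partially filled last AG is counted as one AG (assumption).
    return -(-(new_size * old_agcount) // old_size)

print(ag_count_after_grow(2, 4, 3))   # 2T with 4 AGs grown to 3T
print(ag_count_after_grow(2, 40, 3))  # 2T with 40 AGs grown to 3T
```

With the default 4 AGs this gives 6, and with 40 AGs it gives 60, matching the figures in the thread.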

> Hm it's unfortunate that there are no units on that number.  Easy to
> fix.

For who? Are you one of the devs?

> This is to avoid all metadata landing on a single disk; similar to
> how mkfs.ext3 wants to use "stride" in its one geometry-tuning knob.

I like the concept; the documentation for it just needs to be clearer.
And why not simply specify the number of data disks, since swidth has to
be a multiple of sunit anyway? With su=65536,sw=3 it's at least clearer.
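
To make the two notations concrete, here is a small sketch of the conversion. It assumes, as the mkfs.xfs man page describes, that su is given in bytes, sw is the number of data disks (a multiplier of su), and sunit/swidth are counted in 512-byte sectors. The function name is hypothetical.

```python
SECTOR_BYTES = 512  # sunit/swidth are expressed in 512-byte sectors

def su_sw_to_sunit_swidth(su_bytes, sw_disks):
    """Convert su (bytes per disk) and sw (number of data disks)
    into the equivalent sunit/swidth pair in 512-byte sectors."""
    if su_bytes % SECTOR_BYTES != 0:
        raise ValueError("su must be a multiple of 512 bytes")
    sunit = su_bytes // SECTOR_BYTES   # stripe unit, per disk
    swidth = sunit * sw_disks          # stripe width = unit * data disks
    return sunit, swidth

sunit, swidth = su_sw_to_sunit_swidth(65536, 3)
print(f"su=65536,sw=3 == sunit={sunit},swidth={swidth}")
# prints: su=65536,sw=3 == sunit=128,swidth=384
```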

> The defaults were recently moved to be lower (4 by default).  Files
> in new subdirs are rotated into new AGs, all other things being equal
> (space available, 64-bit-inode allocator mode).  To be honest I don't
> have a good answer for you on when you'd want more or fewer AGs,
> although AGs are parallel independent chunks of the fs to large
> degree, so in some cases, more AGs may help certain kinds of parallel
> operations.  Perhaps others can chime in a bit more on this tuning


> > - PostgreSQL
> > The PostgreSQL database creates a directory per DB. From the docs I
> > read that this creates all Inodes within the same AG. But wouldn't
> > it be better for performance to have each table on a different AG?
> > This could be achieved manually, but I'd like to hear if
> > that's better or not.
> Hm, where in the docs, just to be clear?

I meant the XFS docs - you said it again "files in the same dir are kept 
in the same AG". I wasn't clear enough on that. PostgreSQL creates a new 
DB in a new dir, but all tables etc. are just new files within that dir, 
which is probably not exactly what one wants. I can imagine it helps to 
stop the DB, "cp" all files to an extra dir once, to get them 
distributed over the AG's, and then "mv" them back so just the dir entry
is moved but the files stay in that other AG. Or does that sound silly?
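
The cp-then-mv idea could be sketched like this. It is a variation, not a tested recipe: it uses one scratch directory per file (on the assumption that new directories are what get rotated across AGs), the function name is made up, and whether XFS actually places the copies in different AGs depends on the allocator and the state of the filesystem. The database would have to be stopped first, and `shutil.copy2` does not preserve file ownership.

```python
import os
import shutil
import tempfile

def rewrite_via_new_dirs(db_dir):
    """Copy each regular file in db_dir into a fresh subdirectory, then
    rename the copy back over the original so only the dir entry moves."""
    for name in sorted(os.listdir(db_dir)):
        src = os.path.join(db_dir, name)
        if not os.path.isfile(src):
            continue  # skip subdirectories and other non-regular entries
        # New scratch directory on the same filesystem as the data.
        tmp_dir = tempfile.mkdtemp(prefix="agspread-", dir=db_dir)
        tmp_copy = os.path.join(tmp_dir, name)
        shutil.copy2(src, tmp_copy)  # new inode and blocks, allocated under tmp_dir
        os.replace(tmp_copy, src)    # rename back; the data stays where it was written
        os.rmdir(tmp_dir)            # scratch directory is empty again
```

The resulting placement could then be checked per file, for example with xfs_bmap -v, which lists the AG of each extent.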

> Note how the AG rotors around my 4 AGs in the filesystem.  If the fs
> is full and aged, it may not behave exactly this way.

Does anybody have experience with an aged XFS? I've had the problem with 
reiserfs, that it became slow after a certain point. It took ages to 
search within a directory of MP3's containing 50,000 entries in many 

> I don't have specific experience w/ PostgreSQL but if you have
> specific questions or performance problems that you run into, we can
> probably help.

Just the idea from above: cp & mv all files once after creation to
distribute them among the AG's. There was just a general discussion about
whether XFS is good for postgres or not. There might have been some
"super-trick" to speed things up by a factor of 1000... ;-)

> All good questions, thanks.


Best regards, zmi
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0660 / 415 65 31                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net                  Key-ID: 1C1209B4

Attachment: signature.asc
Description: This is a digitally signed message part.
