On 09/26/2013 06:16 PM, Dave Chinner wrote:
Virtualisation will have nothing to do with the problem. *All* my
YMMV. Very heavy IO in KVM/Xen often results in some very interesting
performance anomalies from the testing we've done on customer use cases.
And, well, I can boot a virtualised machine in under 7s, while a
physical machine reboot takes about 5 minutes, so there's a massive
win in terms of compile/boot/test cycle times doing things this way.
Certainly I agree with that aspect. Our KVM instances reboot and reload
very quickly. This is one of their nicest features. One we use for
First and foremost:
Can you change from one single large folder to a hierarchical set of
folders? The single large folder means any metadata operation (ls,
stat, open, close) has a huge set of lists to traverse. It will
work, albeit slowly. As a rule of thumb, we try to make sure our
users don't go much beyond 10k files/folder. If they need to,
building a hierarchy of folders slightly increases management
complexity, but keeps the lists that are needed to be traversed much
I'll just quote what I told someone yesterday on IRC:
A strategy for doing this: If your files are named "aaaa0001"
"aaaa0002" ... "zzzz9999" or similar, then you can chop off the
first letter, and make a directory of it, and then put all files
starting with that letter in that directory. Then within each of
those directories, do the same thing with the second letter. This
gets you 676 directories and about 15k files per directory. Much
faster directory operations. Much smaller lists to traverse.
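A minimal sketch of the two-letter split described above, in Python.
The function name nested_path, the depth parameter and the /data root
are made up for illustration; nothing here comes from the OP's setup.

```python
import posixpath


def nested_path(root, filename, depth=2):
    """Map a flat filename to root/<1st char>/<2nd char>/.../filename.

    Hypothetical helper: `depth` is how many leading characters become
    directory levels (2 gives 26 * 26 = 676 leaf directories when the
    leading characters are lowercase letters).
    """
    if len(filename) < depth:
        raise ValueError("filename shorter than the requested depth")
    # Each of the first `depth` characters becomes one directory level.
    levels = list(filename[:depth])
    return posixpath.join(root, *levels, filename)


# Number of leaf directories for a two-letter, lowercase prefix.
LEAF_DIRS = 26 ** 2
```

For example, nested_path("/data", "aaaa0001") gives
"/data/a/a/aaaa0001". The full filename is kept at the leaf, so moving
files back into a flat layout later needs no renaming.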
But that's still not optimal, as directory operations will then
serialise on per AG locks and so modifications will still be a
bottleneck if you only have 4 AGs in your filesystem. i.e. if you
are going to do this, you need to tailor the directory hash to the
concurrency the filesystem structure provides because more, smaller
directories are not necessarily better than fewer larger ones.
Indeed, if your workload is dominated by random lookups, the
hashing technique is less efficient than just having one large
directory as the internal btree indexes in the XFS directory
structure are far, far more IO efficient than a multi-level
directory hash of smaller directories. The trade-off in this case is
lookup concurrency - enough directories to provide good lookup
concurrency, yet few enough that you still get the IO benefit from
the scalability of the internal directory structure.
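One way to picture "tailoring the directory hash to the concurrency"
is to fix the number of directories up front instead of deriving it
from the filenames. The sketch below is only an illustration under
assumptions: the names bucket_for and bucket_path, the "dNNNN"
directory naming, and the idea of picking n_dirs as a small multiple
of the AG count (the agcount value reported by xfs_info) are mine, not
something stated in this thread.

```python
import zlib


def bucket_for(filename, n_dirs):
    """Pick one of n_dirs buckets from a stable hash of the filename.

    zlib.crc32 is used only because it is deterministic across runs;
    any stable hash would do for this sketch.
    """
    if n_dirs < 1:
        raise ValueError("n_dirs must be at least 1")
    return zlib.crc32(filename.encode("utf-8")) % n_dirs


def bucket_path(root, filename, n_dirs):
    """Return root/dNNNN/filename, where NNNN is the bucket index."""
    return f"{root}/d{bucket_for(filename, n_dirs):04d}/{filename}"


# Assumed example: 4 AGs, 4 directories per AG -> 16 large directories.
AG_COUNT = 4
N_DIRS = AG_COUNT * 4
```

With N_DIRS = 16, every file lands in one of 16 large directories, so
there are few directories (each keeps the benefit of the internal
btree index) but still more directories than AGs. Whether 1x, 4x or
some other multiple of the AG count is right depends on the workload
and would need measuring.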
This said, it's pretty clear the OP is hitting performance bottlenecks.
While the schema I proposed was non-optimal for the use case, I'd be
hard pressed to imagine it being worse for his use case based upon what
Obviously, more detail on the issue is needed.