----- Original Message -----
> From: "Joe Landman" <joe.landman@xxxxxxxxx>
> > takes. The folders are image folders that have anywhere between 5 to
> > 10 million images in each folder.
> The combination of very large folders, and virtualization is working
> against you. Couple that with an old (ancient by Linux standards) xfs
> in the virtual CentOS 5.9 system, and you aren't going to have much
> joy with this without changing a few things.
> Can you change from one single large folder to a hierarchical set of
> folders? The single large folder means any metadata operation (ls,
> stat, open, close) has a huge set of lists to traverse. It will work,
> albeit slowly. As a rule of thumb, we try to make sure our users don't
> go much beyond 10k files/folder. If they need to, building a hierarchy
> of folders slightly increases management complexity, but keeps the
> lists that are needed to be traversed much smaller.
> A strategy for doing this: If your files are named "aaaa0001"
> "aaaa0002" ... "zzzz9999" or similar, then you can chop off the first
> letter, and make a directory of it, and then put all files starting
> with that letter in that directory. Then within each of those directories,
> do the same thing with the second letter. This gets you 676
> directories and about 15k files per directory. Much faster directory
> operations. Much smaller lists to traverse.
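
A minimal sketch of the prefix scheme described above, in Python. The function name `sharded_path` and the `depth` parameter are made up for illustration; it assumes names like "aaaa0001" whose first characters are lowercase letters, spread roughly evenly.

```python
from pathlib import PurePosixPath


def sharded_path(filename: str, depth: int = 2) -> PurePosixPath:
    """Build a path such as a/a/aaaa0001 from the leading characters.

    One directory level is created per leading character, up to `depth`.
    The file itself keeps its full, untrimmed name.
    """
    if len(filename) <= depth:
        raise ValueError(f"{filename!r} is too short for depth {depth}")
    return PurePosixPath(*filename[:depth], filename)


if __name__ == "__main__":
    print(sharded_path("aaaa0001"))      # a/a/aaaa0001
    print(sharded_path("zzzz9999", 3))   # z/z/z/zzzz9999

    # Rough sizing for two levels of lowercase letters (assumed even spread):
    directories = 26 ** 2                # 676 leaf directories
    print(directories, 10_000_000 // directories)  # 676 14792
```

With 10 million files and two letter levels this works out to roughly 15k files per leaf directory, which matches the figure quoted above.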
While this problem isn't *near* as bad on XFS as it was on older filesystems,
where more than maybe 500-1000 files in a directory would result in 'ls'
commands taking over a minute...
It's still a good idea to filename hash large collections of files of
similar types into a directory tree, as Joe recommends. The best approach
I myself have seen to this is to hash a filename of
Going as deep as necessary to reduce the size of the directories. What
you lose in needing to cache the extra directory levels is outweighed (probably
far outweighed) by not having to handle Directories Of Unusual Size.
Note that I didn't actually trim the filename proper; the final file still has
its full name. This hash is easy to build, as long as you fix the number of
levels in advance... and if you need to make it deeper, later, it's easy to build a
shell script that crawls the current tree and adds the next layer.
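
As a rough illustration of that "add the next layer" step, here is a sketch in Python rather than shell. The name `add_layer` is hypothetical, and it assumes the existing tree already uses one character of the filename per directory level, with every filename longer than the current depth. Try it on a copy before pointing it at real data.

```python
import os


def _subdirs(path):
    """Return the immediate subdirectories of `path`."""
    with os.scandir(path) as entries:
        return [e.path for e in entries if e.is_dir(follow_symlinks=False)]


def add_layer(root: str, current_depth: int) -> int:
    """Move each file one level deeper, keyed on its next character.

    Files are expected `current_depth` directories below `root`.
    Returns the number of files moved. Filenames are left unchanged.
    """
    # Walk down to the current leaf directories.
    leaves = [root]
    for _ in range(current_depth):
        leaves = [sub for d in leaves for sub in _subdirs(d)]

    moved = 0
    for leaf in leaves:
        # Snapshot the names first, since new subdirectories get created here.
        with os.scandir(leaf) as entries:
            names = [e.name for e in entries if e.is_file(follow_symlinks=False)]
        for name in names:
            if len(name) <= current_depth:
                continue  # too short to shard further; leave it in place
            target_dir = os.path.join(leaf, name[current_depth])
            os.makedirs(target_dir, exist_ok=True)
            os.rename(os.path.join(leaf, name), os.path.join(target_dir, name))
            moved += 1
    return moved
```

For example, `add_layer("/data/images", 2)` (the path is a placeholder) would turn a/a/aaaa0001 into a/a/a/aaaa0001. Because `os.rename` stays within the same filesystem here, each move is a metadata operation and no file data is copied.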
Jay R. Ashworth Baylink jra@xxxxxxxxxxx
Designer The Things I Think RFC 2100
Ashworth & Associates http://baylink.pitas.com 2000 Land Rover DII
St Petersburg FL USA #natog +1 727 647 1274