On Fri, Feb 14, 2003 at 04:49:14PM -0600, Steve Lord wrote:
> On Fri, 2003-02-14 at 16:43, Andi Kleen wrote:
> > > linux/fs/xfs/pagebuf/page_buf.c - 1.94
> > > - base the number of hash buckets for xfs metadata on the amount of
> > > memory in the system, and use the same hash algorithm as the
> > > regular buffer cache.
> >
> > Hi,
> >
> > I cannot review the change because it hasn't reached the public
> > CVS server yet. But in general I would be very careful with existing
> > hash functions in Linux. Lots of them are suffering from the
> > "i don't use a prime modulo, but some cheap binary modulo" problem,
> > causing bad hashing, which is normally papered around with making the
> > cache size resizing overly aggressive, guaranteeing slow cache misses
> > when the hash table is accessed.
> >
> > e.g.
> >
> > Buffer-cache hash table entries: 65536 (order: 6, 262144 bytes)
> >
> > That's far too big, definitely guaranteeing cache misses for most
> > bucket accesses even on an Itanium 2 ;)
> >
> > Probably you're better off with a much smaller table and a better hash
> > function
> > using primes.
>
> Well, this one was actually better than the existing one - as measured
> by a profiler while running spec. But yes, it definitely needs to be
> capped. I did put an upper ceiling on the size, probably still too
> aggressive though. As for hash algorithm, its the one from fs/buffer.c
See
http://www.citi.umich.edu/projects/linux-scalability/reports/hash.html
for more details. It also gives some suggestions on how to improve it.
I think Linux still uses the hash functions criticized in there.
-Andi
|