xfs
[Top] [All Lists]

Re: TAKE - improve xfs metadata hashing

To: Steve Lord <lord@xxxxxxx>
Subject: Re: TAKE - improve xfs metadata hashing
From: Andi Kleen <ak@xxxxxxx>
Date: Sat, 15 Feb 2003 00:15:28 +0100
Cc: Andi Kleen <ak@xxxxxxx>, linux-xfs@xxxxxxxxxxx
In-reply-to: <1045262954.15864.18.camel@xxxxxxxxxxxxxxxxxxxx>
References: <200302142225.h1EMPKo22498@xxxxxxxxxxxxxxxxxxxx> <20030214224301.GA10746@xxxxxxxxxxxxx> <1045262954.15864.18.camel@xxxxxxxxxxxxxxxxxxxx>
Sender: linux-xfs-bounce@xxxxxxxxxxx
On Fri, Feb 14, 2003 at 04:49:14PM -0600, Steve Lord wrote:
> On Fri, 2003-02-14 at 16:43, Andi Kleen wrote:
> > > linux/fs/xfs/pagebuf/page_buf.c - 1.94
> > >   - base the number of hash buckets for xfs metadata on the amount of
> > >     memory in the system, and use the same hash algorithm as the 
> > >     regular buffer cache.
> > 
> > Hi,
> > 
> > I cannot review the change because it hasn't reached the public
> > CVS server yet. But in general I would be very careful with existing
> > hash functions in Linux. Lots of them are suffering from  the
> > "i don't use a prime modulo, but some cheap binary modulo" problem,
> > causing bad hashing, which is normally papered around with making the 
> > cache size resizing overly aggressive, guaranteeing slow cache misses
> > when the hash table is accessed.
> > 
> > e.g. 
> > 
> > Buffer-cache hash table entries: 65536 (order: 6, 262144 bytes)
> > 
> > That's far too big, definitely guaranteeing cache misses for most
> > bucket accesses even on an Itanium 2 ;)
> > 
> > Probably you're better off with a much smaller table and a better hash 
> > function
> > using primes.
> 
> Well, this one was actually better than the existing one - as measured
> by a profiler while running spec. But yes, it definitely needs to be
> capped. I did put an upper ceiling on the size, probably still too
> aggressive though. As for hash algorithm, its the one from fs/buffer.c

See 

http://www.citi.umich.edu/projects/linux-scalability/reports/hash.html

for more details. It also gives some suggestions on how to improve it.
I think Linux still uses the hash functions criticized in there.

-Andi


<Prev in Thread] Current Thread [Next in Thread>