On Mon, Mar 21, 2005 at 04:25:56PM -0500, John Heffner wrote:
> On Sat, 19 Mar 2005, Andi Kleen wrote:
> > Stephen Hemminger <shemminger@xxxxxxxx> writes:
> > > Since developers want to experiment with different congestion
> > > control mechanisms, and the kernel is getting bloated with overlapping
> > > data structure and code for multiple algorithms; here is a patch to
> > > split out the Reno, Vegas, Westwood, BIC congestion control stuff
> > > into an infrastructure similar to the I/O schedulers.
> > [...]
> > Did you do any benchmarks to check that this won't slow it down?
> > I would recommend trying it on an IA64 machine if possible. In the
> > past we found that adding indirect function calls to networking on
> > IA64 caused measurable slowdowns in macrobenchmarks.
> > In that case it was LSM callbacks, but your code looks like it will
> > add even more.
> Is there a canonical benchmark?
For the LSM case we saw the problem when running netperf over loopback.
The LSM hooks added only one or two indirect calls per packet, but that
already made a noticeable difference on IA64 boxes.
On other systems the difference is unnoticeable.
> Would you really expect a single extra indirect call per ack to have a
> significant performance impact? This is surprising to me. Where does the
> cost come from? Replacing instruction cache lines?
It was never quite clear to me. Some kind of instruction stalls in the CPUs.
One theory, not a very good one, was that McKinley really likes
to have its branch registers loaded early for indirect calls, and gcc
doesn't even attempt this.
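
To make the comparison concrete, here is a minimal sketch of the two call
shapes under discussion: a per-ACK handler called directly versus the same
handler reached through an ops table. All names here (struct conn,
struct cong_ops, reno_on_ack, ack_direct, ack_indirect) are hypothetical
and are not taken from the patch, and the window update is a simplified
Reno-style rule chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-connection state; not the kernel's struct tcp_sock. */
struct conn {
    uint32_t cwnd;      /* congestion window, in segments */
    uint32_t ssthresh;  /* slow-start threshold */
    uint32_t cwnd_cnt;  /* ACKs counted toward the next cwnd increase */
};

/* Hypothetical ops table: one function pointer followed on every ACK. */
struct cong_ops {
    const char *name;
    void (*on_ack)(struct conn *c);
};

/* Simplified Reno-style growth: +1 per ACK in slow start,
 * +1 per cwnd ACKs in congestion avoidance. */
static void reno_on_ack(struct conn *c)
{
    if (c->cwnd < c->ssthresh) {
        c->cwnd++;
        return;
    }
    if (++c->cwnd_cnt >= c->cwnd) {
        c->cwnd_cnt = 0;
        c->cwnd++;
    }
}

static const struct cong_ops reno_ops = { "reno", reno_on_ack };

/* Direct call: the target is known at compile time and may be inlined. */
static void ack_direct(struct conn *c)
{
    reno_on_ack(c);
}

/* Indirect call: the target is loaded from the ops table at run time,
 * so the CPU has to resolve the branch target before it can proceed. */
static void ack_indirect(struct conn *c, const struct cong_ops *ops)
{
    assert(ops != NULL && ops->on_ack != NULL);
    ops->on_ack(c);
}
```

Both paths compute the same window; only the dispatch differs. Timing the
two in a tight loop would be a microbenchmark at best, and it would not
necessarily reproduce the macrobenchmark effect seen with netperf.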