
Re: [PATCH] Improve behaviour of Netlink Sockets

To: jamal <hadi@xxxxxxxxxx>
Subject: Re: [PATCH] Improve behaviour of Netlink Sockets
From: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Date: Tue, 28 Sep 2004 21:11:59 +1000
Cc: Pablo Neira <pablo@xxxxxxxxxxx>, "David S. Miller" <davem@xxxxxxxxxx>, netdev@xxxxxxxxxxx
In-reply-to: <1096367787.8662.146.camel@jzny.localdomain>
References: <20040924032440.GB6384@gondor.apana.org.au> <1096289189.1075.37.camel@jzny.localdomain> <20040927213607.GD7243@gondor.apana.org.au> <1096339407.8660.33.camel@jzny.localdomain> <20040928024614.GA9911@gondor.apana.org.au> <1096340772.8659.51.camel@jzny.localdomain> <20040928032321.GB10116@gondor.apana.org.au> <1096343125.8661.96.camel@jzny.localdomain> <20040928035921.GA10675@gondor.apana.org.au> <1096367787.8662.146.camel@jzny.localdomain>
Sender: netdev-bounce@xxxxxxxxxxx
User-agent: Mutt/1.5.6+20040722i
On Tue, Sep 28, 2004 at 06:36:27AM -0400, jamal wrote:
> 
> er, what about the host scope route msgs generated by the same script? ;->

You mean rtmsg_fib()? It also allocates a size much smaller than
NLM_GOODSIZE, namely sizeof(struct rtmsg) + 256.  However, the
trim function might be able to shave a bit off there, since the
256 bytes are mostly reserved for multiple nexthops.
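
For concreteness, the framing macros from <linux/netlink.h> show what
that reservation amounts to once the netlink header and padding are
added; a tiny userspace sketch (the 256-byte figure is the one quoted
above):

#include <stdio.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

int main(void)
{
	/* rtmsg_fib()'s reservation: fixed header plus room for
	 * attributes such as multiple nexthops. */
	size_t payload = sizeof(struct rtmsg) + 256;

	/* NLMSG_SPACE() adds the nlmsghdr and alignment padding,
	 * i.e. the on-queue footprint of one such event message. */
	printf("payload:     %zu bytes\n", payload);
	printf("with header: %zu bytes\n", (size_t)NLMSG_SPACE(payload));
	return 0;
}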

AFAIK, no async netlink event function uses NLM_GOODSIZE at all for
obvious reasons.
 
> The state is per socket. You may need an intermediate queue etc which
> feeds to each user socket registered for the event. The socket queue
> acts essentially as a retransmitQ for broadcast state. Just waving my
> hands throwing ideas here of course.

Aha, you've fallen into my trap :) Now let me demonstrate why having
an intermediate queue doesn't help at all.

Holding the packet on an intermediate queue is exactly the same as
holding it on the receive queue of the destination socket.  The reason
is that we're simply cloning the packets, and clones share the one
underlying data buffer.  So moving a message from one queue to another
does not reduce system resource usage by much.
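
To illustrate, here is a loose sketch of the per-listener delivery
step, modelled on the 2.6 code but simplified; it uses kernel-internal
APIs, so take it as illustration rather than something compilable on
its own:

/* Each listener gets a clone of the same skb, and clones share one
 * data buffer: N receive queues pin roughly the same memory as one
 * intermediate queue would. */
static void deliver_one(struct sock *sk, struct sk_buff *skb)
{
	struct sk_buff *clone;

	/* The overrun test is per socket: a full queue penalises
	 * only this listener, never its siblings. */
	if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf) {
		sk->sk_err = ENOBUFS;	/* the reader sees an overrun */
		sk->sk_error_report(sk);
		return;
	}

	clone = skb_clone(skb, GFP_ATOMIC);
	if (clone == NULL)
		return;
	skb_set_owner_r(clone, sk);	/* charge the memory to this socket */
	skb_queue_tail(&sk->sk_receive_queue, clone);
	sk->sk_data_ready(sk, clone->len);
}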

There is the cost of cloning the skbs.  However, that's an orthogonal
issue altogether.  We can reduce the cost there by making the packets
bigger, either at the sender end by coalescing successive messages,
or by merging them in netlink_broadcast.
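
Coalescing would need no receiver changes either: netlink readers
already walk several messages out of a single recv() using the
standard macros, so a merged buffer parses exactly like a lone
message.  A minimal sketch:

#include <sys/socket.h>
#include <linux/netlink.h>

static void read_events(int fd)
{
	char buf[8192];
	int len = recv(fd, buf, sizeof(buf), 0);
	struct nlmsghdr *nh;

	/* One pass handles however many messages the buffer carries. */
	for (nh = (struct nlmsghdr *)buf; len > 0 && NLMSG_OK(nh, len);
	     nh = NLMSG_NEXT(nh, len)) {
		if (nh->nlmsg_type == NLMSG_DONE)
			break;
		/* ... handle one event ... */
	}
}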

Granted, an intermediate queue will avoid overruns if it is large
enough.  However, making all the receive queues as big as your
intermediate queue will have exactly the same effect.

In fact this has an advantage over the intermediate queue.  With the
latter, you need to hold the packet in place whether the applications
need it or not.  Whereas currently each application can choose whether
it wants to receive a large batch of events, and if so, how large.

Remember, just because one application overruns, it doesn't mean that
the other recipients of the same skb will overrun.  They can continue
to receive messages for as long as their receive queues allow it.

So applications that really want to see every event should have a
very large receive queue.  Those that can recover easily can make do
with a much smaller one.
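
For example, a listener that must see every route event can simply ask
for a big queue up front.  A minimal sketch, where the multicast group
and the 1MB figure are purely illustrative:

#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

int main(void)
{
	struct sockaddr_nl addr;
	int rcvbuf = 1 << 20;	/* "very large" is relative; pick to taste */
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

	if (fd < 0) {
		perror("socket");
		return 1;
	}

	/* Size this socket's receive queue; a listener that recovers
	 * easily would pick something much smaller. */
	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf)) < 0)
		perror("setsockopt");

	memset(&addr, 0, sizeof(addr));
	addr.nl_family = AF_NETLINK;
	addr.nl_groups = RTMGRP_IPV4_ROUTE;	/* e.g. the route events above */
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		perror("bind");
		return 1;
	}

	/* ... recv() loop as usual ... */
	return 0;
}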

> We can't control the user space script, for example, that caused those
> events. We shouldn't, in fact. Congestion control in this context equates
> to a desire not to overload the reader (of events). In other words, if you
> know the reader's capacity for swallowing events, then you don't exceed
> that rate of sending to said reader. Knowing the capacity requires even
> more state:

I understand what you mean.  But unless you can quench the source of
the messages/events, all you can do is batch them up somewhere.  What
I'm arguing is that batching them up in the middle is no better
than batching them up at the destination.  In fact it's worse, in that
it takes away choice from the receiving application.

> any waiting socket. Just overrun them immediately and they (readers) get
> forced to reread the state. Of course, this queue will have to be larger
> than any of the active sockets' recv queues.

Jamal, maybe I've got the wrong impression, but it almost seems
that you think that if one application overruns, then everyone
else on that multicast address will overrun as well.  This is
definitely not the case.

With an intermediate queue, you will in fact impose overruns on
everyone when it overflows, which seems to be a step backwards.
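
Note that an individual listener that does overrun finds out about it
and can resynchronise on its own.  A sketch of the usual pattern,
where handle_events() and resync_from_dump() are hypothetical helpers
(the latter would re-request a full dump of the state):

#include <errno.h>
#include <sys/socket.h>

void handle_events(char *buf, int len);		/* hypothetical */
void resync_from_dump(int fd);			/* hypothetical */

static void event_loop(int fd)
{
	char buf[8192];

	for (;;) {
		int len = recv(fd, buf, sizeof(buf), 0);

		if (len < 0 && errno == ENOBUFS) {
			/* Our queue overflowed: events were lost for us
			 * alone, so reread the state and carry on. */
			resync_from_dump(fd);
			continue;
		}
		if (len <= 0)
			break;
		handle_events(buf, len);
	}
}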

> The moral of this is: you could do it if you wanted - aint trivial.

Well, this is not what I'd call congestion control :) Let's take a
TCP analogy.  This is like batching up TCP packets on a router in
the middle rather than throttling the sender.  Congestion control
is where you shut the sender up.
 
> Except if you drop incoming bug reports and drop them early (the point of
> that intermediate queue).

Nah.  Dropping them early is overrunning early.

> BTW, Davem gets away with this congestion control alg all the time.
> Heck, I think his sanity survives because of it - I bet you he's got this
> thread under congestion control right now ;->

Yep, his overrun flag sure is set :)
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@xxxxxxxxxxxxxxxxxxx>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
