netdev
[Top] [All Lists]

Re: [PATCH 2/4] deferred drop, __parent workaround, reshape_fail

To: Harald Welte <laforge@xxxxxxxxxxxxx>
Subject: Re: [PATCH 2/4] deferred drop, __parent workaround, reshape_fail
From: jamal <hadi@xxxxxxxxxx>
Date: 16 Aug 2004 09:29:59 -0400
Cc: sandr8@xxxxxxxxxxxx, devik@xxxxxx, netdev@xxxxxxxxxxx, netfilter-devel@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20040816073530.GI15418@sunbeam2>
Organization: jamalopolous
References: <411C0FCE.9060906@crocetta.org> <1092401484.1043.30.camel@jzny.localdomain> <411CCB98.4080904@crocetta.org> <1092518370.2876.3.camel@jzny.localdomain> <20040816073530.GI15418@sunbeam2>
Reply-to: hadi@xxxxxxxxxx
Sender: netdev-bounce@xxxxxxxxxxx
On Mon, 2004-08-16 at 03:35, Harald Welte wrote:
> On Sat, Aug 14, 2004 at 05:21:31PM -0400, jamal wrote:
> 

> > Also is their a corrective factor that happens once the _accounting_ 
> > data has been shipped? Example: 
> > - account for packet
> > - ship accounting data to some billing server
> > - oops, unbill
> > - what now? 
> 
> Yes, this is a race condition.  However, I don't this is not very likely
> to occurr, since the accounting data is by default only sent to
> userspace via ctnetlink once the connection tracking entry is deleted.

ah, ok. so problem solved then. 

> Yes, you can read it while the connection is still alive, but this will
> not reset/update the counters, but rather give you a current snapshot.
> If you send this to your accounting server, the accounting server has to
> cope with the fact that this intermediate snapshot can be updated by
> some later data.  It SHOULD not care whether this later data for the
> same flow has bigger or smaller byte/packet counters. [and
> it is very unlikely that the total will be lower, since then in the
> timeframe [snapshot, terminations] more packets have to be dropped than
> accepted.  Still, if this is documented with ctnetlink I'm perfectly
> fine which such behaviour.

I am too. Good stuff.
I think 99.9% of accounting would be happy with getting data after the
call is done; the other 0.01% may have to live with extra packets later
which undo things. 
Are you working on something along the IPFIX protocol for transport?

> > BTW, what happens if you clone the packet below netfilter and send
> > several copies of it possibly over several different interfaces? This
> > may happen with tc extensions.
> 
> oh yes, I think somebody has written a similar iptables target, too. I'm
> not sure whether there is a good solution for the 'unbill' feature.  Do
> you have any thoughts/recommendations for this?

Let me think about it.
Clearly the best place to account for things is on the wire once the
packet has left the box ;-> So the closest you are to the wire, the
better. How open are you to move accounting further down? My thoughts
are along the lines of incrementing the contrack counters at the qdisc
level. Since you transport after the structure has been deleted, it
should work out fine and fair billing will be taken care of.

> The reason for not delaying accounting update until qdisc has happened
> is locking.  Then we would have to re-grab the conntrack write lock to
> make the counter update... whrereas in my patch counter updates happen
> while we are already under write lock for the timer/timeout update.

Yikes. That sort of kills my thought above ;->
Has someone done experimented and figured how expensive it would be to
do it at the qdisc level? Note, you can probably have a low level
grained
lock just for stats.

> > Lets talk about this issue first instead of confusing it with everything
> > else you have in other patches.
> 
> Also, if this 'unbill' feature gets into the kernel in some form, I
> would definitely make it a CONFIG_ or sysctl... after all people could
> be interested in the Rx side only...

Agreed.

Heres thinking developed while responding to you.
Contracking to use generic stats counters that we plan to use for MPLS
(amongst other things). I have attached a patch i was going to shoot to
Dave - needs testing against latest kernels. The code is reproduced from
the net/sched.

My thinking is that at the qdisc level instead of saying things like:
sch->stats.packets++;
you do:
INC_STATS(skb,sch->stats,reason_code)

INC_STATS is generic (goes in the generic stats code in attached patch)
and will have an ifdef for contrack billing (which includes unbilling
depending on reason code). Reason code could be results that are now
returned.
As an example NET_XMIT_DROP is definetely unbilling while
NET_XMIT_SUCCESS implies bill. 

I think the current stats structure may not cover all cases but we can
discuss that before i push patch to Dave.

cheers,
jamal



Attachment: stats1.patch
Description: Text document

<Prev in Thread] Current Thread [Next in Thread>