On Tue, 2004-08-17 at 09:40, sandr8 wrote:
Yes, this is a hard question. Did you see the suggestion i proposed
if it is the centralization of the stats with the reason code that,
for what concerns the ACCT, says whether to bill or unbill i
think it is _really_ great :)
still, for what concerns the multiple interface delivery of the
same packet i don't see how it would be solved...
Such packets are cloned or copied. I am going to assume the conntrack
data remains intact in both cases. LaForge?
BTW, although i mentioned the multiple interfaces as an issue - thinking
a little more i see retransmissions from TCP as well (when enqueue drops
because of full queue) being a problem.
imho, the point maybe is that a scheduler should work at layer 3,
am i wrong?
i mean: i asked myself the same question and answered that
the right level should be the third... this would account for tcp
retransmissions as well as forward error correction packets
added by some application on top of udp, or retransmissions by
applications on udp... or... whatever...
maybe my reasoning is less foggy if i first answer to an other
I think the issue starts with defining what resource is being accounted
for. In my view, you are accounting for both CPU and bandwidth.
Let's start by asking: what is the resource being accounted for?
i would like to account for the number of bytes sent to the wire
on behalf of each flow :)
Harald's patch bills in that case as well for each retransmitted packet
that gets dropped because of full Q.
So best place to really unbill is at the qdisc level.
The only place for now i see that happening is in the case of drops
The dev.c area after the enqueue() attempt is a more dangerous place to do
it (in case the skb doesn't exist because something freed it when
enqueue was called; also because that's one area open for returning more
intelligent congestion-level indicating codes).
this seems not to be possible, afaik, with that patch i wrote. the only
skbs that are freed are those that are internally allocated. and the
only kfree_skb() that can happen on skbs that are enqueued in dev.c
should be those in case of a TC_ACT_QUEUED or TC_ACT_STOLEN,
where they should just decrement the user counter. i say "should" since
this is the most reasonable assumption i managed to make, but
this is your field and you definitely know it much better than me :)
in that case, btw, dev.c doesn't get any drop return code...
if a drop return code is given, the packet is not freed internally, but
only "externally". (for the "where"... the question is open in "billing 1")
where could a skb be freed then?
[ i'm not insisting with that patch, i'm just trying to say that, if i don't
rave, it should not be dangerous to do that after the enqueue()...
then, it's just that for the moment i can't imagine a different
way to do things in that place :) yes, there could be a slight
variation with a skb_get() right before the enqueue and all the
kfree_skb() at their place inside the code, but then somewhere
we should always add a kfree_skb... ouch... and we would need
a by ref skb anyway to get the packet that has been dropped
and if it's not the same as the enqueued one also the enqueued
one should pass through one more kfree_skb()... horrible, more
complex and slower i'd say... ]
would there be any magic to have some conntrack data per device
without having to execute the actual tracking twice but without locking
the whole conntrack either?
That is the challenge at the moment.
For starters i don't see it as an issue right now to do locking.
It's a cost for the feature since Harald's patch is in.
In the future we should make accounting a feature that could be turned
on regardless of conntracking and skbs should carry an accounting metadata with
i need to think thoroughly on it... depending on where that information is
kept, the complexity of some operations can change a lot... and i should
not only selfishly think of the operations i need but look at it from the
outside to have a less partisan viewpoint on the problem and find the
most generally good solution possible.
what could be the "magic" to let the
conntrack do the hard work just once and handle the additional traffic
policing information separately, in another data structure that is
on a device basis? that could also be the place to count how much
a given flow is backlogged on a given interface... which could help in
choosing the dropping action... sorry, am i going too much further?
No i think your idea is valuable.
The challenge is say you have a million connections, then do you
have a million locks (one per structure)? I think we could reduce it
by having a pool of stats sharing a lock (maybe by placing them in a
shared hash table with each bucket having a lock).
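
A minimal userspace sketch of the shared-lock idea above, assuming a hash table of per-flow byte counters where every bucket carries its own lock. All names here (flow_stats, stats_add, BUCKET_BITS) are hypothetical, and a C11 atomic_flag stands in for whatever lock the kernel code would really use; this is not taken from any existing patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define BUCKET_BITS 6u                  /* hypothetical table size: 64 buckets */
#define NBUCKETS    (1u << BUCKET_BITS)

struct flow_stats {
    uint32_t flow_id;                   /* hypothetical flow key */
    int64_t bytes;                      /* bytes currently billed to the flow */
    struct flow_stats *next;            /* chain of flows sharing the bucket */
};

struct bucket {
    atomic_flag lock;                   /* one lock for every flow in the bucket */
    struct flow_stats *head;
};

static struct bucket table[NBUCKETS];

static void stats_init(void)
{
    for (unsigned i = 0; i < NBUCKETS; i++) {
        atomic_flag_clear(&table[i].lock);
        table[i].head = NULL;
    }
}

static struct bucket *bucket_of(uint32_t flow_id)
{
    /* multiplicative hash, keeping the top BUCKET_BITS bits */
    uint32_t idx = (uint32_t)(flow_id * 2654435761u) >> (32u - BUCKET_BITS);

    assert(idx < NBUCKETS);
    return &table[idx];
}

static void bucket_lock(struct bucket *b)
{
    while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
        ;                               /* spin until the bucket is free */
}

static void bucket_unlock(struct bucket *b)
{
    atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/*
 * Add delta bytes to a flow (negative delta unbills). Only the flow's
 * bucket is locked, so flows in other buckets are not serialized.
 * Returns 0 and stores the new total in *total, or -1 if allocation fails.
 */
static int stats_add(uint32_t flow_id, int64_t delta, int64_t *total)
{
    struct bucket *b = bucket_of(flow_id);
    struct flow_stats *fs;
    int ret = -1;

    bucket_lock(b);
    for (fs = b->head; fs; fs = fs->next)
        if (fs->flow_id == flow_id)
            break;
    if (!fs) {
        fs = calloc(1, sizeof(*fs));
        if (fs) {
            fs->flow_id = flow_id;
            fs->next = b->head;
            b->head = fs;
        }
    }
    if (fs) {
        fs->bytes += delta;
        *total = fs->bytes;
        ret = 0;
    }
    bucket_unlock(b);
    return ret;
}
```

With a million connections and 64 buckets this needs 64 locks instead of a million; the bucket count is the knob between "too many" and "too few".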
yeah, that could be the right compromise :)
this is the case. i could do it on my own from inside my code, but then
"pollute" the information seen from other parts of the kernel code and i
You can't have too many locks and you can't have too few ;->
On your qdisc you say:
it is not ready, but to put it briefly, i'm trying to serve first whoever
has been _served_ the least.
from the first experiments i have made this behaves pretty well and smoothly,
but i've noticed that _not_ unbilling can be pretty unfair towards udp
since they always keep sending.
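
A rough sketch of the "serve first whoever has been served the least" rule, assuming a per-flow counter of bytes served and a simple linear scan over the flows. The struct and function names are hypothetical and the real qdisc is described as keeping an ordered priority queue instead of scanning; this only illustrates the selection rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct flow {
    uint64_t served;    /* bytes accounted to this flow so far */
    unsigned backlog;   /* packets currently waiting in the queue */
};

/* Index of the backlogged flow with the smallest served count, or -1. */
static int pick_least_served(const struct flow *flows, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (flows[i].backlog == 0)
            continue;                   /* nothing to send for this flow */
        if (best < 0 || flows[i].served < flows[best].served)
            best = (int)i;
    }
    return best;
}

/* Dequeue one packet of len bytes from the least-served flow. */
static int serve_one(struct flow *flows, size_t n, unsigned len)
{
    int i = pick_least_served(flows, n);

    if (i < 0)
        return -1;                      /* every flow is idle */
    assert(flows[i].backlog > 0);
    flows[i].backlog--;
    flows[i].served += len;
    return i;
}
```

If dropped packets are never unbilled, a flow that keeps sending into a full queue sees its served count grow for bytes that never reached the wire, so it keeps losing the comparison above.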
If qdisc drops on full Q and unbills i think it should work, no?
introduce a _new_ unfairness between those flows that pass through my qdisc
and those that don't... to sum it up... it would be pretty unclean
you are thinking in that perspective because of tcp? as i said above, i
would stop at layer 3...
btw, if i don't misunderstand what you mean, i guess it's when tcp is
retransmitting that that field should somehow be set... is it as feasible
when we are not on an end-point as when we are an endpoint? btw, we should
then do the same with the other protocols and, for
If it drops because they abused a bandwidth level, shouldn't you punish
them still? I think you should, but your mileage may vary.
Note you also don't want to unbill more than once. If not maybe you can
introduce something on the skb to indicate unbilling-happened (if done
by policer) so root qdisc doesn't unbill again.
example with udp, it would become application dependent... a suicide?
it simply has a priority dequeue that is maintained ordered on the
if no drop occurs, then accounting before enqueueing simply forecasts
that will have been attained up to the packet currently being enqueued
when it will be dequeued. [ much easier to code than to say... ]
I think i understand.
A packet that gets enqueued is _guaranteed_ to be transmitted unless
overruled by admin policy.
Ok, how about the idea of adding skb->unbilled which gets set when
unbilling happens (in the aggregated stats_incr()). skb->unbilled gets
zeroed at the root qdisc after return from enqueueing.
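
A small userspace sketch of that proposal, assuming a stand-in packet struct with an unbilled flag. struct pkt, struct acct and root_enqueue() are hypothetical stand-ins for the skb, the aggregated stats and the root qdisc path; the two boolean arguments merely simulate a policer drop and a full queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct pkt {
    unsigned len;
    bool unbilled;      /* stand-in for the proposed skb->unbilled */
};

struct acct {
    int64_t bytes;      /* bytes currently billed to the flow */
};

static void acct_bill(struct acct *a, const struct pkt *p)
{
    a->bytes += p->len;
}

/* Unbill at most once per enqueue attempt; returns true if it did. */
static bool acct_unbill(struct acct *a, struct pkt *p)
{
    if (p->unbilled)
        return false;   /* someone (e.g. a policer) already unbilled */
    a->bytes -= p->len;
    assert(a->bytes >= 0);
    p->unbilled = true;
    return true;
}

/*
 * Root enqueue path: bill first, unbill on drop, then zero the flag
 * after returning from enqueueing. Returns 0 if queued, -1 if dropped.
 */
static int root_enqueue(struct acct *a, struct pkt *p,
                        bool policer_drops, bool queue_full)
{
    bool dropped = false;

    acct_bill(a, p);
    if (policer_drops) {
        acct_unbill(a, p);      /* inner drop unbills and marks the packet */
        dropped = true;
    } else if (queue_full) {
        dropped = true;         /* plain drop, nobody has unbilled yet */
    }
    if (dropped)
        acct_unbill(a, p);      /* no-op when the flag is already set */
    p->unbilled = false;        /* reset for any later enqueue attempt */
    return dropped ? -1 : 0;
}
```

The flag only guards against double unbilling within one enqueue attempt; whether a policer drop should be unbilled at all is the policy question raised earlier in the thread.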
sorry?? i'm lost... maybe there's something implied i can't get...
do you agree it's not the same skb that will be re-billed