On Wed, 5 Jan 2005, jamal wrote:
Ok, I am confused - I thought you guys _wanted this_ ;->
I'm confused too now. We don't want "queue packets" - that's the
'newish' policy I was referring to. (newish in quotes, as I'm not
sure from when this behaviour dates).
The issue is about message obsolescence more than it is about reliability.
So we do want what you think we want ;) (I think).
Without this, the scenario you played for us before was:
1)=>send a few LSAs from user space, pull cable before they go out
(this will happen if you are sending sufficiently large amounts of
packets, i.e. the network is busy),
2)=>user space gets notified via netlink, device shuts down access to
DMA, packets are queued anyway and you have no ability to say "ooops,
sorry, take that packet back"
- we noticed this behaviour because of OSPF
Users reported ospfd would cease to send packets on all interfaces,
(with certain drivers) because /one/ interface was link-down.
We can work around this easily by opening a socket per interface - at
present we simply punt OSPF packets down a single raw socket and rely
on IP_HDRINCL to have the kernel route the packet out of the correct
interface (IP_MULTICAST_IF for multicast-destined packets).
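A rough sketch of the socket-per-interface workaround, in Python rather
than ospfd's C, purely for illustration. The helper names
(make_ospf_socket, open_per_interface_sockets, send_on) are hypothetical
and not taken from ospfd, and the default factory needs root because it
opens a raw socket. It also assumes the stall shows up as a send that
would block on the affected socket; with one non-blocking socket per
interface, that only affects the interface concerned.

```python
import socket

OSPF_PROTO = 89  # IP protocol number assigned to OSPF


def make_ospf_socket():
    """Raw socket on which the caller supplies the IP header (needs root)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_RAW, OSPF_PROTO)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_HDRINCL, 1)
    return s


def open_per_interface_sockets(local_addrs, make_socket=make_ospf_socket):
    """Return {local_addr: socket}, one non-blocking socket per interface."""
    socks = {}
    for addr in local_addrs:
        s = make_socket()
        # Send multicast from this socket out of the interface owning addr.
        s.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                     socket.inet_aton(addr))
        s.setblocking(False)
        socks[addr] = s
    return socks


def send_on(socks, local_addr, packet, dest):
    """Try to send on one interface's socket; False if it would block."""
    try:
        socks[local_addr].sendto(packet, (dest, 0))
        return True
    except BlockingIOError:
        # Only this interface's socket is stuck; the others keep working.
        return False
```

A caller would keep the returned dict keyed by interface address and
treat a False from send_on as "drop this packet for this interface".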
- The queueing does not affect OSPF terribly, it would affect other
protocols more.
OSPF implements its own 'synchronisation' facilities between
neighbours and can easily 'detect' obsolescent packets. So the
obsolescence issue does not affect it: routing information in stale
packets will not propagate, so they can't do much damage really. (It is
just unnecessary to queue and send such packets.)
However, other commonly used protocols are not as robust. Mostly
those where a protocol is used to distribute routing information to
passive listeners, e.g.:
- IPv4 ICMP based router-discovery (IRDP)
- IPv6 Router-advertisements
In these cases, the queuing behaviour is potentially dangerous and
could disrupt connectivity by propagating no-longer-valid routing
information.
a) You could do a move to another device at this point.
b) dumb app will continue sending
3)=>plug cable back in 2 minutes later, obsolete LSAs sent followed by
any new ones that may follow
Right. Except OSPF is robust enough against stale packets. Other
protocols are not.
With the patch, packets in #2 will be dropped.
As a matter of fact within those two minutes, if stopped, it is
probable the device watchdog timer will kick in and flush the DMA
but not the scheduler queues above it (which is where up to 1000
stale packets could be sitting).
Right, and our argument is that it doesn't make sense to send those
packets.
I've never heard of any UDP and/or raw application that expected a
kernel to queue packets if they could not be sent for lack of link or
other problem, and any which did are surely broken by definition? ;)
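To make the difference concrete, here is a toy model of the scenario
above (send while the cable is out, plug it back in later). It is not
kernel code: TxQueue, drop_when_down and replay are made-up names, and
the model ignores DMA rings and the watchdog entirely. It only contrasts
"queue while link is down" with "drop while link is down".

```python
from collections import deque


class TxQueue:
    """Toy transmit queue for one interface; not kernel code."""

    def __init__(self, drop_when_down):
        self.drop_when_down = drop_when_down
        self.link_up = True
        self.pending = deque()  # packets held while the link is down
        self.wire = []          # packets that actually went out

    def send(self, pkt):
        if self.link_up:
            self.wire.append(pkt)
            return True
        if self.drop_when_down:
            return False          # patched behaviour in this model: dropped
        self.pending.append(pkt)  # queueing behaviour: silently held
        return True

    def set_link(self, up):
        self.link_up = up
        # When the link returns, anything still queued goes out first.
        while up and self.pending:
            self.wire.append(self.pending.popleft())


def replay(drop_when_down):
    q = TxQueue(drop_when_down)
    q.set_link(False)   # cable pulled
    q.send("LSA-old")   # step 2: sent while the link is down
    q.set_link(True)    # step 3: cable back two minutes later
    q.send("LSA-new")
    return q.wire
```

With queueing, replay(False) puts the obsolete LSA on the wire ahead of
the new one; with dropping, replay(True) sends only the new one.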
What is it that you don't like now?
Sorry, wires crossed re "new behaviour". The "new new" behaviour in
the patch as you describe would be perfect.
PS: Another issue: could we have kernel-space IP fragmentation for
IP_HDRINCL sockets please? We currently have to implement
fragmentation ourselves, which seems silly given that the kernel
already has this functionality.
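For reference, the kind of work an IP_HDRINCL sender ends up doing
itself looks roughly like the sketch below. It is an illustration only,
not ospfd's code: fragment_payload is a hypothetical helper, it assumes
a 20-byte IP header with no options by default, and it only computes
the fragment offset and more-fragments flag; building and checksumming
each fragment's header is left out.

```python
def fragment_payload(payload: bytes, mtu: int, ihl: int = 20):
    """Split an IP payload into (offset_units, more_fragments, chunk) tuples.

    offset_units is the IPv4 fragment offset field, counted in 8-byte
    units, so every fragment except the last carries a multiple of 8 bytes.
    """
    max_data = (mtu - ihl) // 8 * 8  # largest 8-byte-aligned chunk that fits
    if max_data <= 0:
        raise ValueError("MTU too small to carry any payload")
    frags = []
    for start in range(0, len(payload), max_data):
        chunk = payload[start:start + max_data]
        more = start + max_data < len(payload)  # MF flag for this fragment
        frags.append((start // 8, more, chunk))
    return frags
```

For a 3000-byte payload and an MTU of 1500 this yields three fragments
of 1480, 1480 and 40 bytes at offsets 0, 185 and 370.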
Paul Jakma paul@xxxxxxxx paul@xxxxxxxxx Key ID: 64A2FF6A
The early bird who catches the worm works for someone who comes in late
and owns the worm farm.
-- Travis McGee