Hi,
as we have discussed some time ago, the linux network interface
does not preserve the reliable datalink service that upper layers
might expect from a lapb device. As a result of the discussion,
I've prepared the following patch, (against test10). As it changes a
documented kernel interface, it would be nice to include it before
2.4.0-final. The patch does the following:
- It documents the possible problems (packet loss due to backlog congestion
and packet re-ordering in SMP kernels) in x25-iface.txt and suggests
work-arounds.
- It changes the kernel interface to the lapb module. Its data_indication()
method now honors an integer return value. If the return value is
NET_RX_DROP, the lapb protocol will not confirm the packet (this will
cause the peer to retransmit the lost packet later).
- The drivers which use the lapb module are modified as required by the
new lapb return semantics.
I've done several tests with the lapbether driver by simulating
netif_rx() backlog congestion and verified that it works. For the other
drivers, I checked that they compile o.k. (anc cc'ed to the maintainers)
Henner
diff -ur linux/Documentation/networking/lapb-module.txt
x25fix/Documentation/networking/lapb-module.txt
--- linux/Documentation/networking/lapb-module.txt Mon May 10 22:00:10 1999
+++ x25fix/Documentation/networking/lapb-module.txt Mon Nov 6 21:29:15 2000
@@ -2,6 +2,8 @@
Jonathan Naylor 29.12.96
+Changed (Henner Eisen, 2000-10-29): int return value for data_indication()
+
The LAPB module will be a separately compiled module for use by any parts of
the Linux operating system that require a LAPB service. This document
defines the interfaces to, and the services provided by this module. The
@@ -37,7 +39,7 @@
void (*connect_indication)(int token, int reason);
void (*disconnect_confirmation)(int token, int reason);
void (*disconnect_indication)(int token, int reason);
- void (*data_indication)(int token, struct sk_buff *skb);
+ int (*data_indication)(int token, struct sk_buff *skb);
void (*data_transmit)(int token, struct sk_buff *skb);
};
@@ -240,7 +242,7 @@
system.
-void (*data_indication)(void *token, struct sk_buff *skb);
+int (*data_indication)(void *token, struct sk_buff *skb);
This is called by the LAPB module when data has been received from the
remote system that should be passed onto the next layer in the protocol
@@ -248,6 +250,10 @@
module will not perform any more actions on it. The skb->data pointer will
be pointing to the first byte of data after the LAPB header.
+This method should return NET_RX_DROP (as defined in the header
+file include/linux/netdevice.h) if and only if the frame was dropped
+before it could be delivered to the upper layer.
+
void (*data_transmit)(void *token, struct sk_buff *skb);
diff -ur linux/Documentation/networking/x25-iface.txt
x25fix/Documentation/networking/x25-iface.txt
--- linux/Documentation/networking/x25-iface.txt Tue Apr 28 23:22:04 1998
+++ x25fix/Documentation/networking/x25-iface.txt Mon Nov 6 21:29:15 2000
@@ -62,3 +62,62 @@
First Byte = 0x03
LAPB parameters. To be defined.
+
+
+
+Possible Problems
+=================
+
+(Henner Eisen, 2000-10-28)
+
+The X.25 packet layer protocol depends on a reliable datalink service.
+The LAPB protocol provides such reliable service. But this reliability
+is not preserved by the Linux network device driver interface:
+
+- With Linux 2.4.x (and above) SMP kernels, packet ordering is not
+ preserved. Even if a device driver calls netif_rx(skb1) and later
+ netif_rx(skb2), skb2 might be delivered to the network layer
+ earlier that skb1.
+- Data passed upstream by means of netif_rx() might be dropped by the
+ kernel if the backlog queue is congested.
+
+The X.25 packet layer protocol will detect this and reset the virtual
+call in question. But many upper layer protocols are not designed to
+handle such N-Reset events gracefully. And frequent N-Reset events
+will always degrade performance.
+
+Thus, driver authors should make netif_rx() as reliable as possible:
+
+SMP re-ordering will not occur if the driver's interrupt handler is
+always executed on the same CPU. Thus,
+
+- Driver authors should use irq affinity for the interrupt handler.
+
+The probability of packet loss due to backlog congestion can be
+reduced by the following measures or a combination thereof:
+
+(1) Drivers for kernel versions 2.4.x and above should always check the
+ return value of netif_rx(). If it returns NET_RX_DROP, the
+ driver's LAPB protocol must not confirm reception of the frame
+ to the peer.
+ This will reliably suppress packet loss. The LAPB protocol will
+ automatically cause the peer to re-transmit the dropped packet
+ later.
+ The lapb module interface was modified to support this. Its
+ data_indication() method should now transparently pass the
+ netif_rx() return value to the (lapb mopdule) caller.
+(2) Drivers for kernel versions 2.2.x should always check the global
+ variable netdev_dropping when a new frame is received. The driver
+ should only call netif_rx() if netdev_dropping is zero. Otherwise
+ the driver should not confirm delivery of the frame and drop it.
+ Alternatively, the driver can queue the frame internally and call
+ netif_rx() later when netif_dropping is 0 again. In that case, delivery
+ confirmation should also be deferred such that the internal queue
+ cannot grow to much.
+ This will not reliably avoid packet loss, but the probability
+ of packet loss in netif_rx() path will be significantly reduced.
+(3) Additionally, driver authors might consider to support
+ CONFIG_NET_HW_FLOWCONTROL. This allows the driver to be woken up
+ when a previously congested backlog queue becomes empty again.
+ The driver could uses this for flow-controlling the peer by means
+ of the LAPB protocol's flow-control service.
diff -ur linux/drivers/net/wan/comx-proto-lapb.c
x25fix/drivers/net/wan/comx-proto-lapb.c
--- linux/drivers/net/wan/comx-proto-lapb.c Tue Jul 18 17:06:39 2000
+++ x25fix/drivers/net/wan/comx-proto-lapb.c Mon Nov 6 21:29:15 2000
@@ -15,6 +15,9 @@
* Version 0.80 (99/06/14):
* - cleaned up the source code a bit
* - ported back to kernel, now works as non-module
+ *
+ * Changed (00/10/29, Henner Eisen):
+ * - comx_rx() / comxlapb_data_indication() return status.
*
*/
@@ -359,7 +362,7 @@
comx_status(ch->dev, ch->line_status);
}
-static void comxlapb_data_indication(void *token, struct sk_buff *skb)
+static int comxlapb_data_indication(void *token, struct sk_buff *skb)
{
struct comx_channel *ch = token;
@@ -373,7 +376,7 @@
skb->dev = ch->dev;
skb->mac.raw = skb->data;
- comx_rx(ch->dev, skb);
+ return comx_rx(ch->dev, skb);
}
static void comxlapb_data_transmit(void *token, struct sk_buff *skb)
diff -ur linux/drivers/net/wan/comx.c x25fix/drivers/net/wan/comx.c
--- linux/drivers/net/wan/comx.c Wed Oct 4 19:21:53 2000
+++ x25fix/drivers/net/wan/comx.c Mon Nov 6 21:29:15 2000
@@ -46,6 +46,9 @@
* Version 0.85 (00/08/15):
* - resource release on failure in comx_mkdir
* - fix return value on failure at comx_write_proc
+ *
+ * Changed (00/10/29, Henner Eisen):
+ * - comx_rx() / comxlapb_data_indication() return status.
*/
#define VERSION "0.85"
diff -ur linux/drivers/net/wan/lapbether.c x25fix/drivers/net/wan/lapbether.c
--- linux/drivers/net/wan/lapbether.c Mon May 22 20:51:40 2000
+++ x25fix/drivers/net/wan/lapbether.c Mon Nov 6 21:29:15 2000
@@ -16,6 +16,7 @@
*
* History
* LAPBETH 001 Jonathan Naylor Cloned from bpqether.c
+ * 2000-10-29 Henner Eisen lapb_data_indication() return status.
*/
#include <linux/errno.h>
@@ -185,7 +186,7 @@
return 0;
}
-static void lapbeth_data_indication(void *token, struct sk_buff *skb)
+static int lapbeth_data_indication(void *token, struct sk_buff *skb)
{
struct lapbethdev *lapbeth = (struct lapbethdev *)token;
unsigned char *ptr;
@@ -198,7 +199,7 @@
skb->mac.raw = skb->data;
skb->pkt_type = PACKET_HOST;
- netif_rx(skb);
+ return netif_rx(skb);
}
/*
diff -ur linux/drivers/net/wan/x25_asy.c x25fix/drivers/net/wan/x25_asy.c
--- linux/drivers/net/wan/x25_asy.c Mon May 22 20:51:40 2000
+++ x25fix/drivers/net/wan/x25_asy.c Mon Nov 6 21:29:15 2000
@@ -9,6 +9,9 @@
* recommendations. Its primarily for testing purposes. If you wanted
* to do CCITT then in theory all you need is to nick the HDLC async
* checksum routines from ppp.c
+ * Changes:
+ *
+ * 2000-10-29 Henner Eisen lapb_data_indication() return status.
*/
#include <linux/module.h>
@@ -390,9 +393,9 @@
* at the net layer.
*/
-static void x25_asy_data_indication(void *token, struct sk_buff *skb)
+static int x25_asy_data_indication(void *token, struct sk_buff *skb)
{
- netif_rx(skb);
+ return netif_rx(skb);
}
/*
diff -ur linux/include/linux/lapb.h x25fix/include/linux/lapb.h
--- linux/include/linux/lapb.h Sun Jan 19 14:47:26 1997
+++ x25fix/include/linux/lapb.h Mon Nov 6 21:29:15 2000
@@ -28,7 +28,7 @@
void (*connect_indication)(void *token, int reason);
void (*disconnect_confirmation)(void *token, int reason);
void (*disconnect_indication)(void *token, int reason);
- void (*data_indication)(void *token, struct sk_buff *skb);
+ int (*data_indication)(void *token, struct sk_buff *skb);
void (*data_transmit)(void *token, struct sk_buff *skb);
};
diff -ur linux/net/lapb/lapb_iface.c x25fix/net/lapb/lapb_iface.c
--- linux/net/lapb/lapb_iface.c Mon Nov 6 21:00:04 2000
+++ x25fix/net/lapb/lapb_iface.c Wed Nov 8 17:02:25 2000
@@ -12,6 +12,7 @@
* History
* LAPB 001 Jonathan Naylor Started Coding
* LAPB 002 Jonathan Naylor New timer architecture.
+ * 2000-10-29 Henner Eisen lapb_data_indication() return status.
*/
#include <linux/config.h>
@@ -370,14 +371,11 @@
int lapb_data_indication(lapb_cb *lapb, struct sk_buff *skb)
{
- int used = 0;
-
if (lapb->callbacks.data_indication != NULL) {
- (lapb->callbacks.data_indication)(lapb->token, skb);
- used = 1;
+ return (lapb->callbacks.data_indication)(lapb->token, skb);
}
-
- return used;
+ kfree_skb(skb);
+ return NET_RX_CN_HIGH; /* For now; must be != NET_RX_DROP */
}
int lapb_data_transmit(lapb_cb *lapb, struct sk_buff *skb)
diff -ur linux/net/lapb/lapb_in.c x25fix/net/lapb/lapb_in.c
--- linux/net/lapb/lapb_in.c Tue Feb 10 22:07:49 1998
+++ x25fix/net/lapb/lapb_in.c Mon Nov 6 21:29:15 2000
@@ -12,6 +12,7 @@
* History
* LAPB 001 Jonathan Naulor Started Coding
* LAPB 002 Jonathan Naylor New timer architecture.
+ * 2000-10-29 Henner Eisen lapb_data_indication() return status.
*/
#include <linux/config.h>
@@ -464,8 +465,21 @@
lapb_check_iframes_acked(lapb, frame->nr);
}
if (frame->ns == lapb->vr) {
+ int cn;
+ cn = lapb_data_indication(lapb, skb);
+ queued = 1;
+ /*
+ * If upper layer has dropped the frame, we
+ * basically ignore any further protocol
+ * processing. This will cause the peer
+ * to re-transmit the frame later like
+ * a frame lost on the wire.
+ */
+ if(cn == NET_RX_DROP){
+ printk(KERN_DEBUG "LAPB: rx
congestion\n");
+ break;
+ }
lapb->vr = (lapb->vr + 1) % modulus;
- queued = lapb_data_indication(lapb, skb);
lapb->condition &= ~LAPB_REJECT_CONDITION;
if (frame->pf) {
lapb_enquiry_response(lapb);
|