[Top] [All Lists]

Re: [IPV4/IPV6] Keep wmem accounting separate in ip*_push_pending_frames

To: herbert@xxxxxxxxxxxxxxxxxxx
Subject: Re: [IPV4/IPV6] Keep wmem accounting separate in ip*_push_pending_frames
From: "David S. Miller" <davem@xxxxxxxxxxxxx>
Date: Sun, 15 May 2005 15:24:41 -0700 (PDT)
Cc: johnpol@xxxxxxxxxxx, netdev@xxxxxxxxxxx
In-reply-to: <20050515114121.GA4830@xxxxxxxxxxxxxxxxxxx>
References: <E1DXE3h-0002jR-00@xxxxxxxxxxxxxxxxxxxxxxxx> <20050515104016.GA24344@xxxxxxxxxxxxxxxxxxx> <20050515114121.GA4830@xxxxxxxxxxxxxxxxxxx>
Sender: netdev-bounce@xxxxxxxxxxx
From: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [IPV4/IPV6] Keep wmem accounting separate in 
Date: Sun, 15 May 2005 21:41:21 +1000

> So let's go the other way and make this an invariant:
>       For any skb on a frag_list, skb->sk must be NULL.
> That is, the socket ownership always belongs to the head skb.
> It turns out that the implementation is actually pretty simple.

Yes, I was just about to inform you that when SKBs are
free'd up, only the head SKB's sk gets the destructor run
on it.

Please refer to this patch from July of 2004:

[IPV4]: Fix multicast socket hangs.

If a multicast packet gets looped back, the sending
socket can hang if a local read just sits and does
not empty its receive queue.

The problem is that when an SKB clone is freed up,
the destructor is only invoked for the head SKB when
there is a fraglist (which is created for fragmentation).

The solution is to account the fragment list SKB lengths
in the top-level head SKB, then it all works out.

Signed-off-by: David S. Miller <davem@xxxxxxxxxx>

--- 1/net/ipv4/ip_output.c      2005-05-15 15:23:19 -07:00
+++ 2/net/ipv4/ip_output.c      2005-05-15 15:23:19 -07:00
@@ -498,10 +498,6 @@
                            skb_headroom(frag) < hlen)
                            goto slow_path;
-                       /* Correct socket ownership. */
-                       if (frag->sk == NULL && skb->sk)
-                               goto slow_path;
                        /* Partially cloned skb? */
                        if (skb_shared(frag))
                                goto slow_path;
@@ -1113,12 +1109,10 @@
                tail_skb = &(tmp_skb->next);
                skb->len += tmp_skb->len;
                skb->data_len += tmp_skb->len;
-#if 0 /* Logically correct, but useless work, ip_fragment() will have to undo 
                skb->truesize += tmp_skb->truesize;
                tmp_skb->destructor = NULL;
                tmp_skb->sk = NULL;
        /* Unless user demanded real pmtu discovery (IP_PMTUDISC_DO), we allow

<Prev in Thread] Current Thread [Next in Thread>