netdev

Re: [PATCH] Deadlock in af_packet/packet_rcv

To: Olaf Kirch <okir@xxxxxxx>
Subject: Re: [PATCH] Deadlock in af_packet/packet_rcv
From: Tommy Christensen <tommy.christensen@xxxxxxxxx>
Date: Tue, 30 Nov 2004 12:31:50 +0100
Cc: netdev@xxxxxxxxxxx
In-reply-to: <20041130110110.GD16970@xxxxxxx>
References: <20041125205503.GA18083@xxxxxxx> <41AC3E2F.2030003@xxxxxxxxx> <20041130110110.GD16970@xxxxxxx>
Sender: netdev-bounce@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040803
Olaf Kirch wrote:
On Tue, Nov 30, 2004 at 10:32:31AM +0100, Tommy Christensen wrote:

An interrupt handler shouldn't call dev_queue_xmit() directly. If
this indeed happens, it needs to be fixed. Which handler is this?


The call path according to KDB goes like this:

        application does sendmsg()
        udp_push_pending_frames
        ip_push_pending_frames
        ip_output
        dev_queue_xmit
        dev_queue_xmit_nit
                calls ptype->func(skb2, skb->dev, ptype), where func=packet_rcv
        packet_rcv (and this runs with BHs enabled)
                take the &sk->sk_receive_queue spinlock

        *** timer interrupt
        net_tx_action
                take the dev->queue_lock spin lock
        qdisc_run
        qdisc_restart
        dev_queue_xmit_nit
                as above
        packet_rcv
                blocks on the &sk->sk_receive_queue spinlock

Before lockless-loopback this never triggered because we did a
spin_lock_bh(&dev->xmit_lock) around the call to dev_queue_xmit_nit.

Olaf

Ahh, back-traces are *so* nice to have.

I still don't agree with the conclusion, though. The spin_lock_bh()
is changed to a local_bh_disable() and an optional spin_lock().
That should not lead to what you are seeing!

I think perhaps your 'BH disabled count' has been corrupted.

There's a fix for that in 2.6.10-rc2.

-Tommy
