
Re: [PATCH] Deadlock in af_packet/packet_rcv

To: Olaf Kirch <okir@xxxxxxx>
Subject: Re: [PATCH] Deadlock in af_packet/packet_rcv
From: Tommy Christensen <tommy.christensen@xxxxxxxxx>
Date: Tue, 30 Nov 2004 12:31:50 +0100
Cc: netdev@xxxxxxxxxxx
In-reply-to: <20041130110110.GD16970@suse.de>
References: <20041125205503.GA18083@suse.de> <41AC3E2F.2030003@tpack.net> <20041130110110.GD16970@suse.de>
Sender: netdev-bounce@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040803

Olaf Kirch wrote:
On Tue, Nov 30, 2004 at 10:32:31AM +0100, Tommy Christensen wrote:

An interrupt handler shouldn't call dev_queue_xmit() directly. If
this indeed happens, it needs to be fixed. Which handler is this?


The call path according to KDB goes like this:

application does sendmsg()
  udp_push_pending_frames
    ip_push_pending_frames
      ip_output
        dev_queue_xmit
          dev_queue_xmit_nit
            calls ptype->func(skb2, skb->dev, ptype), where func=packet_rcv
              packet_rcv (and this runs with BHs enabled)
                take the &sk->sk_receive_queue spinlock
*** timer interrupt
  net_tx_action
    take the dev->queue_lock spin lock
    qdisc_run
      qdisc_restart
        dev_queue_xmit_nit
          as above
            packet_rcv
              blocks on the &sk->sk_receive_queue spinlock


Before lockless-loopback this never triggered because we did a
spin_lock_bh(&dev->xmit_lock) around the call to dev_queue_xmit_nit.

Olaf

Ahh, back-traces are *so* nice to have.

I still don't agree with the conclusion, though. The spin_lock_bh()
is changed to a local_bh_disable() and an optional spin_lock().
That should not lead to what you are seeing!

I think perhaps your 'BH disabled count' has been corrupted.

There's a fix for that in 2.6.10-rc2.
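
The disagreement above can be sketched as a toy, single-CPU model in userspace C. This is not kernel code: every name in it (cpu_model, model_bh_disable, model_xmit_path_deadlocks, and so on) is hypothetical and exists only for illustration. It assumes that a softirq runs straight away on interrupt exit when the BH-disabled count is zero, and is otherwise deferred until the count drops back to zero. The initial_bh_count parameter is likewise an assumption, used to stand in for a count that is already off by one before the transmit path starts.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model. All names here are hypothetical, not kernel APIs. */
struct cpu_model {
    int  bh_count;         /* models the BH-disabled count; 0 lets softirqs run */
    bool rxq_locked;       /* models the &sk->sk_receive_queue spinlock */
    bool softirq_pending;  /* a softirq was raised but has not run yet */
    bool deadlocked;       /* the lock was requested again while already held */
};

/* Models packet_rcv taking the receive-queue lock. */
static void model_rxq_lock(struct cpu_model *cpu)
{
    if (cpu->rxq_locked)
        cpu->deadlocked = true;   /* a real spinlock would spin forever here */
    else
        cpu->rxq_locked = true;
}

static void model_rxq_unlock(struct cpu_model *cpu)
{
    assert(cpu->rxq_locked);
    cpu->rxq_locked = false;
}

/* Models net_tx_action -> qdisc_run -> dev_queue_xmit_nit -> packet_rcv. */
static void model_softirq(struct cpu_model *cpu)
{
    cpu->softirq_pending = false;
    model_rxq_lock(cpu);
    if (!cpu->deadlocked)
        model_rxq_unlock(cpu);
}

/* Models the timer interrupt: the softirq runs only if BHs are enabled. */
static void model_timer_interrupt(struct cpu_model *cpu)
{
    cpu->softirq_pending = true;
    if (cpu->bh_count == 0)
        model_softirq(cpu);
}

static void model_bh_disable(struct cpu_model *cpu)
{
    cpu->bh_count++;
}

/* Re-enabling BHs runs any softirq that was deferred in the meantime. */
static void model_bh_enable(struct cpu_model *cpu)
{
    cpu->bh_count--;
    if (cpu->bh_count == 0 && cpu->softirq_pending)
        model_softirq(cpu);
}

/*
 * Replays the quoted trace: process context takes the receive-queue lock,
 * then a timer interrupt arrives on the same CPU.
 * Returns true if the model detects the self-deadlock.
 */
bool model_xmit_path_deadlocks(bool disable_bh, int initial_bh_count)
{
    struct cpu_model cpu = { .bh_count = initial_bh_count };

    if (disable_bh)
        model_bh_disable(&cpu);
    model_rxq_lock(&cpu);          /* packet_rcv in process context */
    model_timer_interrupt(&cpu);   /* *** timer interrupt */
    if (!cpu.deadlocked)
        model_rxq_unlock(&cpu);
    if (disable_bh)
        model_bh_enable(&cpu);
    return cpu.deadlocked;
}
```

In this model, model_xmit_path_deadlocks(false, 0) reproduces the trace: the softirq interrupts the lock holder and asks for the same lock. With model_xmit_path_deadlocks(true, 0) the softirq is deferred until after the unlock, which is why a local_bh_disable() around the call should be as good as the old spin_lock_bh(). Passing an initial count of -1 together with disable_bh = true brings the deadlock back, which is the kind of effect a corrupted count could have.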

-Tommy
