Re: Lockup with 2.6.9-ac15 related to netconsole

To: Patrick McHardy <kaber@xxxxxxxxx>
Subject: Re: Lockup with 2.6.9-ac15 related to netconsole
From: Francois Romieu <romieu@xxxxxxxxxxxxx>
Date: Wed, 22 Dec 2004 13:39:41 +0100
Cc: Matt Mackall <mpm@xxxxxxxxxxx>, Mark Broadbent <markb@xxxxxxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, netdev@xxxxxxxxxxx
Sender: netdev-bounce@xxxxxxxxxxx
User-agent: Mutt/1.4.1i
Patrick McHardy <kaber@xxxxxxxxx> :
> at least the queued messages ordered. But you need to grab
> dev->queue_lock, otherwise you risk corrupting qdisc internal data.
> You should probably also deal with the noqueue-qdisc, which doesn't
> have an enqueue function. So it should look something like this:

If I am not mistaken, a failed spin_trylock combined with the test on
xmit_lock_owner implies that it is safe to handle the queue directly:
it means that qdisc_run() has been interrupted on the current cpu.
The other paths seem fine as well. A counter-example is welcome
(no joke).
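
To make the reasoning above concrete, here is a minimal user-space sketch of the decision being described. It is not kernel code: struct model_dev, model_trylock() and choose_path() are hypothetical names invented for illustration, and the lock is modelled as a plain flag plus an owner field standing in for dev->xmit_lock and dev->xmit_lock_owner.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the device's xmit_lock state. */
struct model_dev {
    bool xmit_locked;      /* models dev->xmit_lock being held */
    int  xmit_lock_owner;  /* models dev->xmit_lock_owner, -1 if none */
};

/* The three outcomes a netconsole-style transmit attempt can reach. */
enum xmit_path {
    XMIT_DIRECT,  /* lock acquired: hand the packet to the driver */
    XMIT_QUEUE,   /* lock held by this cpu: we interrupted the holder */
    XMIT_RETRY    /* lock held by another cpu: keep trying */
};

/* Models spin_trylock(): take the lock if it is free, never block. */
static bool model_trylock(struct model_dev *dev, int cpu)
{
    if (dev->xmit_locked)
        return false;
    dev->xmit_locked = true;
    dev->xmit_lock_owner = cpu;
    return true;
}

/* Classify one attempt made from the given cpu. */
static enum xmit_path choose_path(struct model_dev *dev, int cpu)
{
    if (model_trylock(dev, cpu))
        return XMIT_DIRECT;
    /* trylock failed and we are the recorded owner: the holder is the
     * context we interrupted, so it cannot make progress until we
     * return. */
    if (dev->xmit_lock_owner == cpu)
        return XMIT_QUEUE;
    return XMIT_RETRY;
}
```

Only the XMIT_QUEUE case corresponds to the situation argued about here; whether touching the queue in that case is really safe is exactly the open question of this thread.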

Of course the patch is completely ugly and violates any layering
principle one could think of. It was not submitted for inclusion :o)

> while (!spin_trylock(&np->dev->xmit_lock)) {
>       if (np->dev->xmit_lock_owner == smp_processor_id()) {
>               struct Qdisc *q;
>               rcu_read_lock();
>               q = rcu_dereference(dev->qdisc);
>               if (q->enqueue) {
>                       spin_lock(&dev->queue_lock);

I'd expect it to deadlock if dev_queue_xmit -> qdisc_run is interrupted
on the current cpu and a printk is issued, since dev->queue_lock will
already be held by the interrupted path.
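
The hazard can be sketched in user space as follows, under the assumption that the interrupted context holds the queue lock on the same cpu that then issues the printk. The names queue_lock_model, queue_model_trylock() and blocking_lock_would_hang() are hypothetical; a real spinlock does not necessarily record its holder, the field exists here only to make the condition visible.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical model of a spinlock that remembers which cpu holds it. */
struct queue_lock_model {
    int holder;  /* cpu id, or NO_OWNER when the lock is free */
};

/* Non-blocking acquire: succeeds only when the lock is free. */
static bool queue_model_trylock(struct queue_lock_model *l, int cpu)
{
    if (l->holder != NO_OWNER)
        return false;
    l->holder = cpu;
    return true;
}

/* A blocking spin_lock() issued from a context that interrupted the
 * holder on the same cpu can never succeed: the holder cannot run
 * again, and so cannot unlock, until the spinning context returns. */
static bool blocking_lock_would_hang(const struct queue_lock_model *l,
                                     int cpu)
{
    return l->holder == cpu;
}
```

In this model a non-blocking attempt simply reports failure where a blocking one would spin forever, which is the distinction the objection rests on.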

