
Re: serious netpoll bug w/NAPI

To: "David S. Miller" <davem@xxxxxxxxxxxxx>, Jeff Moyer <jmoyer@xxxxxxxxxx>
Subject: Re: serious netpoll bug w/NAPI
From: Matt Mackall <mpm@xxxxxxxxxxx>
Date: Wed, 9 Feb 2005 10:32:19 -0800
Cc: netdev@xxxxxxxxxxx
In-reply-to: <20050208201634.03074349.davem@davemloft.net>
References: <20050208201634.03074349.davem@davemloft.net>
Sender: netdev-bounce@xxxxxxxxxxx
User-agent: Mutt/1.5.6+20040907i

On Tue, Feb 08, 2005 at 08:16:34PM -0800, David S. Miller wrote:
> 
> Consider a NAPI device currently executing its poll function,
> pushing SKBs into the networking stack.
> 
> Some of these will generate response packets etc.
> 
> If for some reason a printk() is generated by the packet processing
> and:
> 
> 1) the netconsole output device is the same as the NAPI device
>    processing packets
> 
> 2) netif_queue_stopped() is true because the tx queue is full
> 
> the netpoll code will recurse back into the driver's poll function.
> This is incredibly illegal and results in all kinds of driver state
> corruption.  ->poll() must execute only once at a time.

On closer inspection, there are a couple of other related failure cases
with the new ->poll logic in netpoll. I'm afraid it looks like
CONFIG_NETPOLL will need to guard ->poll() with a per-device spinlock
on netpoll-enabled devices.

This will mean putting a pointer to struct netpoll in struct
net_device (which I should have done in the first place) and will take
a few patches to sort out.
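
To make the idea concrete, here is a rough user-space sketch of what guarding
->poll() through a struct netpoll pointer in struct net_device might look
like. It is not the eventual patch. Everything in it is an assumption made for
illustration: the field names (np, poll_lock, poll_runs, reentry_blocked), the
helpers guarded_poll() and demo_poll(), and the use of a pthread mutex as a
stand-in for a kernel spinlock. The sketch uses a trylock rather than a
blocking lock, because a plain spinlock taken again on the same CPU would
deadlock in exactly the recursion case described above; whether the real fix
should skip, defer, or track the owner is left open.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct net_device;

/* Hypothetical: only the state this sketch needs. */
struct netpoll {
    pthread_mutex_t poll_lock;   /* stand-in for a per-device spinlock */
};

/* Hypothetical, heavily trimmed version of the device structure. */
struct net_device {
    struct netpoll *np;          /* NULL when netpoll is not attached */
    int (*poll)(struct net_device *dev, int *budget);
    int poll_runs;               /* demo counter: times ->poll() body ran */
    int reentry_blocked;         /* demo counter: re-entries refused */
};

/* Attach netpoll state to a device and initialise its lock. */
static void netpoll_attach(struct net_device *dev, struct netpoll *np)
{
    pthread_mutex_init(&np->poll_lock, NULL);
    dev->np = np;
}

/*
 * Run dev->poll() at most once at a time on netpoll-enabled devices.
 * Returns 1 if ->poll() ran, 0 if it was already running and was skipped.
 */
static int guarded_poll(struct net_device *dev, int *budget)
{
    struct netpoll *np = dev->np;

    assert(dev->poll != NULL);

    if (np == NULL) {            /* not netpoll-enabled: nothing to guard */
        dev->poll(dev, budget);
        return 1;
    }

    /* A trylock fails instead of spinning if ->poll() is already active. */
    if (pthread_mutex_trylock(&np->poll_lock) != 0)
        return 0;

    dev->poll(dev, budget);
    pthread_mutex_unlock(&np->poll_lock);
    return 1;
}

/*
 * Demo ->poll(): simulates packet processing that triggers a printk(),
 * which netconsole would route back into netpoll on the same device.
 */
static int demo_poll(struct net_device *dev, int *budget)
{
    dev->poll_runs++;
    if (!guarded_poll(dev, budget))   /* the recursive entry */
        dev->reentry_blocked++;
    return 0;
}
```

With a device set up via netpoll_attach() and demo_poll installed as its
->poll, one call to guarded_poll() runs the poll body once and refuses the
nested call instead of re-entering the driver.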

-- 
Mathematics is the supreme nostalgia of our time.
