netdev
[Top] [All Lists]

Re: serious netpoll bug w/NAPI

To: Matt Mackall <mpm@xxxxxxxxxxx>
Subject: Re: serious netpoll bug w/NAPI
From: Jeff Moyer <jmoyer@xxxxxxxxxx>
Date: Wed, 16 Feb 2005 14:26:37 -0500
Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>, netdev@xxxxxxxxxxx
In-reply-to: <20050216050722.GC3358@waste.org>
References: <20050208201634.03074349.davem@davemloft.net> <20050209183219.GA2366@waste.org> <20050209164658.409f8950.davem@davemloft.net> <20050210011104.GF2366@waste.org> <16914.31886.665975.522710@segfault.boston.redhat.com> <20050216050722.GC3358@waste.org>
Reply-to: jmoyer@xxxxxxxxxx
Sender: netdev-bounce@xxxxxxxxxxx
==> Regarding Re: serious netpoll bug w/NAPI; Matt Mackall <mpm@xxxxxxxxxxx> 
adds:

mpm> On Tue, Feb 15, 2005 at 05:49:50PM -0500, Jeff Moyer wrote:
>> ==> Regarding Re: serious netpoll bug w/NAPI; Matt Mackall <mpm@xxxxxxxxxxx> 
>> adds:
>> 
>> Sorry, Matt, I'm just now getting to this.
>> 
mpm> On Wed, Feb 09, 2005 at 04:46:58PM -0800, David S. Miller wrote:
>> >> On Wed, 9 Feb 2005 10:32:19 -0800 Matt Mackall <mpm@xxxxxxxxxxx> wrote:
>> >> 
>> >> > On closer inspection, there's a couple other related failure cases >
>> >> with the new ->poll logic in netpoll. I'm afraid it looks like >
>> >> CONFIG_NETPOLL will need to guard ->poll() with a per-device spinlock >
>> >> on netpoll-enabled devices.
>> >> > 
>> >> > This will mean putting a pointer to struct netpoll in struct >
>> >> net_device (which I should have done in the first place) and will take >
>> >> a few patches to sort out.
>> >> 
>> >> Will this ->poll() guarding lock be acquired only in the netpoll code or
>> >> system-wide?  If the latter, this introduced an incredible performance
>> >> regression for devices using the LLTX locking scheme (ie. the most
>> >> important high-performance gigabit drivers in the tree use this).
>> 
mpm> The lock will only be taken in net_rx_action iff netpoll is enabled
mpm> for the given device. Lock contention should be basically nil.
>> 
mpm> Here's my current patch (on top of -mm), which I'm struggling to find
mpm> an appropriate test box for (my dual box with NAPI got pressed into
mpm> service as a web server a couple weeks ago). I've attached the other
mpm> two patches it relies on as well.
>> 
mpm> --------------
>> 
mpm> Introduce a per-client poll lock and flag. The lock assures we never
mpm> have more than one caller in dev->poll(). The flag provides recursion
mpm> avoidance on UP where the lock disappears.
>> 
>> ,----
>> |  /*
>> | - * Check whether delayed processing was scheduled for our current CPU,
>> | - * and then manually invoke NAPI polling to pump data off the card.
>> | + * Check whether delayed processing was scheduled for our NIC. If so,
>> | + * we attempt to grab the poll lock and use ->poll() to pump the card.
>> | + * If this fails, either we've recursed in ->poll() or it's already
>> | + * running on another CPU.
>> | + *
>> | + * Note: we don't mask interrupts with this lock because we're using
>> | + * trylock here and interrupts are already disabled in the softirq
>> | + * case. Further, we test the poll_flag to avoid recursion on UP
>> | + * systems where the lock doesn't exist.
>> |   *
>> |   * In cases where there is bi-directional communications, reading only
>> |   * one message at a time can lead to packets being dropped by the
>> | @@ -74,13 +80,9 @@
>> |  static void poll_napi(struct netpoll *np)
>> |  {
>> |    int budget = 16;
>> | -  unsigned long flags;
>> | -  struct softnet_data *queue;
>> |  
>> | -  spin_lock_irqsave(&netpoll_poll_lock, flags);
>> | -  queue = &__get_cpu_var(softnet_data);
>> |    if (test_bit(__LINK_STATE_RX_SCHED, &np->dev->state) &&
>> | -      !list_empty(&queue->poll_list)) {
>> | +      !np->poll_flag && spin_trylock(&np->poll_lock)) {
>> |            np->rx_flags |= NETPOLL_RX_DROP;
>> |            atomic_inc(&trapped);
>> |  
>> | @@ -88,8 +90,8 @@
>> |  
>> |            atomic_dec(&trapped);
>> |            np->rx_flags &= ~NETPOLL_RX_DROP;
>> | +          spin_unlock(&np->poll_lock);
>> |    }
>> | -  spin_unlock_irqrestore(&netpoll_poll_lock, flags);
>> |  }
>> 
>> Okay, I've only taken a quick glance at this, but I don't quite understand
>> why it's safe to take out the check for the per-cpu queue.  Realize that an
>> interrupt may have been serviced on another processor, and a NAPI poll may
>> have been scheduled there.

mpm> Because dev->np->poll_lock now serializes all access to ->poll (when
mpm> netpoll is enabled on said device).

Ahh, yes.  This does look like the right approach.  I'll see if I can
reproduce and test this here.

-Jeff

<Prev in Thread] Current Thread [Next in Thread>