|To:||Andrew Morton <akpm@xxxxxxxx>|
|Subject:||Re: Fw: [Bugme-new] [Bug 4628] New: Test server hang while running rhr (network) test on RHEL4 with kernel 2.6.12-rc1-mm4|
|From:||Jian Jun He <hejianj@xxxxxxxxxx>|
|Date:||Fri, 27 May 2005 14:18:07 +0800|
|Cc:||anton@xxxxxxxxx, Dang En Ren <rende@xxxxxxxxxx>, ganesh.venkatesan@xxxxxxxxx, ganesh.venkatesan@xxxxxxxxx, herbert@xxxxxxxxxxxxxxxxxxx, jesse.brandeburg@xxxxxxxxx, jgarzik@xxxxxxxxx, Jia Sen Wang <wangjs@xxxxxxxxxx>, john.ronciak@xxxxxxxxx, Lei CDL Wang <cdlwangl@xxxxxxxxxx>, linuxppc64-dev@xxxxxxxxxxxxxxxxxxxxxxxxxx, netdev@xxxxxxxxxxx|
Jian Jun He <hejianj@xxxxxxxxxx> wrote:
> 2. Download rhr2-rhel4-1.0-14a.noarch.rpm from rhn.redhat.com and install
> it on the test machine.
> 3. Configure and run the rhr test via invoking redhat-ready.
This is the problematic bit.
- Please provide a full URL which can be used to obtain rhr.
rhn.redhat.com is subscription-based.
- Please describe the hardware setup - surely the test requires at least
two machines. How are they configured?
- Provide an exact transcript of the commands which are to be used. Is
redhat-ready invoked with no arguments?
All that being said, we already have a quite specific diagnosis via code
inspection, from Herbert:
Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> wrote:
> Andrew Morton <akpm@xxxxxxxx> wrote:
> > Might be a bug in the e100 driver, might not be.
> > I assume this is the
> > BUG_ON(skb->list != NULL);
> It certainly is a bug in e100.
> e100_tx_timeout -> e100_down -> e100_rx_clean_list
> is racing against
> e100_poll -> e100_rx_clean -> e100_rx_indicate
> e100_rx_clean/e100_rx_indicate takes an skb off the RX ring and
> while it's being processed e100_rx_clean_list comes along and
> frees it.
> From a quick check similar problems may exist in other drivers that
> have lockless ->poll() functions with RX rings.
Do the e100 maintainers agree with this diagnosis? If so then more testing
isn't required at this stage - the next step is to fix the above bug, no?
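
To make the ownership problem in Herbert's description concrete, here is a
small userspace model of it. This is only a sketch and not the e100 code or
a proposed patch: struct buf, struct rx_ring, ring_init(), ring_take() and
ring_clean_list() are hypothetical names invented for the illustration, and
the atomic exchange is just one way of making sure a buffer has exactly one
owner. A real driver fix might equally well stop polling before the ring is
torn down.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define RING_SIZE 4

/* Hypothetical stand-in for an skb; not the kernel structure. */
struct buf {
	int len;
};

/* Hypothetical RX ring: a NULL slot means the ring no longer owns a buffer. */
struct rx_ring {
	_Atomic(struct buf *) slot[RING_SIZE];
};

/* Fill every slot with a freshly allocated buffer. */
void ring_init(struct rx_ring *r)
{
	for (int i = 0; i < RING_SIZE; i++) {
		struct buf *b = calloc(1, sizeof(*b));

		if (!b)
			abort();
		atomic_init(&r->slot[i], b);
	}
}

/*
 * Poll path: detach the buffer from the ring in one atomic step.
 * Whoever gets a non-NULL pointer back is the sole owner and must free it.
 */
struct buf *ring_take(struct rx_ring *r, int i)
{
	assert(i >= 0 && i < RING_SIZE);
	return atomic_exchange(&r->slot[i], (struct buf *)NULL);
}

/*
 * Teardown path: free only the buffers the ring still owns.
 * A buffer already claimed by ring_take() is skipped, so it cannot be
 * freed underneath the path that is still processing it.
 * Returns the number of buffers freed.
 */
int ring_clean_list(struct rx_ring *r)
{
	int freed = 0;

	for (int i = 0; i < RING_SIZE; i++) {
		struct buf *b = atomic_exchange(&r->slot[i], (struct buf *)NULL);

		if (b) {
			free(b);
			freed++;
		}
	}
	return freed;
}
```

In this model the bug would correspond to ring_clean_list() freeing a slot
that the poll path has read but not yet detached. With the exchange, the
buffer is handed to one path or the other, never both.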