Re: 3.1-rc4: spectacular kernel errors / filesystem crash

To: Eric Dumazet <eric.dumazet@xxxxxxxxx>
Subject: Re: 3.1-rc4: spectacular kernel errors / filesystem crash
From: Justin Piszcz <jpiszcz@xxxxxxxxxxxxxxx>
Date: Tue, 13 Sep 2011 10:54:13 -0400 (EDT)
Cc: Jesse Brandeburg <jesse.brandeburg@xxxxxxxxx>, Alan Piszcz <ap@xxxxxxxxxxxxx>, NetDEV list <netdev@xxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
In-reply-to: <1315886706.2556.11.camel@edumazet-laptop>
References: <alpine.DEB.2.02.1109110511250.8626@xxxxxxxxxxxxxxxx> <CAEuXFEzs1f7n5taYzupux3AtKmRcY4P0m7yjkUQA8aLyS8eujw@xxxxxxxxxxxxxx> <1315886706.2556.11.camel@edumazet-laptop>
User-agent: Alpine 2.02 (DEB 1266 2009-07-14)


On Tue, 13 Sep 2011, Eric Dumazet wrote:

> Please Justin make sure you pulled commit
> commit ed2888e906b56769b4ffabb9c577190438aa68b8
> Author: Jon Mason <mason@xxxxxxxx>
> Date:   Thu Sep 8 16:41:18 2011 -0500
>
>    PCI: Remove MRRS modification from MPS setting code
>
>    Modifying the Maximum Read Request Size to 0 (value of 128Bytes) has
>    massive negative ramifications on some devices.  Without knowing which
>    devices have this issue, do not modify from the default value when
>    walking the PCI-E bus in pcie_bus_safe mode.  Also, make pcie_bus_safe
>    the default procedure.
>
>    Tested-by: Sven Schnelle <svens@xxxxxxxxxxxxxx>
>    Tested-by: Simon Kirby <sim@xxxxxxxxxx>
>    Tested-by: Stephen M. Cameron <scameron@xxxxxxxxxxxxxxxxxx>
>    Reported-and-tested-by: Eric Dumazet <eric.dumazet@xxxxxxxxx>
>    Reported-and-tested-by: Niels Ole Salscheider <niels_ole@salscheider-online.
>    References: https://bugzilla.kernel.org/show_bug.cgi?id=42162
>    Signed-off-by: Jon Mason <mason@xxxxxxxx>
>    Acked-by: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>
>    Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>

Hello,

I found this commit here:
http://permalink.gmane.org/gmane.linux.kernel.pci/11700

Applied:
# patch -p1 < ../ed2888e906b56769b4ffabb9c577190438aa68b8.txt
patching file drivers/pci/probe.c

I will update this thread if the problem recurs. Can someone also please
advise which DEBUG options I should enable to catch further SLAB/RCU issues?

So far, I have the following enabled:

CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SLAB_LEAK=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_STACK_USAGE=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_VM=y
CONFIG_DEBUG_VIRTUAL=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_DEBUG_PER_CPU_MAPS=y
CONFIG_DEBUG_PAGEALLOC=y
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_RODATA=y
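
For anyone who wants to compare against their own build, a list like the one above can be pulled out of a kernel config with a small shell helper. This is only a sketch: the function name list_debug_opts is hypothetical, and it assumes the config is an uncompressed, plain-text .config file in the usual CONFIG_FOO=y format.

```shell
# Hypothetical helper: print every option whose name contains DEBUG and
# that is built in (=y) in the given kernel config file.
# Assumes a plain-text .config; lines such as "# CONFIG_FOO is not set"
# and modular (=m) options are deliberately skipped.
list_debug_opts() {
    grep -E '^CONFIG_[A-Z0-9_]*DEBUG[A-Z0-9_]*=y$' "$1" | LC_ALL=C sort
}
```

It would be called as `list_debug_opts .config` from the top of the kernel source tree. The output is sorted, so it will not match the order in which the options appear in the config file.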

Thanks,

Justin.
