Re: 3.1-rc4: spectacular kernel errors / filesystem crash

To: Justin Piszcz <jpiszcz@xxxxxxxxxxxxxxx>
Subject: Re: 3.1-rc4: spectacular kernel errors / filesystem crash
From: Jon Mason <mason@xxxxxxxx>
Date: Tue, 13 Sep 2011 10:35:29 -0500
Cc: Eric Dumazet <eric.dumazet@xxxxxxxxx>, Jesse Brandeburg <jesse.brandeburg@xxxxxxxxx>, Alan Piszcz <ap@xxxxxxxxxxxxx>, NetDEV list <netdev@xxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
In-reply-to: <alpine.DEB.2.02.1109130722210.21380@xxxxxxxxxxxxxxxx>
References: <alpine.DEB.2.02.1109110511250.8626@xxxxxxxxxxxxxxxx> <CAEuXFEzs1f7n5taYzupux3AtKmRcY4P0m7yjkUQA8aLyS8eujw@xxxxxxxxxxxxxx> <1315886706.2556.11.camel@edumazet-laptop> <alpine.DEB.2.02.1109130722210.21380@xxxxxxxxxxxxxxxx>
On Tue, Sep 13, 2011 at 9:54 AM, Justin Piszcz <jpiszcz@xxxxxxxxxxxxxxx> wrote:
>
>
> On Tue, 13 Sep 2011, Eric Dumazet wrote:
>
>> Please Justin make sure you pulled commit
>> commit ed2888e906b56769b4ffabb9c577190438aa68b8
>> Author: Jon Mason <mason@xxxxxxxx>
>> Date:   Thu Sep 8 16:41:18 2011 -0500
>>
>>   PCI: Remove MRRS modification from MPS setting code
>>
>>   Modifying the Maximum Read Request Size to 0 (value of 128Bytes) has
>>   massive negative ramifications on some devices.  Without knowing which
>>   devices have this issue, do not modify from the default value when
>>   walking the PCI-E bus in pcie_bus_safe mode.  Also, make pcie_bus_safe
>>   the default procedure.
>>
>>   Tested-by: Sven Schnelle <svens@xxxxxxxxxxxxxx>
>>   Tested-by: Simon Kirby <sim@xxxxxxxxxx>
>>   Tested-by: Stephen M. Cameron <scameron@xxxxxxxxxxxxxxxxxx>
>>   Reported-and-tested-by: Eric Dumazet <eric.dumazet@xxxxxxxxx>
>>   Reported-and-tested-by: Niels Ole Salscheider <niels_ole@salscheider-online.
>>   References: https://bugzilla.kernel.org/show_bug.cgi?id=42162
>>   Signed-off-by: Jon Mason <mason@xxxxxxxx>
>>   Acked-by: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>
>>   Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>
> Hello,
>
> I found this commit here:
> http://permalink.gmane.org/gmane.linux.kernel.pci/11700

That is an early version of the patch.  This is the one you want:
https://github.com/torvalds/linux/commit/ed2888e906b56769b4ffabb9c577190438aa68b8

It appears that this patch didn't make it to the lkml or linux-pci lists
because kernel.org DNS was down when it was sent.
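
For reference, the "0 (value of 128Bytes)" in the commit message refers to
how the Max_Read_Request_Size field is encoded in the PCIe Device Control
register.  Below is a minimal sketch of the decoding, assuming the standard
field layout (bits 14:12 for MRRS, bits 7:5 for MPS).  The macro and
function names are illustrative only and are not the kernel's own.

```c
#include <stdint.h>

/* PCI Express Device Control register fields (standard layout, assumed). */
#define DEVCTL_MPS_MASK   0x00e0u  /* Max_Payload_Size, bits 7:5 */
#define DEVCTL_MPS_SHIFT  5
#define DEVCTL_MRRS_MASK  0x7000u  /* Max_Read_Request_Size, bits 14:12 */
#define DEVCTL_MRRS_SHIFT 12

/*
 * Both fields share one encoding: size in bytes = 128 << field value.
 * A field value of 0 therefore means 128 bytes and 5 means 4096 bytes.
 * Field values 6 and 7 are reserved; this sketch does not reject them.
 */
static unsigned int devctl_mps_bytes(uint16_t devctl)
{
    return 128u << ((devctl & DEVCTL_MPS_MASK) >> DEVCTL_MPS_SHIFT);
}

static unsigned int devctl_mrrs_bytes(uint16_t devctl)
{
    return 128u << ((devctl & DEVCTL_MRRS_MASK) >> DEVCTL_MRRS_SHIFT);
}
```

Given a raw DevCtl value (for example as printed by lspci -vv or read from
config space), devctl_mrrs_bytes() gives the read request size that the
earlier code was forcing down to 128 bytes.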

Thanks,
Jon

>
> Applied:
> # patch -p1 < ../ed2888e906b56769b4ffabb9c577190438aa68b8.txt
> patching file drivers/pci/probe.c
>
> I will update this thread if the problem recurs.  Can someone also please
> advise which DEBUG options I should have enabled to catch further SLAB/RCU
> issues?
>
> So far, I have the following enabled:
>
> CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
> CONFIG_HAVE_DMA_API_DEBUG=y
> CONFIG_X86_DEBUGCTLMSR=y
> CONFIG_DEBUG_FS=y
> CONFIG_DEBUG_KERNEL=y
> CONFIG_DEBUG_SLAB=y
> CONFIG_DEBUG_SLAB_LEAK=y
> CONFIG_DEBUG_KMEMLEAK=y
> CONFIG_DEBUG_STACK_USAGE=y
> CONFIG_DEBUG_BUGVERBOSE=y
> CONFIG_DEBUG_INFO=y
> CONFIG_DEBUG_VM=y
> CONFIG_DEBUG_VIRTUAL=y
> CONFIG_DEBUG_MEMORY_INIT=y
> CONFIG_DEBUG_PER_CPU_MAPS=y
> CONFIG_DEBUG_PAGEALLOC=y
> CONFIG_DEBUG_STACKOVERFLOW=y
> CONFIG_DEBUG_RODATA=y
>
> Thanks,
>
> Justin.