xfs
[Top] [All Lists]

Re: generic/320 triggers "list_add attempted on force-poisoned entry" wa

To: Eryu Guan <eguan@xxxxxxxxxx>
Subject: Re: generic/320 triggers "list_add attempted on force-poisoned entry" warning on XFS
From: Dan Williams <dan.j.williams@xxxxxxxxx>
Date: Tue, 1 Mar 2016 08:27:58 -0800
Cc: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>, XFS Developers <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel-com.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc; bh=zGf/bO+OZep6fANJEBhq9QM1PMIEIOFgHoQQd3CD+8w=; b=ioJTXgCeMoXWCfXXkp/6HlvXxlzImtC8Ja39g8xhjz99tkCJkmPpRtNR+Vx/uWnwCn kU9PcsN+ape2ePMtoFiXOnXRcb9CDLlQxZDe4Fu/+ASg4VlJOoXpalzyRnQxpkdXq/xW fNXgNUeJpWAllf6AuA+0tPCu6xdPcZ1IGo8wbMyPr06ebUnjMU/gjaDr+aZTEVsf9TCE UyfgOts9Bar9ysNM/Byuttuo62X2rqotmpwJVDaPfibXz7iq8dgFOftaRn2xcISicpDq /y1vvTYoJmoucyF8/eD+2cPX8g2ys1IYz6pPGKXXd1Z7j4fmIMeNa3pm48bAhMXJXS2H 4TQg==
In-reply-to: <20160301080057.GL11419@xxxxxxxxxxxxxxxxxxxxxxxx>
References: <20160227130256.GJ11419@xxxxxxxxxxxxxxxxxxxxxxxx> <CAPcyv4gZKzUsch2RQm-QQWuUHtmtQerz2fstCwbwaMjyzYvwrA@xxxxxxxxxxxxxx> <20160228053158.GK11419@xxxxxxxxxxxxxxxxxxxxxxxx> <CAPcyv4jNhCkStvwtS54FjJVWAfvMv-gEDjHWq3V7_YTOmk+W2g@xxxxxxxxxxxxxx> <20160301080057.GL11419@xxxxxxxxxxxxxxxxxxxxxxxx>
On Tue, Mar 1, 2016 at 12:00 AM, Eryu Guan <eguan@xxxxxxxxxx> wrote:
> On Mon, Feb 29, 2016 at 10:22:06AM -0800, Dan Williams wrote:
>> On Sat, Feb 27, 2016 at 9:31 PM, Eryu Guan <eguan@xxxxxxxxxx> wrote:
>> > On Sat, Feb 27, 2016 at 12:10:51PM -0800, Dan Williams wrote:
>> >> On Sat, Feb 27, 2016 at 5:02 AM, Eryu Guan <eguan@xxxxxxxxxx> wrote:
>> >> > Hi,
>> >> >
>> >> > Starting from 4.5-rc1 kernel, I sometimes see generic/320 triggers
>> >> > "list_add attempted on force-poisoned entry" warnings on XFS, test hosts
>> >> > are arm64/ppc64/ppc64le, haven't seen it on x86_64 hosts.
>> >>
>> >> Hmm, this triggers when a list_head has ->next or ->prev pointing at
>> >> the address of force_poison which is only defined in lib/list_debug.c.
>> >> The only call site that uses list_force_poison() is in
>> >> devm_memremap_pages().  That currently depends on CONFIG_ZONE_DEVICE
>> >> which in turn depends on X86_64.
>> >>
>> >> So, this appears to be a false positive and the address of
>> >> force_poison is somehow ending up on the stack by accident as that is
>> >> the random value being passed in from __down_common:
>> >>
>> >>     struct semaphore_waiter waiter;
>> >>
>> >>     list_add_tail(&waiter.list, &sem->wait_list);
>> >>
>> >> So, I think we need a more unique poison value that should never
>> >> appear on the stack:
>> >
>> > Unfortunately I can still see the warning after applying this test patch.
>> >
>> > Then I added debug code to print the pointer value and re-ran the test.
>> > All five failures printed the same pointer value, failed in the same
>> > pattern:
>> >
>> > list_add attempted on force-poisoned entry(0000000000000500), new->next = 
>> > c00000000136bc00, new->prev = 0000000000000500
>> >
>>
>> I think this means that no matter what we do the stack will pick up
>> these poison values unless the list_head is explicitly initialized.
>> Something like the following:
>
> Umm, it's still reproducible... but seems harder than before, it took me
> 200+ iterations to hit (less than 10 iterations in previous runs)

Similar fix, just in rwsem_down_read_failed() this time:

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index a4d4de05b2d1..68678a20da52 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -214,8 +214,10 @@ __visible
struct rw_semaphore __sched *rwsem_down_read_failed(struct rw_semaphore *sem)
{
       long count, adjustment = -RWSEM_ACTIVE_READ_BIAS;
-       struct rwsem_waiter waiter;
       struct task_struct *tsk = current;
+       struct rwsem_waiter waiter = {
+               .list = LIST_HEAD_INIT(waiter.list),
+       };

       /* set up my own style of waitqueue */
       waiter.task = tsk;

<Prev in Thread] Current Thread [Next in Thread>