

To: Eryu Guan <eguan@xxxxxxxxxx>
Subject: Re: generic/320 triggers "list_add attempted on force-poisoned entry" warning on XFS
From: Dan Williams <dan.j.williams@xxxxxxxxx>
Date: Mon, 29 Feb 2016 10:22:06 -0800
Cc: XFS Developers <xfs@xxxxxxxxxxx>, Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
In-reply-to: <20160228053158.GK11419@xxxxxxxxxxxxxxxxxxxxxxxx>
References: <20160227130256.GJ11419@xxxxxxxxxxxxxxxxxxxxxxxx> <CAPcyv4gZKzUsch2RQm-QQWuUHtmtQerz2fstCwbwaMjyzYvwrA@xxxxxxxxxxxxxx> <20160228053158.GK11419@xxxxxxxxxxxxxxxxxxxxxxxx>
On Sat, Feb 27, 2016 at 9:31 PM, Eryu Guan <eguan@xxxxxxxxxx> wrote:
> On Sat, Feb 27, 2016 at 12:10:51PM -0800, Dan Williams wrote:
>> On Sat, Feb 27, 2016 at 5:02 AM, Eryu Guan <eguan@xxxxxxxxxx> wrote:
>> > Hi,
>> >
>> > Starting from 4.5-rc1 kernel, I sometimes see generic/320 triggers
>> > "list_add attempted on force-poisoned entry" warnings on XFS, test hosts
>> > are arm64/ppc64/ppc64le, haven't seen it on x86_64 hosts.
>>
>> Hmm, this triggers when a list_head has ->next or ->prev pointing at
>> the address of force_poison which is only defined in lib/list_debug.c.
>> The only call site that uses list_force_poison() is in
>> devm_memremap_pages().  That currently depends on CONFIG_ZONE_DEVICE
>> which in turn depends on X86_64.
>>
>> So, this appears to be a false positive and the address of
>> force_poison is somehow ending up on the stack by accident as that is
>> the random value being passed in from __down_common:
>>
>>     struct semaphore_waiter waiter;
>>
>>     list_add_tail(&waiter.list, &sem->wait_list);
>>
>> So, I think we need a more unique poison value that should never
>> appear on the stack:
>
> Unfortunately I can still see the warning after applying this test patch.
>
> Then I added debug code to print the pointer value and re-ran the test.
> All five failures printed the same pointer value, failed in the same
> pattern:
>
> list_add attempted on force-poisoned entry(0000000000000500), new->next = c00000000136bc00, new->prev = 0000000000000500
>

I think this means that no matter which poison value we pick, stale
stack contents can match it unless the list_head is explicitly
initialized before list_add_tail(). Something like the following:

diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
index b8120abe594b..39929b4e6fbb 100644
--- a/kernel/locking/semaphore.c
+++ b/kernel/locking/semaphore.c
@@ -205,7 +205,9 @@ static inline int __sched __down_common(struct semaphore *sem, long state,
                                                               long timeout)
{
       struct task_struct *task = current;
-       struct semaphore_waiter waiter;
+       struct semaphore_waiter waiter = {
+               .list = LIST_HEAD_INIT(waiter.list),
+       };

       list_add_tail(&waiter.list, &sem->wait_list);
       waiter.task = task;
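
As a rough userspace illustration of why the explicit initializer helps
(a sketch only, not kernel code): the struct layout and LIST_HEAD_INIT
mirror include/linux/list.h, while looks_force_poisoned(), struct waiter
and the two demo functions are hypothetical names made up for this
example. The stale stack contents are simulated by assigning the poison
address directly, since reading truly uninitialized memory is undefined
behaviour in portable C.

```c
#include <stdbool.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Same shape as the kernel macro: an empty list points at itself. */
#define LIST_HEAD_INIT(name) { &(name), &(name) }

/* Userspace stand-in for the poison object; its address is the poison value. */
static struct list_head force_poison;

/*
 * Hypothetical helper modelling the debug check: an entry is flagged when
 * either link equals the poison address, regardless of how it got there.
 */
bool looks_force_poisoned(const struct list_head *entry)
{
	return entry->next == &force_poison || entry->prev == &force_poison;
}

/* Simplified stand-in for struct semaphore_waiter. */
struct waiter {
	struct list_head list;
};

/* Models a waiter whose ->prev slot happens to hold a leftover poison value. */
bool stale_waiter_trips_check(void)
{
	struct waiter w = { .list = { .next = NULL, .prev = &force_poison } };

	return looks_force_poisoned(&w.list);
}

/* Models the patched code: both links are set before the entry is checked. */
bool initialized_waiter_trips_check(void)
{
	struct waiter w = { .list = LIST_HEAD_INIT(w.list) };

	return looks_force_poisoned(&w.list);
}
```

In this model stale_waiter_trips_check() returns true and
initialized_waiter_trips_check() returns false: once both links point
back at the entry itself, whatever was previously in that stack slot can
no longer be mistaken for a poisoned entry.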
