Re: Rambling noise #1: generic/230 can trigger kernel debug lock detector

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: Rambling noise #1: generic/230 can trigger kernel debug lock detector
From: "Michael L. Semon" <mlsemon35@xxxxxxxxx>
Date: Fri, 10 May 2013 15:07:19 -0400
Cc: "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>
In-reply-to: <20130510021942.GP23072@dastard>
References: <518B08D9.1060906@xxxxxxxxx> <20130509031646.GN24635@dastard> <20130509072045.GO24635@dastard> <518C54AA.7070908@xxxxxxxxx> <20130510021942.GP23072@dastard>
On Thu, May 9, 2013 at 10:19 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Thu, May 09, 2013 at 10:00:10PM -0400, Michael L. Semon wrote:
>> On 05/09/2013 03:20 AM, Dave Chinner wrote:
>> >On Thu, May 09, 2013 at 01:16:46PM +1000, Dave Chinner wrote:
>> >>On Wed, May 08, 2013 at 10:24:25PM -0400, Michael L. Semon wrote:
>> >>No, there's definitely a bug there. Thanks for the report, Michael.
>> >>Try the patch below.
>> >
>> >Actually, there's a bug in the error handling in that version - it
>> >fails to unlock the quotaoff lock properly on failure. The version
>> >below fixes that problem.
>> >
>> >Cheers,
>> >
>> >Dave.
>>
>> OK, I'll try this version as well.  The first version seemed to work
>> just fine.
>
> It should, the bug was in an error handling path you are unlikely to
> hit.

OK, this version looks good, too, maybe better.  The only lockdep
report that I'm hitting consistently so far is caused by generic/249 (a
circular dependency), but that's probably a separate issue.  The trace
is on my USB key, but the PC I'm using for this E-mail runs Windows XP
and can't read F2FS.  Sorry about that.

>> xfs/012 13s ...[ 1851.323902]
>> [ 1851.325479] =================================
>> [ 1851.326551] [ INFO: inconsistent lock state ]
>> [ 1851.326551] 3.9.0+ #1 Not tainted
>> [ 1851.326551] ---------------------------------
>> [ 1851.326551] inconsistent {RECLAIM_FS-ON-R} -> {IN-RECLAIM_FS-W} usage.
>> [ 1851.326551] kswapd0/18 [HC0[0]:SC0[0]:HE1:SE1] takes:
>> [ 1851.326551]  (&(&ip->i_lock)->mr_lock){++++-+}, at: [<c11dcabf>] xfs_ilock+0x10f/0x190
>> [ 1851.326551] {RECLAIM_FS-ON-R} state was registered at:
>> [ 1851.326551]   [<c105e10a>] mark_held_locks+0x8a/0xf0
>> [ 1851.326551]   [<c105e69c>] lockdep_trace_alloc+0x5c/0xa0
>> [ 1851.326551]   [<c109c52c>] __alloc_pages_nodemask+0x7c/0x670
>> [ 1851.326551]   [<c10bfd8e>] new_slab+0x6e/0x2a0
>> [ 1851.326551]   [<c14083a9>] __slab_alloc.isra.59.constprop.67+0x1d3/0x40a
>> [ 1851.326551]   [<c10c12cd>] __kmalloc+0x10d/0x180
>> [ 1851.326551]   [<c1199b56>] kmem_alloc+0x56/0xd0
>> [ 1851.326551]   [<c1199be1>] kmem_zalloc+0x11/0xd0
>> [ 1851.326551]   [<c11c666e>] xfs_dabuf_map.isra.2.constprop.5+0x22e/0x520
>
> Yup, needs a KM_NOFS allocation there because we come through
> here outside a transaction and so it doesn't get KM_NOFS implicitly
> in this case. There's been a couple of these reported in the past
> week or two - I need to do an audit and sweep them all up....
>
> Technically, though, this can't cause a deadlock on the inode we
> hold a lock on here because it's a directory inode, not a regular
> file and so it will never be seen in the reclaim data writeback path
> nor on the inode LRU when the shrinker runs. So most likely it is a
> false positive...
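
To make the KM_NOFS point concrete, here is a rough userspace model of
the idea as described above, not the kernel code.  Every name
(model_*) and every bit value is made up for illustration.  The
assumption is that an allocation made inside a transaction has
filesystem reclaim masked off implicitly, while one made outside a
transaction has to ask for that explicitly.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the kernel's real constants differ. */
#define MODEL_GFP_WAIT   0x1u  /* allocation may sleep */
#define MODEL_GFP_IO     0x2u  /* allocation may start block I/O */
#define MODEL_GFP_FS     0x4u  /* reclaim may re-enter the filesystem */
#define MODEL_GFP_KERNEL (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)

#define MODEL_KM_SLEEP   0x1u  /* caller is allowed to sleep */
#define MODEL_KM_NOFS    0x4u  /* caller forbids filesystem reclaim */

/* Hypothetical stand-in for the per-task "inside a transaction" state. */
struct model_task {
	bool in_transaction;
};

/*
 * Translate the caller's allocation flags into allocator flags.
 * Filesystem reclaim is masked off when the task is inside a
 * transaction (the implicit case) or when the caller passes
 * MODEL_KM_NOFS (the explicit case).
 */
static unsigned int
model_flags_convert(const struct model_task *task, unsigned int km_flags)
{
	unsigned int gfp = MODEL_GFP_KERNEL;

	/* Reject flags this model does not know about. */
	assert((km_flags & ~(MODEL_KM_SLEEP | MODEL_KM_NOFS)) == 0);

	if (task->in_transaction || (km_flags & MODEL_KM_NOFS))
		gfp &= ~MODEL_GFP_FS;

	return gfp;
}

/* True if an allocation with these flags may recurse into the filesystem. */
static bool
model_may_reclaim_fs(unsigned int gfp)
{
	return (gfp & MODEL_GFP_FS) != 0;
}
```

In this model, a plain MODEL_KM_SLEEP allocation made outside a
transaction leaves filesystem reclaim enabled, which is the situation
the lock detector complains about when a lock that reclaim also takes
is held.  Adding MODEL_KM_NOFS at that call site, or making the call
inside a transaction, clears the bit.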

Thanks for looking at it.  There are going to be plenty of false
positives out there.  Is there a pecking order of what works best?  As
in...

* IRQ (IRQs-off?) checking: worth reporting...?
* sleep inside atomic sections: fascinating, but almost anything can trigger it
* multiple-CPU deadlock detection: can only speculate on a uniprocessor system
* circular dependency checking: YMMV
* reclaim-fs checking: wish I knew how much developers need to
conform to reclaim-fs, or what it is

Your list will probably look totally different and have extra items,
and I'll be happy if it completely contradicts my list.

Anyway, have a good weekend!

Michael
