
Re: XFS Lock debugging noise or real problem?

To: Linda Walsh <xfs@xxxxxxxxx>
Subject: Re: XFS Lock debugging noise or real problem?
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 14 Aug 2008 10:41:01 +1000
Cc: xfs-oss <xfs@xxxxxxxxxxx>, LKML <linux-kernel@xxxxxxxxxxxxxxx>, Eric Sandeen <sandeen@xxxxxxxxxxx>
In-reply-to: <48A35A99.1080300@xxxxxxxxx>
Mail-followup-to: Linda Walsh <xfs@xxxxxxxxx>, xfs-oss <xfs@xxxxxxxxxxx>, LKML <linux-kernel@xxxxxxxxxxxxxxx>, Eric Sandeen <sandeen@xxxxxxxxxxx>
References: <48A093A7.40606@xxxxxxxxx> <48A09CA9.9080705@xxxxxxxxxxx> <48A0F686.2090700@xxxxxxxxx> <48A0F9FC.1070805@xxxxxxxxxxx> <48A20E9E.9090100@xxxxxxxxx> <20080813005852.GW6119@disturbed> <48A35A99.1080300@xxxxxxxxx>
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.5.18 (2008-05-17)
On Wed, Aug 13, 2008 at 03:05:13PM -0700, Linda Walsh wrote:
> Dave Chinner wrote:
>> Once again,
>> "a problem with the generic code inverting the normal lock order".
>>
>> This one cannot deadlock, though, because by definition
>> any inode on the unused list is, well, unused and hence we can't be
>> holding a reference to it...
> ----
>
>       This is great, maybe...but what do you mean by "generic"?

generic code == non-filesystem specific kernel code that interfaces
with the filesystem code.

>       Is this generic in the FS layer such that we'd see
> this with all FS types?

Any filesystem that does memory allocation while holding the same type of
lock it might take when reclaiming an inode.

This is a problem where we go:

        XFS: lock inode
        XFS: allocate memory
          VM: free some memory
            VM: shrink slab
              VM: prune inode cache (takes iprune_mutex)
                XFS: lock inode

i.e. the VM recurses back into the filesystem and lockdep sees
a different lock ordering.

> I'd *like* to keep lock provability 'on' -- but I don't want
> to waste people's time chasing after non-problems and so far I've
> seen at least 3 different locking sequences that all appear to be
> harmless.
>
>       The problem with false positives is that it will either force
> the user to ignore (or turn off) the validation code, or generate
> periodic noise when these things arise...

Basically we've been told by the lockdep folk that the best way
to avoid these false positives is to effectively turn off lockdep
for all places where the inode is locked in the inode reclaim path.
That means lockdep would be mostly useless for XFS - I'd prefer to get
false positives reported than miss a rare case where it's really
telling the truth.

I've asked the lockdep people to treat stuff like memory reclaim and
the iprune_mutex specially because of this recursive calling nature
of memory reclaim, but so far nothing has happened....

FWIW, I think that recent changes have resulted in the xfs_fsr case
(swap_extents) being annotated properly so that one should go
away.

>       Isn't it generally considered pretty 'bad' to generate so many
> false positives -- or is lock-proving only for "lock debugging" --
> and not to be used except on development or test systems?

Well, any debugging code is really designed for test and dev systems,
not for production systems.....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

