On Wed, May 18, 2016 at 08:35:49AM +1000, Dave Chinner wrote:
> On Tue, May 17, 2016 at 04:49:12PM +0200, Peter Zijlstra wrote:
> > On Tue, May 17, 2016 at 09:10:56AM +1000, Dave Chinner wrote:
> >
> > > The reason we don't have lock clases for the ilock is that we aren't
> > > supposed to call memory reclaim with that lock held in exclusive
> > > mode. This is because reclaim can run transactions, and that may
> > > need to flush dirty inodes to make progress. Flushing dirty inode
> > > requires taking the ilock in shared mode.
> > >
> > > In the code path that was reported, we hold the ilock in /shared/
> > > mode with no transaction context (we are doing a read-only
> > > operation). This means we can run transactions in memory reclaim
> > > because a) we can't deadlock on the inode we hold locks on, and b)
> > > transaction reservations will be able to make progress as we don't
> > > hold any locks it can block on.
> >
> > Just to clarify; I read the above as that we cannot block on recursive
> > shared locks, is this correct?
> >
> > Because we can in fact block on down_read()+down_read() just fine, so if
> > you're assuming that, then something's busted.
>
> The transaction reservation path will run down_read_trylock() on the
> inode, not down_read(). Hence if there are no pending writers, it
> will happily take the lock twice and make progress, otherwise it
> will skip the inode and there's no deadlock. If there's a pending
> writer, then we have another context that is already in a
> transaction context and has already pushed the item, hence it is
> only in the scope of the current push because IO hasn't completed
> yet and removed it from the list.
>
> > Otherwise, I'm not quite reading it right, which is, given the
> > complexity of that stuff, entirely possible.
>
> There's a maze of dark, grue-filled twisty passages here...
OK; I might need a bit more again.
So now the code does something like:
down_read(&i_lock); -- lockdep marks lock as held
kmalloc(GFP_KERNEL); -- lockdep marks held locks as
ENABLED_RECLAIM_FS
--> reclaim()
down_read_trylock(&i_lock); -- lockdep does _NOT_ mark as
USED_IN_RECLAIM_FS
Right?
My 'problem' is that lockdep doesn't consider a trylock for the USED_IN
annotation, so the i_lock class will only get the ENABLED tag but not
get the USED_IN tag, and therefore _should_ not trigger the inversion.
So what exactly is triggering the inversion?
|