Re: xfs readdir hang on for-next (3.15.0-rc1)

To: xfs@xxxxxxxxxxx
Subject: Re: xfs readdir hang on for-next (3.15.0-rc1)
From: Brian Foster <bfoster@xxxxxxxxxx>
Date: Mon, 14 Apr 2014 15:08:36 -0400
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140414164313.GA62307@xxxxxxxxxxxxxxx>
References: <20140414164313.GA62307@xxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, Apr 14, 2014 at 12:43:14PM -0400, Brian Foster wrote:
> Hi all,
> 
> This is a heads up that I'm seeing a blatant readdir hang on the current
> for-next with selinux enabled. To reproduce, I format a clean fs, mount
> and attempt an ls.
> 
> The problem does not occur with selinux disabled, if I back out the
> following commit:
> 
> 40194ecc6d78 xfs: reinstate the ilock in xfs_readdir
> 
> ... or if I remove the locking around xfs_attr_get(), so I suspect this
> is another instance of a recursive deadlock. I'm getting no output
> whatsoever in order to confirm this and it also leads to a complete
> system lockup. It's also interesting that this hasn't been observed
> until now, given the above commit was introduced in 3.14. So the above
> commit doesn't appear to be the most recent change that triggers this.
> 
> I reproduced on the latest linus tree and do not reproduce on 3.14, so
> I'm trying to do a bisect to find out what else might have changed to
> trigger this.
> 

This bisected down to:

commit 6f008e72cd111a119b5d8de8c5438d892aae99eb
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date:   Wed Mar 12 13:24:42 2014 +0100

    locking/mutex: Fix debug checks
...

... which suggests something down in the mutex debug code. Indeed, the
problem no longer occurs if I disable kernel debug in my .config. What
is also interesting is that the hang did not return when I re-enabled
DEBUG_KERNEL and DEBUG_MUTEXES alone. It does return once I start to
enable some of the other lock debugging options. FWIW, I also cleared
out my tree and rebuilt from scratch just to be sure that I didn't have
anything stale/broken lying around.
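
For anyone comparing a .config that reproduces against one that does
not, a small helper along these lines may save some eyeballing. This is
only a sketch: show_lock_debug is a made-up name, and apart from
DEBUG_KERNEL and DEBUG_MUTEXES the option list is an assumption about
which lock debugging options might be relevant, not a record of what
was actually toggled here.

```sh
#!/bin/sh
# Report the state of lock-debugging options in a kernel .config.
# ASSUMPTION: only DEBUG_KERNEL and DEBUG_MUTEXES are named in this
# thread; the remaining options are guesses at "other lock debugging
# options" and may not be the ones that matter.
lock_debug_opts="DEBUG_KERNEL DEBUG_MUTEXES DEBUG_SPINLOCK DEBUG_LOCK_ALLOC PROVE_LOCKING LOCK_STAT DEBUG_ATOMIC_SLEEP"

# show_lock_debug FILE: print one line per option, in .config style.
show_lock_debug() {
    config_file=$1
    for opt in $lock_debug_opts; do
        # Match only a built-in "=y" setting, anchored at both ends so
        # that longer option names sharing a prefix are not counted.
        if grep -q "^CONFIG_${opt}=y\$" "$config_file"; then
            echo "CONFIG_${opt}=y"
        else
            echo "CONFIG_${opt} is not set"
        fi
    done
}
```

Running show_lock_debug on the good and bad configs and diffing the two
outputs should narrow down which options differ.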

Peter,

Any insight on this? I reproduce on a 4-CPU x86-64 VM with 4GB RAM and a
20GB partition on a virtio disk:

mkfs.xfs -f /dev/vdb1
mount /dev/vdb1 /mnt
ls /mnt/
<lockup>

Note again that SELinux being enabled appears to be a factor. The kernel
.config is attached. Thanks.

Brian

> Brian

Attachment: config
Description: Text document
