[xfs-masters] xfs deadlock in stable kernel 3.0.4
Christoph Hellwig
hch at infradead.org
Thu Sep 22 17:01:51 CDT 2011
On Fri, Sep 23, 2011 at 07:49:56AM +1000, Dave Chinner wrote:
> On Thu, Sep 22, 2011 at 10:14:57AM -0400, Christoph Hellwig wrote:
> > By default, a wq guarantees non-reentrance only on the same
> > CPU. A work item may not be executed concurrently on the same
> > CPU by multiple workers but is allowed to be executed
> > concurrently on multiple CPUs. This flag makes sure
> > non-reentrance is enforced across all CPUs. Work items queued
> > to a non-reentrant wq are guaranteed to be executed by at most
> > one worker system-wide at any given time.
> >
> > So this still seems to be preferable for the ail workqueue, and should
> > be able to replace the XFS_AIL_PUSHING_BIT protections.
>
> No, we can't. WQ_NON_REENTRANT only protects against concurrency on
> the same CPU, not across all CPUs - it still allows concurrent
> per-CPU work processing on the same work item.
Non-concurrent execution of a given work_struct on the same CPU is the
default; WQ_NON_REENTRANT extends that to not being executed concurrently
at all. Check the documentation above again, or the code: just look
for the only occurrence of WQ_NON_REENTRANT in kernel/workqueue.c and
the surrounding code (e.g. find_worker_executing_work and the
current_work field in struct worker).
> However, we want only a *single* AIL worker instance executing per
> filesystem, not per-cpu per filesystem. Concurrent per-filesystem
> workers will simply bash on the AIL lock trying to walk the AIL at
> the same time, and this is precisely the issue the single AIL worker
> setup is avoiding. The XFS_AIL_PUSHING_BIT is what enforces the
> single per-filesystem push worker running at any time.
I think that's exactly what WQ_NON_REENTRANT is intended for.
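
For illustration only, here is a rough userspace sketch of the kind of
single-pusher guard that a bit like XFS_AIL_PUSHING_BIT provides. It is
not the XFS code: the struct and every sketch_* name are hypothetical,
and C11 atomics stand in for the kernel's test_and_set_bit() and
clear_bit(). The point is that whoever sets the bit first becomes the
only pusher, and everyone else backs off until it is cleared.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit number, standing in for XFS_AIL_PUSHING_BIT. */
#define SKETCH_AIL_PUSHING_BIT 0UL

/* Hypothetical stand-in for the per-filesystem AIL state. */
struct ail_sketch {
	atomic_ulong flags;	/* bit 0: a push is in progress */
	int pushes_run;		/* only touched by the current pusher */
};

/* Atomically set the bit; return true if it was already set. */
static bool sketch_test_and_set_bit(unsigned long bit, atomic_ulong *word)
{
	unsigned long mask = 1UL << bit;

	return (atomic_fetch_or(word, mask) & mask) != 0;
}

/* Atomically clear the bit. */
static void sketch_clear_bit(unsigned long bit, atomic_ulong *word)
{
	atomic_fetch_and(word, ~(1UL << bit));
}

/* Return true if the caller became the single pusher. */
static bool sketch_try_start_push(struct ail_sketch *ail)
{
	if (sketch_test_and_set_bit(SKETCH_AIL_PUSHING_BIT, &ail->flags))
		return false;	/* someone else is already pushing */
	ail->pushes_run++;
	return true;
}

/* Drop the guard so that a later caller can push again. */
static void sketch_finish_push(struct ail_sketch *ail)
{
	sketch_clear_bit(SKETCH_AIL_PUSHING_BIT, &ail->flags);
}
```

A second sketch_try_start_push() on the same struct fails until
sketch_finish_push() has run. A workqueue that never runs the same work
item on two workers at once would give the same exclusion without the
explicit bit.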