On Fri, Jan 09, 2015 at 12:12:04PM -0600, Eric Sandeen wrote:
> I had a case reported where a system under high stress
> got deadlocked. A btree split was handed off to the xfs
> allocation workqueue while holding the xfs_ilock
> exclusively. However, other xfs_end_io workers are
> not running, because they are waiting for that lock.
> As a result, the xfs allocation workqueue never gets
> run, and everything grinds to a halt.
I'm having a difficult time following the exact deadlock. Can you
please elaborate in more detail?
> To be honest, it's not clear to me how the workqueue
> subsystem manages this sort of thing. But in testing,
> making the allocation workqueue high priority, so that
> it gets added to the front of the pending work list,
> resolves the problem. We did similar things for
> the xfs-log workqueues, for similar reasons.
Ummm, this feels pretty voodoo. In practice, it'd change the order of
things being executed and may make certain deadlocks unlikely enough,
but I don't think this can be a proper fix.
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index e5bdca9..9c549e1 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -874,7 +874,7 @@ xfs_init_mount_workqueues(
> goto out_destroy_log;
> mp->m_alloc_workqueue = alloc_workqueue("xfs-alloc/%s",
> - WQ_MEM_RECLAIM|WQ_FREEZABLE, 0, mp->m_fsname);
> + WQ_MEM_RECLAIM|WQ_FREEZABLE|WQ_HIGHPRI, 0, mp->m_fsname);
And this at least deserves way more explanation.
> if (!mp->m_alloc_workqueue)
> goto out_destroy_eofblocks;