backport 7a29ac474a47eb8cf212b45917683ae89d6fa13b to stable ?
Jean-Tiare Le Bigot
jean-tiare.le-bigot at corp.ovh.com
Tue Feb 23 10:13:35 CST 2016
Hi,
We've hit a kernel hang related to XFS reclaim under heavy I/O load on a
couple of storage servers running XFS over flashcache on a 3.13.y kernel.
In the crash dumps, kthreadd is blocked waiting for XFS to reclaim some
memory, but the related reclaim job is queued on a worker_pool that is
stuck waiting for some I/O. That I/O itself depends on other jobs on
other queues, which would need additional worker threads to make
progress. Those threads cannot be created because kthreadd is blocked.
The host has plenty of memory (~128GB), about 80% of which is used by
the page cache.
It looks like this is fixed by commit
7a29ac474a47eb8cf212b45917683ae89d6fa13b. We manually applied the fix to
our internal branch, but I could not find a similar commit on the
longterm branches. Maybe it would be a good candidate for a backport,
for the benefit of other users?
On linux-3.14.y, this would be:
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index d971f49..36af881 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -858,17 +858,17 @@ xfs_init_mount_workqueues(
 		goto out_destroy_unwritten;
 
 	mp->m_reclaim_workqueue = alloc_workqueue("xfs-reclaim/%s",
-			0, 0, mp->m_fsname);
+			WQ_MEM_RECLAIM, 0, mp->m_fsname);
 	if (!mp->m_reclaim_workqueue)
 		goto out_destroy_cil;
 
 	mp->m_log_workqueue = alloc_workqueue("xfs-log/%s",
-			0, 0, mp->m_fsname);
+			WQ_MEM_RECLAIM, 0, mp->m_fsname);
 	if (!mp->m_log_workqueue)
 		goto out_destroy_reclaim;
 
 	mp->m_eofblocks_workqueue = alloc_workqueue("xfs-eofblocks/%s",
-			0, 0, mp->m_fsname);
+			WQ_MEM_RECLAIM, 0, mp->m_fsname);
 	if (!mp->m_eofblocks_workqueue)
 		goto out_destroy_log;
 
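
As background on why the flag matters: a workqueue allocated with
WQ_MEM_RECLAIM is guaranteed a dedicated rescuer thread, so it can still
run its work items when no new worker can be created. The toy userspace
model below is only a sketch of that idea and is not kernel code;
toy_wq, has_rescuer and toy_drain are hypothetical names invented for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; none of these names exist in the kernel. */
struct toy_wq {
	bool has_rescuer;	/* stands in for WQ_MEM_RECLAIM */
	int pending;		/* work items waiting to run */
};

/*
 * Try to drain the queue.
 *
 * idle_workers: pool workers that are not blocked on I/O.
 * can_fork:     whether new workers can be created; false models a
 *               blocked kthreadd.
 *
 * Returns the number of items still pending (0 means the queue drained).
 */
static int toy_drain(struct toy_wq *wq, int idle_workers, bool can_fork)
{
	assert(wq != NULL && wq->pending >= 0 && idle_workers >= 0);

	/* An idle worker, a freshly created worker or a rescuer can run the items. */
	if (idle_workers > 0 || can_fork || wq->has_rescuer)
		wq->pending = 0;

	return wq->pending;
}
```

In this model, with no idle worker and can_fork set to false, a queue
with has_rescuer = false keeps all of its items pending, which mirrors
the hang described above, while a queue with has_rescuer = true drains.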
Regards,
--
Jean-Tiare Le Bigot, OVH