backport 7a29ac474a47eb8cf212b45917683ae89d6fa13b to stable ?

To: <xfs@xxxxxxxxxxx>
Subject: backport 7a29ac474a47eb8cf212b45917683ae89d6fa13b to stable ?
From: Jean-Tiare Le Bigot <jean-tiare.le-bigot@xxxxxxxxxxxx>
Date: Tue, 23 Feb 2016 17:13:35 +0100
Delivered-to: xfs@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20131118 Thunderbird/17.0.11
Hi,

We've hit a kernel hang related to XFS reclaim under heavy I/O load on a
couple of storage servers running XFS over flashcache on a 3.13.y kernel.

In the crash dumps, kthreadd is blocked waiting for XFS to reclaim some
memory, but the corresponding reclaim work is queued on a worker_pool
that is stuck waiting for some I/O. That I/O in turn depends on work
items on other queues, which need additional worker threads to make
progress. Those threads cannot be created because kthreadd is blocked.
The host has plenty of memory (~128GB), about 80% of which is used for
the page cache.
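
As a rough illustration of the dependency cycle described above, here is
a toy userspace model in C. It is not kernel code: the struct, its
fields, and the function name are all hypothetical. It only encodes the
rule that a workqueue allocated with WQ_MEM_RECLAIM keeps a rescuer
thread available when no new worker can be created.

```c
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct wq_state {
    bool idle_worker_available; /* a pool worker is free to run the item */
    bool kthreadd_blocked;      /* kthreadd is stuck, no new worker can be forked */
    bool has_rescuer;           /* models a workqueue created with WQ_MEM_RECLAIM */
};

/* Returns true if a queued reclaim work item can eventually execute. */
static bool reclaim_work_can_run(const struct wq_state *s)
{
    if (s->idle_worker_available)
        return true;            /* an existing worker picks the item up */
    if (!s->kthreadd_blocked)
        return true;            /* the pool can still spawn a new worker */
    return s->has_rescuer;      /* only a pre-allocated rescuer breaks the cycle */
}
```

With no idle worker, kthreadd blocked, and has_rescuer false, the
function returns false, which corresponds to the hang described above.
Setting has_rescuer to true models what the flag is meant to guarantee.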

It looks like this is fixed by commit
7a29ac474a47eb8cf212b45917683ae89d6fa13b. We manually applied a fix to
our internal branch, but I could not find an equivalent commit on the
longterm branches. Could it be a good candidate for a stable backport,
for the benefit of other users?

On linux-3.14.y, this would be:

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index d971f49..36af881 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -858,17 +858,17 @@ xfs_init_mount_workqueues(
                goto out_destroy_unwritten;

        mp->m_reclaim_workqueue = alloc_workqueue("xfs-reclaim/%s",
-                       0, 0, mp->m_fsname);
+                       WQ_MEM_RECLAIM, 0, mp->m_fsname);
        if (!mp->m_reclaim_workqueue)
                goto out_destroy_cil;

        mp->m_log_workqueue = alloc_workqueue("xfs-log/%s",
-                       0, 0, mp->m_fsname);
+                       WQ_MEM_RECLAIM, 0, mp->m_fsname);
        if (!mp->m_log_workqueue)
                goto out_destroy_reclaim;

        mp->m_eofblocks_workqueue = alloc_workqueue("xfs-eofblocks/%s",
-                       0, 0, mp->m_fsname);
+                       WQ_MEM_RECLAIM, 0, mp->m_fsname);
        if (!mp->m_eofblocks_workqueue)
                goto out_destroy_log;

Regards,

-- 
Jean-Tiare Le Bigot, OVH
