Still, don't check it in until we understand whether sv_t's are
completely broken or not...
Well, I added some tracing code to __wake_up_common, but it never
tripped. That made me wonder whether we are even being woken up from
the wait queue, or whether someone is waking us up directly via the
task struct. So I had a look and found the following:
xfsaild_wakeup(
	xfs_mount_t	*mp,
	xfs_lsn_t	threshold_lsn)
{
	mp->m_ail.xa_target = threshold_lsn;
	wake_up_process(mp->m_ail.xa_task);
}
This is called indirectly from xlog_grant_push_ail(), which is in turn
called from various other places.
In fact, this bug is not restricted to the aild. The xfssyncd also hit
this issue a number of times during today's testing, where it was woken
while waiting in sv_wait for the pincount to drop to zero.
The xfssyncd is also woken up from a number of functions in xfs_super.c,
including xfs_syncd_queue_work(), xfs_sync_worker() and
xfs_fs_sync_super().
The wake_up on the aild was introduced by:
    modid: xfs-linux-melb:xfs-kern:30371a
    Move AIL pushing into it's own thread
However, xfssyncd has a long history of its task being woken up from
other code, so it looks like it's simply not safe for either the aild
or xfssyncd to sleep on a queue on the assumption that no one else will
wake the process up.
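
To make the failure mode concrete, here is a rough user-space analogy
using pthreads. This is not the kernel sv_t API and not the patch under
discussion; every name in it (pinned_obj, wait_unpin, stray_wakeup,
run_demo) is made up for illustration. The only point is that a sleeper
which can be woken by anyone has to re-check its condition after every
wakeup rather than trust the wakeup itself:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a pinned object; not an XFS structure. */
struct pinned_obj {
	pthread_mutex_t	lock;
	pthread_cond_t	unpinned;	/* plays the role of the sv_t */
	int		pin_count;
	int		seen_at_wake;	/* pin_count the waiter saw on exit */
};

/* Wait for pin_count to reach zero, tolerating stray wakeups. */
static void wait_unpin(struct pinned_obj *obj)
{
	pthread_mutex_lock(&obj->lock);
	while (obj->pin_count != 0)	/* re-check after every wakeup */
		pthread_cond_wait(&obj->unpinned, &obj->lock);
	obj->seen_at_wake = obj->pin_count;
	pthread_mutex_unlock(&obj->lock);
}

static void *waiter_thread(void *arg)
{
	wait_unpin(arg);
	return NULL;
}

/* Wake the waiter without changing the condition it is waiting for. */
static void stray_wakeup(struct pinned_obj *obj)
{
	pthread_mutex_lock(&obj->lock);
	pthread_cond_broadcast(&obj->unpinned);
	pthread_mutex_unlock(&obj->lock);
}

/* Drop one pin and wake the waiter only when the count hits zero. */
static void unpin(struct pinned_obj *obj)
{
	pthread_mutex_lock(&obj->lock);
	if (--obj->pin_count == 0)
		pthread_cond_broadcast(&obj->unpinned);
	pthread_mutex_unlock(&obj->lock);
}

/* Returns the pin count the waiter observed when it stopped waiting. */
static int run_demo(void)
{
	struct pinned_obj obj = { .pin_count = 2, .seen_at_wake = -1 };
	pthread_t tid;

	pthread_mutex_init(&obj.lock, NULL);
	pthread_cond_init(&obj.unpinned, NULL);
	if (pthread_create(&tid, NULL, waiter_thread, &obj) != 0)
		return -1;

	stray_wakeup(&obj);	/* count is still 2 */
	unpin(&obj);		/* count drops to 1 */
	stray_wakeup(&obj);	/* count is still 1 */
	unpin(&obj);		/* count drops to 0, genuine wakeup */

	pthread_join(tid, NULL);
	pthread_cond_destroy(&obj.unpinned);
	pthread_mutex_destroy(&obj.lock);
	return obj.seen_at_wake;
}
```

With the while loop in wait_unpin(), run_demo() returns 0 however the
threads are scheduled. If the loop were a single unconditional wait, a
stray wakeup could let the waiter carry on with the object still pinned,
which is the same shape of problem as the sv_wait callers above.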
So I would say the fix I proposed is a good solution for this issue.
However, there are other functions that use sv_wait and should be fixed
in a similar way, so I'll look into the other callers and prepare a
patch tomorrow.
Thanks,
Pete