Dave Chinner wrote:
On Fri, Sep 26, 2008 at 11:32:43AM +1000, Lachlan McIlroy wrote:
Dave Chinner wrote:
On Thu, Sep 25, 2008 at 06:43:43PM +1000, Peter Leckie wrote:
However xfssyncd has had a long history of the task being woken up
from other code, so it looks like it's simply not safe for either the
aild or xfssyncd to sleep on a queue assuming that no one else will
wake the processes up.
Given that both xfsaild and xfssyncd are supposed to be doing
non-blocking flushes, neither of them should ever be waiting on a
pinned item, therefore fixing that problem in xfs_qm_dqflush()
should make this problem go away. It will also substantially
reduce the number of log forces being triggered by dquot writeback
which will have a positive impact on performance, too.
So I would say the fix I proposed is a good solution for this issue.
but it doesn't fix the underlying problem that was causing the
spurious wakeups, which is the fact that xfs_qm_dqflush() is not
obeying non-blocking flush directions.
The underlying problem has nothing to do with xfs_qm_dqflush() - the
spurious wakeups are caused by calls to wake_up_process() that arbitrarily
wake up a process that is in a state where it shouldn't be woken up.
Spurious wakeups are causing problems in a place where we should not
even be sleeping. If you don't sleep there, you can't get spurious
wakeups....
If we don't fix the spurious wakeups then we could easily re-introduce this
problem again.
Right, but keep in mind that the patch doesn't prevent spurious
wakeups - it merely causes the thread to wakeup and go back to sleep
Yes that's right and it's why I suggested replacing the uses of wake_up_process
with wake_up and a wait queue where both the xfsaild and xfssyncd threads can
have a wait queue specific to them. This way we only wake them up if they are
sleeping on that wait queue and not somewhere else waiting for a different
event.
I'm pretty sure that will be a safe change to make.
when a spurious wakeup occurs. The patch I posted avoids the
spurious wakeup problem completely, which is what we should be
aiming to do given it avoids the overhead of 2 context switches
and speeds up the rate at which we can flush unpinned dquots.
That being said, I agree that the original patch is still desirable,
though not from a bug-fix perspective. It's a cleanup and
optimisation patch, with the nice side effect of preventing future
occurrences of the spurious wakeup problem....
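
To make the wait-queue idea discussed above concrete, here is a rough
userspace sketch. It is only an analogy built on POSIX threads, not XFS
or kernel code: struct waitq and the waitq_* helpers are hypothetical
names invented for this illustration. The sleeper waits on its own
queue and rechecks its condition after every wakeup, so a stray wakeup
only sends it back to sleep.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a per-thread wait queue. */
struct waitq {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            work_pending; /* the condition the sleeper waits for */
    int             spurious;     /* wakeups that found no work pending */
};

void waitq_init(struct waitq *q)
{
    int rc = pthread_mutex_init(&q->lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&q->cond, NULL);
    assert(rc == 0);
    (void)rc; /* silence unused warning when built with NDEBUG */
    q->work_pending = false;
    q->spurious = 0;
}

/* Sleep until work is pending, rechecking the condition after each wakeup. */
void waitq_wait(struct waitq *q)
{
    pthread_mutex_lock(&q->lock);
    while (!q->work_pending) {
        pthread_cond_wait(&q->cond, &q->lock);
        if (!q->work_pending)
            q->spurious++; /* woken for no reason: go back to sleep */
    }
    q->work_pending = false; /* consume the event */
    pthread_mutex_unlock(&q->lock);
}

/* Legitimate wakeup: set the condition, then wake a sleeper on this queue. */
void waitq_wake(struct waitq *q)
{
    pthread_mutex_lock(&q->lock);
    q->work_pending = true;
    pthread_cond_signal(&q->cond);
    pthread_mutex_unlock(&q->lock);
}

/* Stray wakeup: wakes a sleeper without setting the condition. */
void waitq_poke(struct waitq *q)
{
    pthread_mutex_lock(&q->lock);
    pthread_cond_signal(&q->cond);
    pthread_mutex_unlock(&q->lock);
}
```

In this sketch a waitq_poke() that reaches a sleeping thread is harmless,
but the thread still pays for waking up and going back to sleep, which is
the cost that not sleeping there in the first place avoids.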
Cheers,
Dave.