David Chinner wrote:
On Thu, Nov 15, 2007 at 02:32:17PM +1100, Lachlan McIlroy wrote:
Overall it looks good Dave, just a few comments below.
.....
+int
+xfsaild(
+ void *data)
+{
+ xfs_mount_t *mp = (xfs_mount_t *)data;
+ xfs_lsn_t last_pushed_lsn = 0;
+ long tout = 0;
+
+ while (!kthread_should_stop()) {
+ if (tout)
+ schedule_timeout_interruptible(msecs_to_jiffies(tout));
+
+ /* swsusp */
+ try_to_freeze();
+
+ /* we're either starting or stopping if there is no log */
+ if (!mp->m_log)
+ continue;
It's looks like the log should never be NULL while the xfsaild
thread is running. Could we ASSERT(mp->m_log)?
Already done.
@@ -100,57 +97,105 @@ xfs_trans_push_ail(
spin_unlock(&mp->m_ail_lock);
return (xfs_lsn_t)0;
}
+ if (XFS_LSN_CMP(threshold_lsn, mp->m_ail.xa_target) > 0)
Is this conditional necessary? Can we just call xfsaild_wakeup()
and let it do the same thing?
Already done.
+long
+xfsaild_push(
+ xfs_mount_t *mp,
+ xfs_lsn_t *last_lsn)
+{
+ long tout = 100; /* milliseconds */
+ xfs_lsn_t last_pushed_lsn = *last_lsn;
+ xfs_lsn_t target = mp->m_ail.xa_target;
+ xfs_lsn_t lsn;
+ xfs_log_item_t *lip;
+ int lock_result;
+ int gen;
+ int restarts;
restarts needs to be initialised
Already done.
+ spin_lock(&mp->m_ail_lock);
+ count++;
+ /* Too many items we can't do anything with? */
+ if (stuck > 100)
100? Arbitrary magic number or was there reason for this?
Arbitrary magic number based on observation. basically, if
we are skipping too many items because we can't flush them or
they are already being flushed we back off and given them time
to complete whatever operation is being done. i.e. remove pressure
from the AIL while we can't make progress so traversals don't
slow down further inserts and removæls to/from the AIL.
+ break;
+ /* we're either starting or stopping if there is no log */
+ if (!mp->m_log)
Again, can we ASSERT(mp->m_log)?
Already done.
+ if ((XFS_LSN_CMP(lsn, target) >= 0)) {
+ tout += 20;
+ last_pushed_lsn = 0;
+ } else if ((restarts > XFS_TRANS_PUSH_AIL_RESTARTS) ||
+ (count && (count < (stuck + 10)))) {
If 0 < count < 10 and stuck == 0 then we'll think we couldn't flush
something
- not sure if that is what you intended here.
If we've got to this situation it generally means we've
got an empty AIL. Hence backing off a bit more won't hurt at
all because the log is pretty much clean and we are not likely
to be in a tail-pushing situation in the next few milliseconds.
Ah, yes, good point.
Maybe ((count - stuck) < stuck) ? ie the number of items we successfully
flushed
is less than the number of items we couldn't flush then back off.
Sort of, but that's a 50% rule - what I'm checking is more like a
90% stuck threshold when we break out of the loop at stuck == 100.
If you can think of a better way of expressing that....
something like ((stuck * 100)/count > 90) ?
or adding a bias to the 50% rule, ((count - stuck) * 10 < stuck)
|