Debug of xfstest 234 hang

To: Dave Chinner <david@xxxxxxxxxxxxx>, Alex Elder <aelder@xxxxxxx>
Subject: Debug of xfstest 234 hang
From: Chandra Seetharaman <sekharan@xxxxxxxxxx>
Date: Thu, 10 Nov 2011 15:16:33 -0600
Cc: XFS Mailing List <xfs@xxxxxxxxxxx>
Organization: IBM
Reply-to: sekharan@xxxxxxxxxx
Hi Dave, Alex,

Debugging with trace, crash and systemtap, I found that the hang
happens when xfs_sync_worker() (run through a kworker) gets stuck in
xlog_wait() while reserving a transaction log buffer for the dummy log.

I also found that even though xfsaild_push() keeps getting invoked, it
doesn't do anything to push the log to the disk, because
ailp->xa_target has not changed since it was last set from a process
stack a while back.

So I thought that resetting the target to the max value would help
nudge the flow of the AIL to the disk, and I added the following code:
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index ed9252b..f59fd9f 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -534,6 +534,10 @@ out_done:
                ailp->xa_last_pushed_lsn = 0;
+       lsn = xfs_ail_max_lsn(ailp);
+       smp_wmb();
+       xfs_trans_ail_copy_lsn(ailp, &ailp->xa_target, &lsn);
+       smp_wmb();
        return tout;

and it seems to do the magic.

With this change, test 234 runs fine.
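
To make the reasoning easier to follow, here is a minimal user-space
sketch of the behaviour as I understand it. It is only a toy model, not
the kernel code: struct toy_ail, toy_ail_max_lsn() and toy_ail_push()
are made-up names, and the assumption that a push pass only handles
items with LSN <= target is a simplification of what xfsaild_push()
does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the AIL (hypothetical, not the kernel structure). */
struct toy_ail {
	const uint64_t *lsns;   /* ascending LSNs of the items in the AIL */
	size_t          nitems; /* number of entries in lsns[] */
	size_t          next;   /* index of the first item not yet pushed */
	uint64_t        target; /* models ailp->xa_target */
};

/* Models xfs_ail_max_lsn(): LSN of the last item, or 0 if empty. */
static uint64_t toy_ail_max_lsn(const struct toy_ail *ail)
{
	return ail->nitems ? ail->lsns[ail->nitems - 1] : 0;
}

/*
 * Models one pass of the push worker: push every remaining item whose
 * LSN is at or below the target and return how many were pushed.
 * With reset_target set, the pass ends by moving the target to the
 * max LSN, which is what the patch above does at out_done.
 */
static size_t toy_ail_push(struct toy_ail *ail, bool reset_target)
{
	size_t pushed = 0;

	assert(ail != NULL);
	while (ail->next < ail->nitems &&
	       ail->lsns[ail->next] <= ail->target) {
		ail->next++;
		pushed++;
	}
	if (reset_target)
		ail->target = toy_ail_max_lsn(ail);
	return pushed;
}
```

With LSNs {10, 20, 30, 40} and a target of 20, the first pass pushes two
items. Without the reset every later pass pushes nothing, no matter how
often it runs, which matches what I see in the hang. With the reset, the
second pass pushes the remaining two items.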

Is this a good fix, bad fix, overkill... ?

Please let me know.



