Debug of xfstest 234 hang
Chandra Seetharaman
sekharan at us.ibm.com
Thu Nov 10 15:16:33 CST 2011
Hi Dave, Alex,
Debugging with trace, crash and systemtap, I found that the hang
happens when xfs_sync_worker() (run through a kworker) gets stuck in
xlog_wait() while reserving a transaction log buffer for the dummy log.
I also found that even though xfsaild_push() keeps getting invoked, it
does nothing to push the log to disk, because ailp->xa_target has not
changed since it was last set, from process context, a while back.
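
To illustrate the stall described above, here is a minimal user-space
sketch in C. It is not kernel code: struct toy_ail and toy_push() are
made-up names, and the model assumes only that a push pass writes back
items whose LSN is at or below the target. With a stale target, every
pass after the first finds nothing to do, however often it is called.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the AIL: item LSNs in ascending order. */
struct toy_ail {
	const uint64_t *lsns;	/* LSN of each item, sorted ascending */
	size_t nitems;		/* number of items in lsns[] */
	size_t next;		/* index of the first item not yet pushed */
	uint64_t target;	/* push items with LSN <= target */
};

/* One push pass. Returns how many items were pushed in this pass. */
static size_t toy_push(struct toy_ail *ail)
{
	size_t pushed = 0;

	while (ail->next < ail->nitems &&
	       ail->lsns[ail->next] <= ail->target) {
		ail->next++;
		pushed++;
	}
	return pushed;
}
```

With LSNs {10, 20, 30, 40} and a target of 20, the first toy_push()
call pushes two items; every later call pushes none until something
moves the target forward.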
So I thought that resetting the target to the maximum LSN in the AIL
would help nudge the flow of the AIL to disk, and I added the following code.
------------------
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index ed9252b..f59fd9f 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -534,6 +534,10 @@ out_done:
 		ailp->xa_last_pushed_lsn = 0;
 	}
+	lsn = xfs_ail_max_lsn(ailp);
+	smp_wmb();
+	xfs_trans_ail_copy_lsn(ailp, &ailp->xa_target, &lsn);
+	smp_wmb();
 	return tout;
 }
--------------------
and it seems to do the trick.
With this change, test 234 runs fine.
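
The effect of retargeting on the way out of the push can be sketched in
user space as well. This is only a toy model under stated assumptions:
struct sketch_ail, sketch_max_lsn() and sketch_push() are hypothetical
names, the model has no locking or memory barriers, and it says nothing
about whether pushing everything is the right policy for the real AIL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the AIL: item LSNs in ascending order. */
struct sketch_ail {
	const uint64_t *lsns;	/* LSN of each item, sorted ascending */
	size_t nitems;		/* number of items in lsns[] */
	size_t next;		/* index of the first item not yet pushed */
	uint64_t target;	/* push items with LSN <= target */
};

/* Largest LSN currently tracked, or 0 if the list is empty. */
static uint64_t sketch_max_lsn(const struct sketch_ail *ail)
{
	return ail->nitems ? ail->lsns[ail->nitems - 1] : 0;
}

/* One push pass that resets the target to the max LSN before returning. */
static size_t sketch_push(struct sketch_ail *ail)
{
	size_t pushed = 0;

	while (ail->next < ail->nitems &&
	       ail->lsns[ail->next] <= ail->target) {
		ail->next++;
		pushed++;
	}
	/* Toy analogue of the added hunk: retarget to the max LSN. */
	ail->target = sketch_max_lsn(ail);
	return pushed;
}
```

With LSNs {10, 20, 30, 40} and an initial target of 20, the first
sketch_push() call pushes two items and moves the target to 40, so the
second call pushes the remaining two instead of idling.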
Is this a good fix, bad fix, overkill... ?
Please let me know.
regards
chandra