Bugzilla – Bug 259
xfs_freeze gets stuck in "D" state in the function "down"
Last modified: 2004-02-16 07:39:15 CST
A script calls xfs_freeze on a given filesystem, prior to taking a snapshot.
This script is run automatically, every three hours. After running successfully
for several "iterations", xfs_freeze eventually gets stuck in "D" state. "ps -p
<PID of xfs_freeze> -o wchan,args" returns:
down /usr/sbin/xfs_freeze -f <fs2freeze>
A hard reset is needed to recover; until then, the filesystem on which
xfs_freeze is run is frozen, but the rest of the system is available.
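
The symptom above (process in "D" state, wait channel "down") can be checked from a script. The following is only a minimal sketch: the function names classify_state and check_pid are hypothetical, and the wait channel name "down" is taken from this report and may differ on other kernels.

```shell
#!/bin/sh
# Hypothetical helper: classify a process from its ps state and wait channel.
# Prints "stuck" when the state starts with D (uninterruptible sleep) and the
# wait channel is "down", matching the symptom in this report; otherwise "ok".
classify_state() {
    stat=${1:-}
    wchan=${2:-}
    case "$stat" in
        D*)
            if [ "$wchan" = "down" ]; then
                echo stuck
                return 0
            fi
            ;;
    esac
    echo ok
}

# Look up a live process by PID and classify it.
# Prints "gone" if ps cannot find the PID.
check_pid() {
    pid=$1
    # Separate -o options with empty headers suppress the header line.
    line=$(ps -p "$pid" -o stat= -o wchan=) || { echo gone; return 0; }
    # Unquoted on purpose: split the line into state and wait channel.
    set -- $line
    classify_state "${1:-}" "${2:-}"
}
```

For example, check_pid "$(pidof xfs_freeze)" could be run a few minutes after the freeze was started; a result of "stuck" would correspond to the hang described here.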
This behavior occurs in recent kernels (with the xfssyncd code; in this
particular case CVS of June 19th). In kernels (e.g., XFS Release 1.2) without
the xfssyncd code, lvcreate would get stuck in xfs_check_frozen (see
http://marc.theaimsgroup.com/?l=linux-xfs&m=105277405929107&w=2), and an
xfs_freeze -u <fs> would recover the system (though it did ruin the snapshot).
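
For readers unfamiliar with the sequence under discussion, a freeze / snapshot / unfreeze cycle might look roughly like the sketch below. This is not the attached script: the function name snapshot_with_freeze, the XFS_FREEZE and LVCREATE override variables, and all argument values are assumptions made for illustration. It reflects the setup in this report, where the filesystem is frozen by hand before lvcreate; on later LVM2 setups the snapshot may freeze the filesystem itself, in which case the manual freeze would not be needed.

```shell
#!/bin/sh
# Sketch of a freeze / snapshot / thaw sequence. All argument values are
# placeholders. XFS_FREEZE and LVCREATE may be overridden (e.g. for a dry run).
snapshot_with_freeze() {
    mountpoint=$1   # mounted XFS filesystem to freeze
    origin_lv=$2    # origin logical volume (hypothetical, e.g. /dev/vg0/data)
    snap_name=$3    # name for the snapshot volume
    snap_size=$4    # snapshot size, e.g. 1G

    freeze_cmd=${XFS_FREEZE:-xfs_freeze}
    lvcreate_cmd=${LVCREATE:-lvcreate}

    # Suspend writes; this is the step that hung in the report.
    $freeze_cmd -f "$mountpoint" || return 1

    # Take the snapshot while the filesystem is quiescent.
    $lvcreate_cmd -s -L "$snap_size" -n "$snap_name" "$origin_lv"
    rc=$?

    # Always thaw, even if the snapshot failed, so the filesystem
    # is not left frozen.
    $freeze_cmd -u "$mountpoint" || return 1
    return "$rc"
}
```

A dry run that only prints the commands can be done with XFS_FREEZE="echo xfs_freeze" LVCREATE="echo lvcreate" snapshot_with_freeze /mnt/data /dev/vg0/data snap0 1G, where the mount point and volume names are again placeholders.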
The key to reproducing the problem seems to be the number of times the script
runs, not the load at the time it runs. (The system had not been heavily loaded
within days of the occurrence.) This behavior occurred more frequently in
versions close to the date the xfssyncd code went in. For example, on a kernel
I compiled on June 10th, from a CVS update of the same day, I was able to
reproduce the behavior on the second manual iteration of the script. A version
of this script is attached at:
I updated linux-2.4-xfs from CVS (9/30/2003 13:28). The failure pattern reported
has not recurred after a week of running the test script hourly, so I'm
cautiously optimistic that the problem has gone away.
Marking as fixed.
Really closing now, as it's not reproducible.