On Fri, Jul 30, 2010 at 08:55:46AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> Under heavy load parallel metadata loads (e.g. dbench), we can fail
> to mark all the inodes in a cluster being freed as XFS_ISTALE as we
> skip inodes we cannot get the XFS_ILOCK_EXCL or the flush lock on.
> When this happens and the inode cluster buffer has already been
> marked stale and freed, inode reclaim can try to write the inode out
> as it is dirty and not marked stale. This can result in writing th
> metadata to an freed extent, or in the case it has already
> been overwritten trigger a magic number check failure and return an
> EUCLEAN error such as:
> Filesystem "ram0": inode 0x442ba1 background reclaim flush failed with 117
> Fix this by ensuring that we hoover up all in memory inodes in the
> cluster and mark them XFS_ISTALE when freeing the cluster.
Why do you move the loop over the log items around? From all that
I can see the original place is much better as we just have to loop
over the items once. Then once we look up the inodes in memory
we skip over the inodes that already are stale, so the behaviour
should be the same. Also instead of the i-- and continue for the
lock failure an explicit goto retry would make it a lot more obvious.
The actual change of not skipping inodes looks good to me.