Filesystem corruption writing out unlinked inodes

To: xfs@xxxxxxxxxxx
Subject: Filesystem corruption writing out unlinked inodes
From: Lachlan McIlroy <lachlan@xxxxxxx>
Date: Tue, 02 Sep 2008 14:48:49 +1000
Reply-to: lachlan@xxxxxxx
User-agent: Thunderbird (X11/20080707)
I've been looking into a case of filesystem corruption and found
that we are flushing unlinked inodes after the inode cluster has
been freed - and potentially reallocated as something else.  The
case happens when we unlink the last inode in a cluster and that
triggers the cluster to be released.

The code path of interest here is:

                -> queues inode on deleted inodes list

... and later on


When the inode is unlinked it gets logged in a transaction so
xfs_iflush() considers it dirty and writes it out but by this
time the cluster has been reallocated.  If the cluster is
reallocated as user data then the checks in xfs_imap_to_bp will
complain because the inode magic will be incorrect but if the
cluster is reallocated as another inode cluster then these checks
won't detect that.
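
To make the detection gap concrete, here is a toy userspace model of the
late flush. It is only a sketch and not XFS code: every toy_ name is
made up for illustration, and the only thing borrowed from XFS is the
on-disk inode magic value 0x494e ("IN"). It assumes the magic number is
the only sanity check applied before the write.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_DINODE_MAGIC 0x494e /* "IN", the XFS on-disk inode magic */

/* Hypothetical, heavily simplified on-disk inode slot. */
struct toy_dinode {
    uint16_t magic;
    uint64_t ino;
};

/* Hypothetical in-core inode that remembers its on-disk slot. */
struct toy_inode {
    uint64_t ino;
    struct toy_dinode *slot;
};

/* Stand-in for the magic check: only asks "does this look like an inode?" */
static int toy_magic_ok(const struct toy_dinode *slot)
{
    return slot->magic == TOY_DINODE_MAGIC;
}

/* Stand-in for the late flush: refuse on bad magic, otherwise overwrite. */
static int toy_flush(const struct toy_inode *ip)
{
    if (!toy_magic_ok(ip->slot))
        return -1;
    ip->slot->ino = ip->ino; /* clobbers whatever lives there now */
    return 0;
}

/*
 * Free the cluster, reallocate it, then flush the old unlinked inode.
 * Returns 1 if the stale flush was caught, 0 if it went through.
 * *final_ino reports what the slot holds afterwards.
 */
static int toy_stale_flush_detected(int realloc_as_inodes, uint64_t *final_ino)
{
    struct toy_dinode slot = { TOY_DINODE_MAGIC, 100 };
    struct toy_inode old = { 100, &slot };
    int rc;

    if (realloc_as_inodes) {
        /* The slot now belongs to a brand new inode 200. */
        slot.magic = TOY_DINODE_MAGIC;
        slot.ino = 200;
    } else {
        /* The slot now holds arbitrary user data. */
        memset(&slot, 0xab, sizeof(slot));
    }

    rc = toy_flush(&old);
    *final_ino = slot.ino;
    return rc != 0;
}
```

In this model the user-data case is rejected because the magic no longer
matches, while in the inode-cluster case the flush succeeds and the slot
that should describe inode 200 ends up describing inode 100 again.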

I modified xfs_iflush() to bail out if we try to flush an
unlinked inode (i.e. nlink == 0) and that avoids the corruption but
xfs_repair now has problems with inodes marked as free but with
non-zero nlink counts.  Do we really want to write out unlinked
inodes?  Seems a bit redundant.

Other options could be to delay the release of the inode cluster
until the inode has been flushed or move the flush into xfs_ifree()
before releasing the cluster.  Looking at xfs_ifree_cluster() it
scans the inodes in a cluster and tries to lock them and mark them
stale - maybe we can leverage this and avoid flushing staled inodes.
If so we'd need to tighten up the locking.
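
A rough sketch of that last idea, again as a toy userspace model rather
than XFS code (the sketch_ names, the boolean "lock" and the write
counter are all assumptions made for illustration). The cluster free
marks every inode it manages to lock as stale, and the flush skips
stale inodes. It also shows why the locking would need tightening: an
inode that could not be locked during the scan is never marked stale
and still gets written later.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical in-core inode state, reduced to what the sketch needs. */
struct sketch_inode {
    bool locked; /* stand-in for the inode's flush lock */
    bool stale;  /* set when the backing cluster has been freed */
    bool dirty;  /* logged in a transaction, so a flush wants to write it */
    int writes;  /* number of times the inode was written to "disk" */
};

static bool sketch_trylock(struct sketch_inode *ip)
{
    if (ip->locked)
        return false;
    ip->locked = true;
    return true;
}

static void sketch_unlock(struct sketch_inode *ip)
{
    ip->locked = false;
}

/*
 * Stand-in for the cluster free scan: mark every inode we can lock as
 * stale. Returns how many inodes were marked. Inodes whose lock is held
 * elsewhere are skipped, which is the gap in this scheme.
 */
static int sketch_ifree_cluster(struct sketch_inode *inodes, int n)
{
    int marked = 0;

    for (int i = 0; i < n; i++) {
        if (!sketch_trylock(&inodes[i]))
            continue;
        inodes[i].stale = true;
        sketch_unlock(&inodes[i]);
        marked++;
    }
    return marked;
}

/* Stand-in for the flush: write only dirty inodes that are not stale. */
static bool sketch_iflush(struct sketch_inode *ip)
{
    bool wrote = false;

    if (!sketch_trylock(ip))
        return false;
    if (ip->dirty && !ip->stale) {
        ip->writes++;
        wrote = true;
    }
    ip->dirty = false;
    sketch_unlock(ip);
    return wrote;
}
```

With two dirty inodes where the second is locked during the scan, only
the first is marked stale and skipped by the flush. The second is still
written once its lock is dropped, so the stale check alone is not
enough without stronger locking.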

Does anyone have suggestions which direction we should take?

