xfsrestore report SUCCESS but not restore all files in kernel 3.17
Tommy Wu
wu.tommy at gmail.com
Thu Oct 30 20:51:27 CDT 2014
After apply this patch, xfsdump will hang again.
the log for xfsdump:
https://mega.co.nz/#!tUgz3QrC!q7Ix53sl9lYCg0iKRHjgOqF0tZJAVKHBNz0OHWof-ds
But 'echo w > /proc/sysrq-trigger' no any blocked task in dmesg.
2014-10-31 8:09 GMT+08:00 Dave Chinner <david at fromorbit.com>:
> On Fri, Oct 31, 2014 at 10:04:57AM +1100, Dave Chinner wrote:
> > On Thu, Oct 30, 2014 at 04:39:01PM +0800, Tommy Wu wrote:
> > > I'm confusing now...
> > >
> > > in the test VM and one of my machine, rebuild xfsdump/xfsprogs will
> make
> > > the test work.
> >
> > I know why, too.
> >
> > > the xfsdump command I use:
> > > xfsdump -v debug -l 0 -o -p 300 -J -F -M test -L test -f test.xfsdump
> > > /mnt/backup > xfsdump.log
> > > the log:
> > >
> https://mega.co.nz/#!xIhTjD6Y!Gwp_ilW5UyRiqn2DSxPTHM0TH5qBghMlKXAmHoWk2iQ
> >
> > This indicates the bulkstat on the dump side was terminated
> > prematurely.
> >
> > > the xfsrestore command I use:
> > > xfsrestore -v debug,tree=nitty -p 300 -J -f test.xfsdump
> /vol/backup/x/ >
> > > xfsrestore.log
> > > the log:
> > >
> https://mega.co.nz/#!oZIFWQST!VUY90EZ8XvZDYy9GwF5mqtQVtWNW90Sar5MZhNzGbYI
> >
> > And the errors in the restore side are because there are missing
> > directory inodes in the dump, hence orphaning occurs.
> >
> > It might take me a little while to come up with a fix - I've noticed
> > a few things that I need to check out in more detail in tracking
> > this down....
>
> Can you try this patch?
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david at fromorbit.com
>
> xfs: handle munged bulkstat cookies correctly
>
> From: Dave Chinner <dchinner at redhat.com>
>
> xfsdump munges the last inode cookie that bulkstat passes to
> userspace so subsequent calls can restart at the last inode that
> bulkstat parsed. xfsdump thinks that it needs to decrement the
> cookie so that it catches the last inode, and this affects the
> tree search initialisation code.
>
> By decrementing the lastino cookie, the restart inode number can
> point to non-existent inode number. In 3.16, bulkstat would simply
> increment the cookie until it either found the next valid inode or
> the end of the AG, at which point it would just continue on as
> though nothing had happened. This is a relatively difficult
> situation to contrive so is extremely hard to test for, but is
> likely to occur fairly frequently in the real world.
>
> The current code, however, treats a cookie that points off into the
> weeds as invalid and returns a error. This has the effect of causing
> xfsdump to think it has reached the end of the bulkstat scan, and so
> if it's cookie munging happens just wrong it will cause incomplete
> dumps to occur. These dumps still have all the file data in them,
> just he directory structure is broken and so restore will end up
> with lots and lots of files in the orphanage (i.e. lost+found).
>
> IOWs, we broke xfsdump in a nasty, subtle way that users don't
> notice until they restore a dump.
>
> To fix this, restore the old behaviour of incrementing the inode
> number derived from the cookie when we fail to find a matching inode
> record.
>
> cc: stable at vger.kernel.org
> Reported-by: Tommy Wu <wu.tommy at gmail.com>
> Signed-off-by: Dave Chinner <dchinner at redhat.com>
> ---
> fs/xfs/xfs_itable.c | 34 ++++++++++++++++++++++++++++++----
> 1 file changed, 30 insertions(+), 4 deletions(-)
>
> diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
> index 7765ff7..dcc0c99 100644
> --- a/fs/xfs/xfs_itable.c
> +++ b/fs/xfs/xfs_itable.c
> @@ -208,6 +208,15 @@ xfs_bulkstat_ichunk_ra(
> * if we found the chunk. If the inode was not the last in the chunk and
> there
> * are some left allocated, update the data for the pointed-to record as
> well as
> * return the count of grabbed inodes.
> + *
> + * If this is the first lookup in a bulkstat call, we have to be careful
> about
> + * the agino that is passed - it may not point to a valid inode record.
> xfsdump,
> + * in particular, screws with the lastino cookie between bulkstat calls
> and so
> + * can point it off into the weeds. Hence we can't return an error here if
> + * the agino doesn't point to the correct inode chunk. The previous code
> simply
> + * incremented agino and tried again. This will work for xfsdump as the
> next
> + * lookup will either find the correct chunk or the end of the AG, so
> let's just
> + * keep doing that.
> */
> STATIC int
> xfs_bulkstat_grab_ichunk(
> @@ -226,7 +235,7 @@ xfs_bulkstat_grab_ichunk(
> return error;
> if (!stat) {
> *icount = 0;
> - return error;
> + return 0;
> }
>
> /* Get the record, should always work */
> @@ -236,8 +245,11 @@ xfs_bulkstat_grab_ichunk(
> XFS_WANT_CORRUPTED_RETURN(stat == 1);
>
> /* Check if the record contains the inode in request */
> - if (irec->ir_startino + XFS_INODES_PER_CHUNK <= agino)
> - return -EINVAL;
> + if (irec->ir_startino + XFS_INODES_PER_CHUNK <= agino) {
> + /* oops, someone is feeding us garbage! */
> + *icount = -1;
> + return 0;
> + }
>
> idx = agino - irec->ir_startino + 1;
> if (idx < XFS_INODES_PER_CHUNK &&
> @@ -428,6 +440,17 @@ xfs_bulkstat(
> error = xfs_bulkstat_grab_ichunk(cur, agino,
> &icount, &r);
> if (error)
> goto del_cursor;
> + if (icount < 0) {
> + /*
> + * A pox be cast over xfsdump!
> + *
> + * No record containing agino was found.
> Bump
> + * it, try again until we find a chunk or
> fall
> + * off the end of the AG.
> + */
> + agino++;
> + goto del_cursor;
> + }
> if (icount) {
> irbp->ir_startino = r.ir_startino;
> irbp->ir_freecount = r.ir_freecount;
> @@ -435,14 +458,17 @@ xfs_bulkstat(
> irbp++;
> agino = r.ir_startino +
> XFS_INODES_PER_CHUNK;
> }
> +
> /* Increment to the next record */
> error = xfs_btree_increment(cur, 0, &tmp);
> } else {
> /* Start of ag. Lookup the first inode chunk */
> error = xfs_inobt_lookup(cur, 0, XFS_LOOKUP_GE,
> &tmp);
> }
> - if (error)
> + if (error || tmp == 0) {
> + end_of_ag = 1;
> goto del_cursor;
> + }
>
> /*
> * Loop through inode btree records in this ag,
>
--
Tommy Wu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://oss.sgi.com/pipermail/xfs/attachments/20141031/e357557c/attachment.html>
More information about the xfs
mailing list