
To: "Linda A. Walsh" <xfs@xxxxxxxxx>
Subject: Re: xfs file system in process of becoming corrupt; though xfs_repair thinks it's fine! ; -/ (was xfs_dump problem...)
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Fri, 2 Jul 2010 09:58:02 +1000
Cc: xfs-oss <xfs@xxxxxxxxxxx>
In-reply-to: <4C2AAFC1.9080708@xxxxxxxxx>
References: <4C26A51F.8020909@xxxxxxxxx> <20100628022744.GX6590@dastard> <4C2A749E.4060006@xxxxxxxxx> <20100629232532.GA24712@dastard> <4C2A87FF.7090804@xxxxxxxxxxxx> <4C2A92DA.1020202@xxxxxxxxx> <20100630011647.GD24712@dastard> <4C2AAFC1.9080708@xxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Tue, Jun 29, 2010 at 07:45:21PM -0700, Linda A. Walsh wrote:
> To make matters more interesting -- xfsdump can't access a couple of
> files and a directory or two.
> 
> It thinks they are 'stale NFS handles' (I'm not running any NFS file
> systems).
> 
> in @  0.0 kB/s, out @  0.0 kB/s,  0.0 kB total, buffer   0% fullxfsdump: 
> WARNING: unable to open directory: ino 2082342: Stale NFS file handle
> xfsdump: WARNING: unable to open directory: ino 2082343: Stale NFS file handle

xfsdump uses the handle interfaces to open files directly from
bulkstat information, and this is a typical error when bulkstat
returns an inode that is then unlinked before dump opens the handle
created from the bulkstat information.
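The race is easy to picture with a generic sketch (this is only an
analogy, not the actual XFS bulkstat/libhandle path, which needs an
XFS filesystem and privileges to demonstrate; with a plain path open
the failure shows up as ENOENT rather than ESTALE):

```python
import errno
import os
import tempfile


def race_demo():
    """Enumerate a file, let it be unlinked, then try to open it again."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "victim")
        open(path, "w").close()
        ino = os.stat(path).st_ino   # the "bulkstat" step: note the inode
        os.unlink(path)              # racing unlink before dump's open
        try:
            os.open(path, os.O_RDONLY)
        except OSError as e:
            return ino, e.errno      # analogous to xfsdump's ESTALE warning
    return ino, 0


ino, err = race_demo()
print("open after unlink failed with ENOENT:", err == errno.ENOENT)
```

In the handle-based case the kernel translates "this handle no longer
maps to a live inode" into ESTALE, which is why the message mentions
NFS even though no NFS is involved.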

> in @ 4079 kB/s, out @ 4079 kB/s, 2040 kB total, buffer   0% fullxfsdump: 
> dumping non-directory files
> in @ 68.0 MB/s, out @ 68.0 MB/s, 1209 GB total, buffer   0% fullll
> in @  107 MB/s, out @  105 MB/s, 2200 GB total, buffer   0% fullxfsdump: 
> ending media file
> xfsdump: media file size 2362017678616 bytes
> xfsdump: dump size (non-dir files) : 2361953613176 bytes
> xfsdump: dump complete: 10926 seconds elapsed
> xfsdump: Dump Status: SUCCESS
> 
> Running xfs_db on the file system (finished dumping)
> a block get returns:

Just a reminder - you can't trust xfs_db output on a live mounted
filesystem....

> dir 1133368 block 0 extra leaf entry 5438b33d 79
> dir 1133368 block 0 extra leaf entry 6624beba 71
> dir 1133368 block 0 extra leaf entry 6d832f88 69
> dir 1133368 block 0 extra leaf entry e6279e2d 80
> dir ino 1133368 missing leaf entry for e627de2d/80
> dir ino 1133368 missing leaf entry for 7624beba/71
> dir ino 1133368 missing leaf entry for 5418b33d/79
> dir ino 1133368 missing leaf entry for 6d832f80/69

I'm not sure why the blockget thinks there are extra
entries in block 0 of the directory, but then says the
entries for the same hash indexes are missing.

I'd need a metadump of the filesystem to be able to look at it
directly...
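Something along these lines would do it (device name is a
placeholder; note that xfs_metadump obfuscates file names by default,
so your directory contents stay private):

```shell
# Unmount first (or work from a frozen snapshot) so the image is consistent.
umount /dev/sdXN                       # placeholder device

# -g shows progress; filenames are obfuscated by default.
# The dump contains metadata only, so it compresses well.
xfs_metadump -g /dev/sdXN /tmp/fs.metadump
bzip2 /tmp/fs.metadump

# At the other end it gets restored (after decompressing) with:
#   xfs_mdrestore fs.metadump fs.img
```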

> xfs_repair -n now shows:
.....
> Phase 6 - check inode connectivity...
>        - traversing filesystem ...
> entry "10 Otome ha DO MY BEST desho¿ (off vocal).flac" (ino 2359102) in dir 
> 1133368 is a duplicate name, would junk entry
> entry "06 Otome ha DO MY BEST desho¿ Otome ver..flac" (ino 2359100) in dir 
> 1133368 is a duplicate name, would junk entry
> entry "05 Otome ha DO MY BEST desho¿ Hime ver..flac" (ino 2359099) in dir 
> 1133368 is a duplicate name, would junk entry
> entry "04 Otome ha DO MY BEST desho¿ 2007ver..flac" (ino 2359086) in dir 
> 1133368 is a duplicate name, would junk entry
....

Every single filename has some special character in it. Of course,
my question is why there are two copies of the same name in the
directory. Was the file created twice? How did these files get
created? If you just copy them, does the destination directory end
up corrupted?

> It would appear that 2.6.34 might have some problems in it?

I don't think we changed anything directory-related in XFS in
2.6.34, so I'm a little perplexed as to why this is suddenly all
happening. Did these problems only show up when you updated to
2.6.34, or can you reproduce them on an older kernel?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
