xfs file system in process of becoming corrupt; though xfs_repair thinks it's fine! ; -/ (was xfs_dump problem...)

Michael Monnerie michael.monnerie at is.it-management.at
Wed Jun 30 13:25:20 CDT 2010


On Wednesday, 30 June 2010 Linda Walsh wrote:
> But have another XFS problem that is much more reliably persistent.
> I don't know if they are at all related, but since I have this
>  problem that's a bit "stuck", it's easier to "reproduce".
 
I think my problem is similar. I have a Linux machine ("orion") running Samba. 
A Win7 client uses it to store its "Windows Backup". That part is OK.

From another Linux machine ("saturn"), I run an rsync via an rsync module, 
and I already have 4 versions where the ".vhd" file of that Windows Backup 
is destroyed on "saturn". So the corruption happens when rsync is started 
on saturn, copying orion->saturn; both use XFS.

As I cannot delete the broken files, I moved the whole directory away 
and ran rsync again. The same file was destroyed again on saturn.
Some days later, there were again 2 versions that were destroyed.

The difference from Linda's case is that I get:
drwx------+ 2 zmi  users     4096 Jun 12 03:15 ./
drwxr-xr-x  7 root root       154 Jun 30 04:00 ../
-rwx------+ 1 zmi  users 56640000 Jun 12 03:05 852c268f-cf1a-11de-b09b-806e6f6e6963.vhd*
??????????? ? ?    ?            ?            ? 852c2690-cf1a-11de-b09b-806e6f6e6963.vhd 

and in dmesg:
[125903.343714] Filesystem "dm-0": corrupt inode 649642 ((a)extents = 5).  Unmount and run xfs_repair.                                                                                                       
[125903.343735] ffff88011e34ca00: 49 4e 81 c0 02 02 00 00 00 00 03 e8 00 00 00 64  IN.............d                                                                                                          
[125903.343756] Filesystem "dm-0": XFS internal error xfs_iformat_extents(1) at line 558 of file /usr/src/packages/BUILD/kernel-desktop-2.6.31.12/linux-2.6.31/fs/xfs/xfs_inode.c.  Caller 0xffffffffa032c0ad
[125903.343763]                                                                                                                                                                                              
[125903.343791] Pid: 17696, comm: ls Not tainted 2.6.31.12-0.2-desktop #1                                                                                                                                    
[125903.343803] Call Trace:                                                                                                                                                                                  
[125903.343821]  [<ffffffff81011a19>] try_stack_unwind+0x189/0x1b0                                                                                                                                           
[125903.343840]  [<ffffffff8101025d>] dump_trace+0xad/0x3a0                                                                                                                                                  
[125903.343858]  [<ffffffff81011524>] show_trace_log_lvl+0x64/0x90                                                                                                                                           
[125903.343876]  [<ffffffff81011573>] show_trace+0x23/0x40                                                                                                                                                   
[125903.343894]  [<ffffffff81552b46>] dump_stack+0x81/0x9e                                                                                                                                                   
[125903.343947]  [<ffffffffa0321b4a>] xfs_error_report+0x5a/0x70 [xfs]                                                                                                                                       
[125903.344085]  [<ffffffffa0321bcc>] xfs_corruption_error+0x6c/0x90 [xfs]                                                                                                                                   
[125903.344248]  [<ffffffffa032bb84>] xfs_iformat_extents+0x234/0x280 [xfs]                                                                                                                                  
[125903.344409]  [<ffffffffa032c0ad>] xfs_iformat+0x28d/0x5a0 [xfs]                                                                                                                                          
[125903.344569]  [<ffffffffa032c542>] xfs_iread+0x182/0x1c0 [xfs]                                                                                                                                            
[125903.344729]  [<ffffffffa0327938>] xfs_iget_cache_miss+0x78/0x250 [xfs]                                                                                                                                   
[125903.344882]  [<ffffffffa0327c3c>] xfs_iget+0x12c/0x1b0 [xfs]                                                                                                                                             
[125903.345036]  [<ffffffffa0347b8e>] xfs_lookup+0xce/0x100 [xfs]                                                                                                                                            
[125903.345256]  [<ffffffffa0354e6c>] xfs_vn_lookup+0x6c/0xc0 [xfs]                                                                                                                                          
[125903.345453]  [<ffffffff81157782>] real_lookup+0x102/0x180                                                                                                                                                
[125903.345473]  [<ffffffff811598c0>] do_lookup+0xd0/0x100                                                                                                                                                   
[125903.345491]  [<ffffffff81159e12>] __link_path_walk+0x522/0x880                                                                                                                                           
[125903.345510]  [<ffffffff8115a6f6>] path_walk+0x66/0xd0                                                                                                                                                    
[125903.345528]  [<ffffffff8115a7cb>] do_path_lookup+0x6b/0xb0                                                                                                                                               
[125903.345546]  [<ffffffff8115a9d1>] user_path_at+0x61/0xc0                                                                                                                                                 
[125903.345565]  [<ffffffff811514d1>] vfs_fstatat+0x41/0x90                                                                                                                                                  
[125903.345584]  [<ffffffff811515ac>] vfs_lstat+0x2c/0x50                                                                                                                                                    
[125903.345602]  [<ffffffff811515fe>] sys_newlstat+0x2e/0x70                                                                                                                                                 
[125903.345621]  [<ffffffff8100c682>] system_call_fastpath+0x16/0x1b                                                                                                                                         
[125903.345645]  [<00007f72dc451e65>] 0x7f72dc451e65
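
For anyone reading the hex dump line above: the 16 bytes printed by the kernel can be decoded by hand. Below is a minimal Python sketch, assuming those bytes are the start of the on-disk inode core in big-endian order (magic, mode, version, format, onlink, uid, gid). The field layout and the format-code names are my assumption from the XFS on-disk format, not something the log states, and the function name is made up for illustration.

```python
import stat
import struct

# First 16 bytes of inode 649642 as printed in the dmesg hex dump.
DUMP = "49 4e 81 c0 02 02 00 00 00 00 03 e8 00 00 00 64"

# Assumed big-endian layout of the start of the on-disk inode core:
# magic (2 bytes), mode (2), version (1), format (1), onlink (2), uid (4), gid (4)
HEADER = struct.Struct(">2sHBBHII")

# Assumed meaning of the data fork format codes.
FORMATS = {0: "dev", 1: "local", 2: "extents", 3: "btree", 4: "uuid"}


def decode_inode_header(hex_dump):
    """Decode the leading inode core fields from a space-separated hex dump."""
    raw = bytes.fromhex(hex_dump)
    magic, mode, version, fmt, onlink, uid, gid = HEADER.unpack(raw[:HEADER.size])
    return {
        "magic": magic,
        "mode": stat.filemode(mode),      # same notation that ls -l uses
        "version": version,
        "format": FORMATS.get(fmt, "unknown"),
        "onlink": onlink,
        "uid": uid,
        "gid": gid,
    }


for name, value in decode_inode_header(DUMP).items():
    print(f"{name:8} {value}")
```

Under these assumptions the leading fields decode to plausible values (magic "IN", a regular file with mode rwx------, extents format), so these bytes alone do not show what is wrong; the kernel message itself only names the extent count.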

Running "xfs_repair -n" seems to find errors; see attachment "repair1.log".
Running "xfs_repair" crashes; see attachment "repair2.log".

Saturn's kernel is 2.6.31.12-0.2-desktop from openSUSE 11.2; 
xfs_repair is 3.1.2 (I also tried several older versions, down to 3.0.1, all without success).

Even after xfs_metadump and xfs_mdrestore the error is still there, and it 
cannot be repaired, because xfs_repair crashes.
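
To make the sequence explicit, here is a rough Python sketch that only builds and prints the three commands (it does not run them). The device and file paths are hypothetical placeholders, not the ones I actually used; "-n" is the no-modify mode and "-f" tells xfs_repair that the target is a regular file rather than a block device.

```python
import shlex


def metadump_repair_commands(device, dump_path, image_path):
    """Return the dump -> restore -> check commands as argument lists."""
    return [
        # Copy only the filesystem metadata from the device into a dump file.
        ["xfs_metadump", device, dump_path],
        # Rebuild a filesystem image from that metadata dump.
        ["xfs_mdrestore", dump_path, image_path],
        # Check the restored image without modifying it.
        ["xfs_repair", "-n", "-f", image_path],
    ]


# Hypothetical paths, for illustration only.
COMMANDS = metadump_repair_commands(
    "/dev/dm-0", "/tmp/saturn.metadump", "/tmp/saturn.img"
)

for argv in COMMANDS:
    print(shlex.join(argv))
```

Working on a restored image like this keeps the experiments away from the real device.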

I've put up a new metadump containing only the broken parts for public review:
http://zmi.at/saturn_bigdata.metadump.only_broken.bz2 (197 MB)

What should I do, apart from wiping the whole filesystem and copying everything anew? 
The problem is that it would be destroyed again, just as it has been 4 times now.

-- 
Kind regards,
Michael Monnerie, Ing. BSc

it-management Internet Services
http://proteger.at [pronounced: Prot-e-schee]
Tel: 0660 / 415 65 31

// We currently have two houses for sale:
// http://zmi.at/langegg/
// http://zmi.at/haus2009/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: repair1.log
Type: text/x-log
Size: 1919 bytes
Desc: not available
URL: <http://oss.sgi.com/pipermail/xfs/attachments/20100630/8e7df9b5/attachment.log>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: repair2.log
Type: text/x-log
Size: 1597 bytes
Desc: not available
URL: <http://oss.sgi.com/pipermail/xfs/attachments/20100630/8e7df9b5/attachment-0001.log>
