
xfs_repair hung...safe to terminate?

To: linux-xfs <linux-xfs@xxxxxxxxxxx>
Subject: xfs_repair hung...safe to terminate?
From: Jon Lewis <jlewis@xxxxxxxxx>
Date: Wed, 15 Jun 2005 20:24:22 -0400 (EDT)
Sender: linux-xfs-bounce@xxxxxxxxxxx

After having a system crash twice today with messages like (from the first
crash):

xfs_iget_core: ambiguous vns: vp/0xc6f0e680, invp/0xecbed200
------------[ cut here ]------------
kernel BUG at debug.c:106!
invalid operand: 0000
nfsd lockd sunrpc autofs eepro100 mii ipt_REJECT iptable_filter ip_tables
xfs raid5 xor ext3 jbd raid1 isp_mod sd_mod scsi_mod
CPU:    1
EIP:    0010:[<f8dbf16e>]    Not tainted
EFLAGS: 00010246

EIP is at cmn_err [xfs] 0x9e (2.4.20-35_39.rh8.0.atsmp)
eax: 00000000   ebx: 00000000   ecx: 00000096   edx: 00000001
esi: f8dd9412   edi: f8dec63e   ebp: 00000293   esp: f5d2bd44
ds: 0018   es: 0018   ss: 0018
Process nfsd (pid: 661, stackpage=f5d2b000)
Stack: f8dd9412 f8dd93e8 f8dec600 ecbed220 7b1f202d 00000000 e4cca100 f8d8aeac
       00000000 f8dda160 c6f0e680 ecbed200 f65d0c00 7b1f202d f7bfcc38 c62aea90
       f65d0924 00000000 00000003 c62aea8c 00000000 00000000 e4cca100 ecbed220
Call Trace:   [<f8dd9412>] .rodata.str1.1 [xfs] 0x11c2 (0xf5d2bd44))
[<f8dd93e8>] .rodata.str1.1 [xfs] 0x1198 (0xf5d2bd48))
[<f8dec600>] message [xfs] 0x0 (0xf5d2bd4c))
[<f8d8aeac>] xfs_iget_core [xfs] 0x45c (0xf5d2bd60))
[<f8dda160>] .rodata.str1.32 [xfs] 0x5a0 (0xf5d2bd68))
[<f8d8b0c3>] xfs_iget [xfs] 0x143 (0xf5d2bdb0))
[<f8da8247>] xfs_vget [xfs] 0x77 (0xf5d2bdf0))
[<f8dbe563>] vfs_vget [xfs] 0x43 (0xf5d2be20))
[<f8dbdc9d>] linvfs_fh_to_dentry [xfs] 0x5d (0xf5d2be30))
[<f8e3a8c6>] nfsd_get_dentry [nfsd] 0xb6 (0xf5d2be5c))
[<f8e3ad17>] find_fh_dentry [nfsd] 0x57 (0xf5d2be80))
[<f8e3b1b9>] fh_verify [nfsd] 0x189 (0xf5d2beb0))
[<f8e19616>] svc_sock_enqueue [sunrpc] 0x1b6 (0xf5d2befc))
[<f8e42bdf>] nfsd3_proc_getattr [nfsd] 0x6f (0xf5d2bf10))
[<f8e44a93>] nfs3svc_decode_fhandle [nfsd] 0x33 (0xf5d2bf28))
[<f8e4b384>] nfsd_procedures3 [nfsd] 0x24 (0xf5d2bf3c))
[<f8e3863e>] nfsd_dispatch [nfsd] 0xce (0xf5d2bf48))
[<f8e4ac98>] nfsd_version3 [nfsd] 0x0 (0xf5d2bf5c))
[<f8e38570>] nfsd_dispatch [nfsd] 0x0 (0xf5d2bf60))
[<f8e1927f>] svc_process_Rsmp_9d8bc81a [sunrpc] 0x45f (0xf5d2bf64))
[<f8e4b384>] nfsd_procedures3 [nfsd] 0x24 (0xf5d2bf84))
[<f8e4acb8>] nfsd_program [nfsd] 0x0 (0xf5d2bf88))
[<f8e38404>] nfsd [nfsd] 0x224 (0xf5d2bfa4))
[<c010758e>] arch_kernel_thread [kernel] 0x2e (0xf5d2bff0))
[<f8e381e0>] nfsd [nfsd] 0x0 (0xf5d2bff8))


Code: 0f 0b 6a 00 08 94 dd f8 83 c4 0c 5b 5e 5f 5d c3 89 f6 55 b8
 <5>xfs_force_shutdown(md(9,2),0x8) called from line 1071 of file xfs_trans.c.  Return address = 0xf8dbe6eb
Filesystem "md(9,2)": Corruption of in-memory data detected.  Shutting down filesystem: md(9,2)
Please umount the filesystem, and rectify the problem(s)

I figured it'd be a good idea to xfs_repair it.  That was a little more
than 4 hours ago.  The fs is a software RAID5:
md2 : active raid5 sdn2[13] sdg2[12] sdm2[11] sdl2[10] sdk2[9] sdj2[8] sdi2[7] sdh2[6] sdf2[5] sde2[4] sdd2[3] sdc2[2] sdb2[1] sda2[0]
      385414656 blocks level 5, 64k chunk, algorithm 2 [12/12] [UUUUUUUUUUUU]
md0 : active raid1 sdn1[1] sdg1[0]
      803136 blocks [2/2] [UU]

xfs_repair [version 2.6.9] has gotten to:

Phase 5 - rebuild AG headers and trees...

and seems to have stopped progressing.

root       798 91.8  1.0 45080 41576 pts/1   R    15:57 242:04 xfs_repair -l /dev/md0 /dev/md2

It's still using lots of CPU, but there is no disk activity.  Further
searching suggests this might be a kernel issue and not an actual fs
corruption issue.  I'd like to upgrade from 2.4.20-35_39.rh8.0.atsmp to
2.4.20-43_41.rh8.0.atsmp, but the question is, is it safe to stop (kill)
xfs_repair?  Will the fs be mountable if I interrupt xfs_repair at this
point?

----------------------------------------------------------------------
 Jon Lewis                   |  I route
 Senior Network Engineer     |  therefore you are
 Atlantic Net                |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________

