Bugzilla – Bug 359
apparent race condition with NFS causes xfs_forced_shutdown
Last modified: 2004-09-13 16:31:54 CDT
At a customer site they were experiencing periodic xfs_forced_shutdowns (in-memory corruption detected). After eliminating hardware as a possibility we started looking at software causes. The configuration is as follows: SMP 2x2.4GHz CPU Dell PE4600 server connected to Adaptec Sanbloc RAIDs kernel is a patched up 2.4.20+contemporary XFS (1.2-ish). With some investigation our customer found that 2 NFS clients moving the same filename around causes the crash After sprinkling some printks in the kernel it appears that the is_bad_inode check in xfs_iget is failing and returning EIO An example of one of these backtraces is: Sep 2 08:13:23 sh15 kernel: xfs_force_shutdown(lvm(58,0),0x8) called from line 1051 of file xfs_trans.c. Return address = 0xc01ff9d9 Sep 2 08:13:23 sh15 kernel: XFS: Transforming an alert into a BUG. Sep 2 08:13:23 sh15 kernel: Filesystem "lvm(58,0)": Corruption of in-memory data detected. Shutting down filesystem: lvm(58,0) Sep 2 08:13:23 sh15 kernel: kernel BUG at debug.c:126! Sep 2 08:13:23 sh15 kernel: invalid operand: 0000 Sep 2 08:13:23 sh15 kernel: dvsdriver esm e1000 tg3 e100 bonding usb-ohci usbcore lvm-mod mptscsih mptctl isense mptbase rtc Sep 2 08:13:23 sh15 kernel: CPU: 0 Sep 2 08:13:23 sh15 kernel: EIP: 0010:[<c0215d15>] Tainted: P Sep 2 08:13:23 sh15 kernel: EFLAGS: 00010246 Sep 2 08:13:23 sh15 kernel: EIP is at icmn_err+0x85/0x95 [kernel] Sep 2 08:13:23 sh15 kernel: eax: 00000067 ebx: 00000000 ecx: 00000001 edx: c0445414 Sep 2 08:13:23 sh15 kernel: esi: c037d161 edi: c03511b0 ebp: ed62bce4 esp: ed62bcd4 Sep 2 08:13:23 sh15 kernel: ds: 0018 es: 0018 ss: 0018 Sep 2 08:13:23 sh15 kernel: Process nfsd (pid: 1326, stackpage=ed62b000) Sep 2 08:13:23 sh15 kernel: Stack: 00000293 ea423580 0000005e c035e320 ed62bd1c c01e6254 00000000 ea423580 Sep 2 08:13:23 sh15 kernel: ed62bd58 ea423580 c035076e eee0be80 c035e320 0000005e 00000001 c035e320 Sep 2 08:13:23 sh15 kernel: 00000000 00000008 ed62bd3c c01e62e1 00000000 eee5b400 c035e320 ed62bd58 Sep 2 08:13:23 sh15 kernel: Call Trace: Sep 2 08:13:23 sh15 kernel: [<c01e6254>] xfs_fs_vcmn_err+0x54/0x70 [kernel] Sep 2 08:13:23 sh15 kernel: [<c01e62e1>] xfs_cmn_err+0x51/0x60 [kernel] Sep 2 08:13:23 sh15 kernel: [<c02099a0>] xfs_do_force_shutdown+0xc0/0xe0 [kernel] Sep 2 08:13:23 sh15 kernel: [<c01ff9d9>] xfs_trans_cancel+0x59/0xd0 [kernel] Sep 2 08:13:23 sh15 kernel: [<c0205efb>] xfs_create+0x57b/0x620 [kernel] Sep 2 08:13:23 sh15 kernel: [<c0211cff>] linvfs_mknod+0x12f/0x260 [kernel] Sep 2 08:13:23 sh15 kernel: [<c0211e46>] linvfs_create+0x16/0x20 [kernel] Sep 2 08:13:23 sh15 kernel: [<c014936e>] vfs_create+0x11e/0x180 [kernel] Sep 2 08:13:23 sh15 kernel: [<c0198932>] nfsd_create_v3+0x292/0x400 [kernel] Sep 2 08:13:23 sh15 kernel: [<c019d904>] nfsd3_proc_create+0x144/0x160 [kernel] Sep 2 08:13:23 sh15 kernel: [<c0193f58>] nfsd_dispatch+0xb8/0x17c [kernel] Sep 2 08:13:23 sh15 kernel: [<c032329b>] svc_process+0x2cb/0x560 [kernel] Sep 2 08:13:23 sh15 kernel: [<c0193d39>] nfsd+0x239/0x3a0 [kernel] Sep 2 08:13:23 sh15 kernel: [<c0193b00>] nfsd+0x0/0x3a0 [kernel] Sep 2 08:13:23 sh15 kernel: [<c0107b96>] kernel_thread+0x26/0x40 [kernel] Sep 2 08:13:23 sh15 kernel: [<c0193b00>] nfsd+0x0/0x3a0 [kernel] Sep 2 08:13:23 sh15 kernel: Sep 2 08:13:23 sh15 kernel: Code: 0f 0b 7e 00 b4 11 35 c0 8d 65 f4 5b 5e 5f 5d c3 80 3d 84 d4
Created attachment 137 [details] script to reproduce Running the following script on 2 NFS clients about 20 times will excercise this bug usually within a minute:
Also possibly related... we occasionally get xfs_iget_core: ambiguous vns messages....
Also happens with 2.4.27 although it takes longer.
Also happens in Fedora test kernel 2.6.8-1.541smp although in a slightly differnt way: Sep 7 07:27:37 sh11 kernel: inode_doinit_with_dentry: getxattr returned 5 for dev=sdb6 ino=4195941 Sep 7 07:27:37 sh11 kernel: post_create: setxattr failed, rc=5 (dev=sdb6 ino=4195941) Sep 7 07:27:37 sh11 kernel: xfs_iget_core: ambiguous vns: vp/0x7cc014a0, invp/0x6cbf0798 Sep 7 07:27:37 sh11 kernel: ------------[ cut here ]------------ Sep 7 07:27:37 sh11 kernel: kernel BUG at fs/xfs/support/debug.c:106! Sep 7 07:27:37 sh11 kernel: invalid operand: 0000 [#1] Sep 7 07:27:37 sh11 kernel: SMP Sep 7 07:27:37 sh11 kernel: Modules linked in: nfs nfsd exportfs lockd xfs md5 ipv6 autofs4 sunrpc button battery asus_acpi ac ohci_hcd cfi_probe gen_probe scb2_flash mtdcore chipreg map_funcs tg3 e1000 floppy ext3 jbd dm_mod aic7xxx mptscsih mptbase sd_mod scsi_mod Sep 7 07:27:37 sh11 kernel: CPU: 2 Sep 7 07:27:37 sh11 kernel: EIP: 0060:[<82cb5ee6>] Not tainted VLI Sep 7 07:27:37 sh11 kernel: EFLAGS: 00010246 (2.6.8-1.541smp) Sep 7 07:27:37 sh11 kernel: EIP is at cmn_err+0xd9/0xe8 [xfs] Sep 7 07:27:37 sh11 kernel: eax: 00000000 ebx: 00000000 ecx: 7ced1af8 edx: 82cbae00 Sep 7 07:27:37 sh11 kernel: esi: 82cbaea8 edi: 82ccbcde ebp: 00000293 esp: 7ced1b00 Sep 7 07:27:37 sh11 kernel: ds: 007b es: 007b ss: 0068 Sep 7 07:27:37 sh11 kernel: Process nfsd (pid: 4952, threadinfo=7ced1000 task=7d137770) Sep 7 07:27:37 sh11 kernel: Stack: 00000000 00000000 6cbf07bc 7cecdb00 00400665 00000000 82c8d1b2 00000000 Sep 7 07:27:37 sh11 kernel: 82cb82b1 7cc014a0 6cbf0798 6cbf07bc 00000000 7fb77400 6cbf0798 673b84ac Sep 7 07:27:37 sh11 kernel: 00000000 00000000 6cbf07bc 023c00c0 7ced1000 00000008 82c8d67f 00400665 Sep 7 07:27:37 sh11 kernel: Call Trace: Sep 7 07:27:37 sh11 kernel: [<82c8d1b2>] xfs_iget_core+0x146/0x58b [xfs] Sep 7 07:27:37 sh11 kernel: [<82c8d67f>] xfs_iget+0x88/0x139 [xfs] Sep 7 07:27:37 sh11 kernel: [<82ca6a51>] xfs_vget+0x34/0x9e [xfs] Sep 7 07:27:37 sh11 kernel: [<82cb528f>] vfs_vget+0x1a/0x1d [xfs] Sep 7 07:27:37 sh11 kernel: [<82cb4f2c>] linvfs_get_dentry+0x3b/0x6c [xfs] Sep 7 07:27:37 sh11 kernel: [<829b802d>] find_exported_dentry+0x2d/0x7e8 [exportfs] Sep 7 07:27:37 sh11 kernel: [<022d9168>] schedule+0x96c/0x9f7 Sep 7 07:27:37 sh11 kernel: [<829b8084>] find_exported_dentry+0x84/0x7e8 [exportfs] Sep 7 07:27:37 sh11 kernel: [<02280167>] __lock_sock+0xdc/0xe4 Sep 7 07:27:37 sh11 kernel: [<021c79f1>] __copy_to_user_ll+0x3f/0x46 Sep 7 07:27:37 sh11 kernel: [<0215a095>] put_user_size+0x1c/0x2d Sep 7 07:27:37 sh11 kernel: [<0228292c>] memcpy_toiovec+0x27/0x47 Sep 7 07:27:37 sh11 kernel: [<02282f6f>] skb_copy_datagram_iovec+0x111/0x1e1 Sep 7 07:27:37 sh11 kernel: [<0228089e>] release_sock+0xa5/0xab Sep 7 07:27:37 sh11 kernel: [<022a44a7>] tcp_recvmsg+0x61e/0x659 Sep 7 07:27:37 sh11 kernel: [<02280983>] sock_common_recvmsg+0x30/0x46 Sep 7 07:27:38 sh11 kernel: [<0227d74e>] sock_recvmsg+0xef/0x10c Sep 7 07:27:38 sh11 kernel: [<0211cab3>] recalc_task_prio+0x128/0x133 Sep 7 07:27:38 sh11 kernel: [<0211cb46>] activate_task+0x88/0x95 Sep 7 07:27:38 sh11 kernel: [<829b8add>] export_decode_fh+0x61/0x6d [exportfs] Sep 7 07:27:38 sh11 kernel: [<82bce9ec>] nfsd_acceptable+0x0/0x112 [nfsd] Sep 7 07:27:38 sh11 kernel: [<829b8a7c>] export_decode_fh+0x0/0x6d [exportfs] Sep 7 07:27:38 sh11 kernel: [<82bcee38>] fh_verify+0x33a/0x4ad [nfsd] Sep 7 07:27:38 sh11 kernel: [<82bce9ec>] nfsd_acceptable+0x0/0x112 [nfsd] Sep 7 07:27:38 sh11 kernel: [<82bd05c3>] nfsd_open+0x1d/0x142 [nfsd] Sep 7 07:27:38 sh11 kernel: [<82bd0c0a>] nfsd_write+0x25/0x289 [nfsd] Sep 7 07:27:38 sh11 kernel: [<82b3a1b3>] svcauth_unix_accept+0x207/0x27a [sunrpc] Sep 7 07:27:38 sh11 kernel: [<82bd6662>] nfsd3_proc_write+0xbf/0xd5 [nfsd] Sep 7 07:27:38 sh11 kernel: [<82bd7cd5>] nfs3svc_decode_writeargs+0x0/0x15a [nfsd] Sep 7 07:27:38 sh11 kernel: [<82bcd751>] nfsd_dispatch+0xba/0x15d [nfsd] Sep 7 07:27:38 sh11 kernel: [<82b3684d>] svc_process+0x331/0x56c [sunrpc] Sep 7 07:27:38 sh11 kernel: [<82bcd4b7>] nfsd+0x274/0x454 [nfsd] Sep 7 07:27:38 sh11 kernel: [<82bcd243>] nfsd+0x0/0x454 [nfsd] Sep 7 07:27:38 sh11 kernel: [<82bcd243>] nfsd+0x0/0x454 [nfsd] Sep 7 07:27:38 sh11 kernel: [<021041f1>] kernel_thread_helper+0x5/0xb Sep 7 07:27:38 sh11 kernel: Code: b2 01 74 08 0f 0b 6e 00 8f ae cb 82 0f b6 05 04 97 cc 82 84 c0 7e 08 0f 0b 6f 00 8f ae cb 82 86 15 04 97 cc 82 55 9d 85 db 75 08 <0f> 0b 6a 00 6f ae cb 82 58 5a 5b 5e 5f 5d c3 55 57 89 d7 56 89 and the filesystem does not shut down...
Created attachment 138 [details] wrapper script for the crash_it script previously posted This script does some simnple setup and then calls the crash_it script (previously attached) to easily reproduce the problem. Under 20 minutes in all cases but well under a minute in the original kernel case. For more robust kernels you simply need to turn up the number of iterations.
Also managed to crash the the XFS 1.3.1 redhat9 based kernel on ftp.oss.sgi.com. kernel-smp-2.4.20-20.9.XFS1.3.1.i686.rpm. Trying to figure out how to use xfsidbg to see if the vnode tracing stuff can show something.....
Tried to extract more info. using xfsidbg so I enabled VNODE_TRACING in there. Unfortunately I don't yet understand what I'm looking at but it looks like this when it crashes. This is taken from the vobj pointer in the behavior object passed to xfs_create: [0]kdb> vn 0xebfe2c80 --> Inode @ 0xebfe2ca0 i_ino = 19328 i_count = 2 i_dev = 0x805 i_size 4096 i_mode = 0x41ff i_nlink = 4 i_rdev = 0x0 i_state = 0x7 i_hash.nxt = 0xc19912a8 i_hash.prv = 0xc19912a8 i_list.nxt = 0xee6a7c64 i_list.prv = 0xebf5b128 i_dentry.nxt = 0xec5d9830 i_dentry.prv = 0xec5d9830 i_sb = 0xee6a7c00 i_op = 0xc044e880 i_data = 0xebfe2d54 nrpages = 0 vnode ptr 0xebfe2c80 --> Vnode @ 0xebfe2c80 vnode: 0xebfe2c80 type VDIR v_inode 0xebfe2ca0 v_bh->bh_first 0xec1855c8 pobj 0xec1855b0 ops xfs_vnodeops flag = 0x0 <> v_trace 0xec445a44 --> Vntrace @ 0xebfe2c80/0xec445a44 entry to xfs_access i_count = 2 cpu = 1 pid = 3371 flag = 0x0 <> ra = linvfs_permission+0x1c entry to xfs_access i_count = 2 cpu = 1 pid = 3371 flag = 0x0 <> ra = linvfs_permission+0x1c entry to xfs_lookup i_count = 2 cpu = 1 pid = 3371 flag = 0x0 <> ra = linvfs_lookup+0x42 entry to xfs_dir_lookup_int i_count = 2 cpu = 1 pid = 3371 flag = 0x0 <> ra = xfs_lookup+0x5b [0]more> Only 'q' or 'Q' are processed at more prompt, input ignored entry to xfs_access i_count = 2 cpu = 0 pid = 3375 flag = 0x0 <> ra = linvfs_permission+0x1c entry to xfs_access i_count = 2 cpu = 0 pid = 3375 flag = 0x0 <> ra = linvfs_permission+0x1c entry to xfs_access i_count = 2 cpu = 0 pid = 3375 flag = 0x0 <> ra = linvfs_permission+0x1c entry to xfs_access i_count = 2 cpu = 0 pid = 3375 flag = 0x0 <> ra = linvfs_permission+0x1c entry to xfs_create i_count = 2 cpu = 0 pid = 3375 flag = 0x0 <> ra = linvfs_mknod+0x12f entry to xfs_access i_count = 2 cpu = 1 pid = 3373 flag = 0x0 <> ra = linvfs_permission+0x1c entry to xfs_access i_count = 2 cpu = 0 pid = 3368 flag = 0x0 <> ra = linvfs_permission+0x1c entry to xfs_access i_count = 2 cpu = 0 pid = 3376 flag = 0x0 <> ra = linvfs_permission+0x1c entry to xfs_access i_count = 2 cpu = 1 pid = 3365 flag = 0x0 <> ra = linvfs_permission+0x1c entry to xfs_access i_count = 2 cpu = 1 pid = 3370 flag = 0x0 <> ra = linvfs_permission+0x1c entry to xfs_access i_count = 2 cpu = 1 pid = 3374 flag = 0x0 <> ra = linvfs_permission+0x1c [0]more> Only 'q' or 'Q' are processed at more prompt, input ignored entry to xfs_access i_count = 2 cpu = 0 pid = 3369 flag = 0x0 <> ra = linvfs_permission+0x1c Not sure this is the correct vnode though seems to be VDIR....
output from xtp on failed transaction: [1]kdb> xtp 0xef0d7300 tp 0xef0d7300 type CREATE mount 0xeab14800 flags xtp 0x7 <dirty sb_dirty perm_log_res > callback 0xef0d7304 forw 0x00000000 back 0x00000000 log res 157880 block res 40 block res used 0 rt res 0 rt res used 0 ticket 0xe8310288 lsn [0:0] commit_lsn [0:0] callback 0x00000000 callarg 0x00000000 icount delta 0 ifree delta -1 blocks delta 0 res blocks delta 0 rt delta 0 res rt delta 0 ag freeblks delta 0 ag flist delta 0 ag btree delta 0 dblocks delta 0 agcount delta 0 imaxpct delta 0 rextsize delta 0 rbmblocks delta 0 rblocks delta 0 rextents delta 0 rextslog delta 0 dqinfo 0x00000000 log items: chunk 0 index 0 item 0xe851b364 size 0 flags lic 0x1 <dirty > type buf mountp 0xeab14800 flags log 0x1 <in ail > ail forw 0xe851b2d0 ail back 0xe92bd678 lsn [10:3995] desc ef0d73c0 ops 0xc0440200 iodonefunc &0xc01cba90 [1]more> buf 0xeaf43e80 recur 0 refcount 1 flags:dirty logged size 2 blkno 0x2 len 0x1 map size 1 map 0xe851b3b4 blf flags: chunk 0 index 1 item 0xe851b2d0 size 0 flags lic 0x1 <dirty > type buf mountp 0xeab14800 flags log 0x1 <in ail > ail forw 0xe8b6cbf0 ail back 0xe851b364 lsn [10:3995] desc ef0d73c8 ops 0xc0440200 iodonefunc &0xc01cba90 buf 0xeaf5b680 recur 0 refcount 1 flags:dirty logged size 2 blkno 0x18 len 0x8 map size 2 map 0xe851b320 blf flags: log busy free 31, list: Chunk 0 at 0xef0d744c next 0x00000000 free 0x7fffffff unused 0 00: ag 0 idx 0 01: ag 0 idx 0 02: ag 0 idx 0 03: ag 0 idx 0 04: ag 0 idx 0 05: ag 0 idx 0 06: ag 0 idx 0 07: ag 0 idx 0 08: ag 0 idx 0 [1]more> 09: ag 0 idx 0 10: ag 0 idx 0 11: ag 0 idx 0 12: ag 0 idx 0 13: ag 0 idx 0 14: ag 0 idx 0 15: ag 0 idx 0 16: ag 0 idx 0 17: ag 0 idx 0 18: ag 0 idx 0 19: ag 0 idx 0 20: ag 0 idx 0 21: ag 0 idx 0 22: ag 0 idx 0 23: ag 0 idx 0 24: ag 0 idx 0 25: ag 0 idx 0 26: ag 0 idx 0 27: ag 0 idx 0 28: ag 0 idx 0 29: ag 0 idx 0 30: ag 0 idx 0
reproduced the bug on the cvs tree from september 10 at 11:35. Time to failure running the ci_wrap over localhost nfs mount was 30 minutes using 4 mount moints.
*** This bug has been marked as a duplicate of 309 ***
Created attachment 139 [details] Patch to cover up race in xfs_iget This patch appears to address the original problem that we hit with EIO getting returned from xfs_iget after is_bad_inode. We have a similar patch for retrying to address the "ambiguous vns" but since Russell thinks its the same problem as 309 I won't post it here to cause confusion.