Bug 359 - apparent race condition with NFS causes xfs_forced_shutdown
: apparent race condition with NFS causes xfs_forced_shutdown
Status: RESOLVED DUPLICATE of bug 309
Product: XFS
Classification: Unclassified
Component: XFS kernel code
: 1.2.x
: Linux
: major
: ---
Assigned To: XFS power people
:
:
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2004-09-02 06:32 CDT by Greg Dickie
Modified: 2004-09-13 16:31 CDT (History)
2 users (show)

See Also:


Attachments
script to reproduce (384 bytes, text/plain)
2004-09-02 06:38 CDT, Greg Dickie
Details
wrapper script for the crash_it script previously posted (1.02 KB, text/plain)
2004-09-08 09:39 CDT, Greg Dickie
Details
Patch to cover up race in xfs_iget (284 bytes, patch)
2004-09-13 14:31 CDT, Greg Dickie
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Greg Dickie 2004-09-02 06:32:52 CDT
At a customer site they were experiencing periodic xfs_forced_shutdowns
(in-memory corruption detected). After eliminating hardware as a possibility we
started looking at software causes.
The configuration is as follows:
SMP 2x2.4GHz CPU Dell PE4600 server connected to Adaptec Sanbloc RAIDs
kernel is a patched up 2.4.20+contemporary XFS (1.2-ish). 

With some investigation our customer found that 2 NFS clients moving the same
filename around causes the crash

After sprinkling some printks in the kernel it appears that the is_bad_inode
check in xfs_iget is failing and returning EIO

An example of one of these backtraces is:

Sep  2 08:13:23 sh15 kernel: xfs_force_shutdown(lvm(58,0),0x8) called from line
1051 of file xfs_trans.c.  Return address = 0xc01ff9d9
Sep  2 08:13:23 sh15 kernel: XFS: Transforming an alert into a BUG.
Sep  2 08:13:23 sh15 kernel: Filesystem "lvm(58,0)": Corruption of in-memory
data detected.  Shutting down filesystem: lvm(58,0)
Sep  2 08:13:23 sh15 kernel: kernel BUG at debug.c:126!
Sep  2 08:13:23 sh15 kernel: invalid operand: 0000
Sep  2 08:13:23 sh15 kernel: dvsdriver esm e1000 tg3 e100 bonding usb-ohci
usbcore lvm-mod mptscsih mptctl isense mptbase rtc
Sep  2 08:13:23 sh15 kernel: CPU:    0
Sep  2 08:13:23 sh15 kernel: EIP:    0010:[<c0215d15>]    Tainted: P
Sep  2 08:13:23 sh15 kernel: EFLAGS: 00010246
Sep  2 08:13:23 sh15 kernel: EIP is at icmn_err+0x85/0x95 [kernel]
Sep  2 08:13:23 sh15 kernel: eax: 00000067   ebx: 00000000   ecx: 00000001  
edx: c0445414
Sep  2 08:13:23 sh15 kernel: esi: c037d161   edi: c03511b0   ebp: ed62bce4  
esp: ed62bcd4
Sep  2 08:13:23 sh15 kernel: ds: 0018   es: 0018   ss: 0018
Sep  2 08:13:23 sh15 kernel: Process nfsd (pid: 1326, stackpage=ed62b000)
Sep  2 08:13:23 sh15 kernel: Stack: 00000293 ea423580 0000005e c035e320 ed62bd1c
c01e6254 00000000 ea423580
Sep  2 08:13:23 sh15 kernel:        ed62bd58 ea423580 c035076e eee0be80 c035e320
0000005e 00000001 c035e320
Sep  2 08:13:23 sh15 kernel:        00000000 00000008 ed62bd3c c01e62e1 00000000
eee5b400 c035e320 ed62bd58
Sep  2 08:13:23 sh15 kernel: Call Trace:
Sep  2 08:13:23 sh15 kernel:  [<c01e6254>] xfs_fs_vcmn_err+0x54/0x70 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c01e62e1>] xfs_cmn_err+0x51/0x60 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c02099a0>] xfs_do_force_shutdown+0xc0/0xe0 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c01ff9d9>] xfs_trans_cancel+0x59/0xd0 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0205efb>] xfs_create+0x57b/0x620 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0211cff>] linvfs_mknod+0x12f/0x260 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0211e46>] linvfs_create+0x16/0x20 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c014936e>] vfs_create+0x11e/0x180 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0198932>] nfsd_create_v3+0x292/0x400 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c019d904>] nfsd3_proc_create+0x144/0x160 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0193f58>] nfsd_dispatch+0xb8/0x17c [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c032329b>] svc_process+0x2cb/0x560 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0193d39>] nfsd+0x239/0x3a0 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0193b00>] nfsd+0x0/0x3a0 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0107b96>] kernel_thread+0x26/0x40 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0193b00>] nfsd+0x0/0x3a0 [kernel]
Sep  2 08:13:23 sh15 kernel:
Sep  2 08:13:23 sh15 kernel: Code: 0f 0b 7e 00 b4 11 35 c0 8d 65 f4 5b 5e 5f 5d
c3 80 3d 84 d4
Comment 1 Greg Dickie 2004-09-02 06:38:21 CDT
Created attachment 137 [details]
script to reproduce

Running the following script on 2 NFS clients about 20 times will excercise
this bug usually within a minute:
Comment 2 Greg Dickie 2004-09-02 07:14:15 CDT
Also possibly related... we occasionally get 
xfs_iget_core: ambiguous vns 
messages....
Comment 3 Greg Dickie 2004-09-02 16:53:51 CDT
Also happens with 2.4.27 although it takes longer.
Comment 4 Greg Dickie 2004-09-07 07:54:59 CDT
Also happens in Fedora test kernel 2.6.8-1.541smp although in a slightly
differnt way:

Sep  7 07:27:37 sh11 kernel: inode_doinit_with_dentry:  getxattr returned 5 for
dev=sdb6 ino=4195941
Sep  7 07:27:37 sh11 kernel: post_create:  setxattr failed, rc=5 (dev=sdb6
ino=4195941)
Sep  7 07:27:37 sh11 kernel: xfs_iget_core: ambiguous vns: vp/0x7cc014a0,
invp/0x6cbf0798
Sep  7 07:27:37 sh11 kernel: ------------[ cut here ]------------
Sep  7 07:27:37 sh11 kernel: kernel BUG at fs/xfs/support/debug.c:106!
Sep  7 07:27:37 sh11 kernel: invalid operand: 0000 [#1]
Sep  7 07:27:37 sh11 kernel: SMP
Sep  7 07:27:37 sh11 kernel: Modules linked in: nfs nfsd exportfs lockd xfs md5
ipv6 autofs4 sunrpc button battery asus_acpi ac ohci_hcd cfi_probe gen_probe
scb2_flash mtdcore chipreg map_funcs tg3 e1000 floppy ext3 jbd dm_mod aic7xxx
mptscsih mptbase sd_mod scsi_mod
Sep  7 07:27:37 sh11 kernel: CPU:    2
Sep  7 07:27:37 sh11 kernel: EIP:    0060:[<82cb5ee6>]    Not tainted VLI
Sep  7 07:27:37 sh11 kernel: EFLAGS: 00010246   (2.6.8-1.541smp)
Sep  7 07:27:37 sh11 kernel: EIP is at cmn_err+0xd9/0xe8 [xfs]
Sep  7 07:27:37 sh11 kernel: eax: 00000000   ebx: 00000000   ecx: 7ced1af8  
edx: 82cbae00
Sep  7 07:27:37 sh11 kernel: esi: 82cbaea8   edi: 82ccbcde   ebp: 00000293  
esp: 7ced1b00
Sep  7 07:27:37 sh11 kernel: ds: 007b   es: 007b   ss: 0068
Sep  7 07:27:37 sh11 kernel: Process nfsd (pid: 4952, threadinfo=7ced1000
task=7d137770)
Sep  7 07:27:37 sh11 kernel: Stack: 00000000 00000000 6cbf07bc 7cecdb00 00400665
00000000 82c8d1b2 00000000
Sep  7 07:27:37 sh11 kernel:        82cb82b1 7cc014a0 6cbf0798 6cbf07bc 00000000
7fb77400 6cbf0798 673b84ac
Sep  7 07:27:37 sh11 kernel:        00000000 00000000 6cbf07bc 023c00c0 7ced1000
00000008 82c8d67f 00400665
Sep  7 07:27:37 sh11 kernel: Call Trace:
Sep  7 07:27:37 sh11 kernel:  [<82c8d1b2>] xfs_iget_core+0x146/0x58b [xfs]
Sep  7 07:27:37 sh11 kernel:  [<82c8d67f>] xfs_iget+0x88/0x139 [xfs]
Sep  7 07:27:37 sh11 kernel:  [<82ca6a51>] xfs_vget+0x34/0x9e [xfs]
Sep  7 07:27:37 sh11 kernel:  [<82cb528f>] vfs_vget+0x1a/0x1d [xfs]
Sep  7 07:27:37 sh11 kernel:  [<82cb4f2c>] linvfs_get_dentry+0x3b/0x6c [xfs]
Sep  7 07:27:37 sh11 kernel:  [<829b802d>] find_exported_dentry+0x2d/0x7e8
[exportfs]
Sep  7 07:27:37 sh11 kernel:  [<022d9168>] schedule+0x96c/0x9f7
Sep  7 07:27:37 sh11 kernel:  [<829b8084>] find_exported_dentry+0x84/0x7e8
[exportfs]
Sep  7 07:27:37 sh11 kernel:  [<02280167>] __lock_sock+0xdc/0xe4
Sep  7 07:27:37 sh11 kernel:  [<021c79f1>] __copy_to_user_ll+0x3f/0x46
Sep  7 07:27:37 sh11 kernel:  [<0215a095>] put_user_size+0x1c/0x2d
Sep  7 07:27:37 sh11 kernel:  [<0228292c>] memcpy_toiovec+0x27/0x47
Sep  7 07:27:37 sh11 kernel:  [<02282f6f>] skb_copy_datagram_iovec+0x111/0x1e1
Sep  7 07:27:37 sh11 kernel:  [<0228089e>] release_sock+0xa5/0xab
Sep  7 07:27:37 sh11 kernel:  [<022a44a7>] tcp_recvmsg+0x61e/0x659
Sep  7 07:27:37 sh11 kernel:  [<02280983>] sock_common_recvmsg+0x30/0x46
Sep  7 07:27:38 sh11 kernel:  [<0227d74e>] sock_recvmsg+0xef/0x10c
Sep  7 07:27:38 sh11 kernel:  [<0211cab3>] recalc_task_prio+0x128/0x133
Sep  7 07:27:38 sh11 kernel:  [<0211cb46>] activate_task+0x88/0x95
Sep  7 07:27:38 sh11 kernel:  [<829b8add>] export_decode_fh+0x61/0x6d [exportfs]
Sep  7 07:27:38 sh11 kernel:  [<82bce9ec>] nfsd_acceptable+0x0/0x112 [nfsd]
Sep  7 07:27:38 sh11 kernel:  [<829b8a7c>] export_decode_fh+0x0/0x6d [exportfs]
Sep  7 07:27:38 sh11 kernel:  [<82bcee38>] fh_verify+0x33a/0x4ad [nfsd]
Sep  7 07:27:38 sh11 kernel:  [<82bce9ec>] nfsd_acceptable+0x0/0x112 [nfsd]
Sep  7 07:27:38 sh11 kernel:  [<82bd05c3>] nfsd_open+0x1d/0x142 [nfsd]
Sep  7 07:27:38 sh11 kernel:  [<82bd0c0a>] nfsd_write+0x25/0x289 [nfsd]
Sep  7 07:27:38 sh11 kernel:  [<82b3a1b3>] svcauth_unix_accept+0x207/0x27a [sunrpc]
Sep  7 07:27:38 sh11 kernel:  [<82bd6662>] nfsd3_proc_write+0xbf/0xd5 [nfsd]
Sep  7 07:27:38 sh11 kernel:  [<82bd7cd5>] nfs3svc_decode_writeargs+0x0/0x15a [nfsd]
Sep  7 07:27:38 sh11 kernel:  [<82bcd751>] nfsd_dispatch+0xba/0x15d [nfsd]
Sep  7 07:27:38 sh11 kernel:  [<82b3684d>] svc_process+0x331/0x56c [sunrpc]
Sep  7 07:27:38 sh11 kernel:  [<82bcd4b7>] nfsd+0x274/0x454 [nfsd]
Sep  7 07:27:38 sh11 kernel:  [<82bcd243>] nfsd+0x0/0x454 [nfsd]
Sep  7 07:27:38 sh11 kernel:  [<82bcd243>] nfsd+0x0/0x454 [nfsd]
Sep  7 07:27:38 sh11 kernel:  [<021041f1>] kernel_thread_helper+0x5/0xb
Sep  7 07:27:38 sh11 kernel: Code: b2 01 74 08 0f 0b 6e 00 8f ae cb 82 0f b6 05
04 97 cc 82 84 c0 7e 08 0f 0b 6f 00 8f ae cb 82 86 15 04 97 cc 82 55 9d 85 db 75
08 <0f> 0b 6a 00 6f ae cb 82 58 5a 5b 5e 5f 5d c3 55 57 89 d7 56 89


and the filesystem does not shut down...
Comment 5 Greg Dickie 2004-09-08 09:39:17 CDT
Created attachment 138 [details]
wrapper script for the crash_it script previously posted

This script does some simnple setup and then calls the crash_it script
(previously attached) to easily reproduce the problem. Under 20 minutes in all
cases but well under a minute in the original kernel case. For more robust
kernels you simply need to turn up the number of iterations.
Comment 6 Greg Dickie 2004-09-08 09:41:51 CDT
Also managed to crash the the XFS 1.3.1 redhat9 based kernel on ftp.oss.sgi.com.
kernel-smp-2.4.20-20.9.XFS1.3.1.i686.rpm.

Trying to figure out how to use xfsidbg to see if the vnode tracing stuff can
show something.....
Comment 7 Greg Dickie 2004-09-09 10:00:56 CDT
Tried to extract more info. using xfsidbg so I enabled VNODE_TRACING in there.
Unfortunately I don't yet understand what I'm looking at but it looks like this
when it crashes. This is taken from the vobj pointer in the behavior object
passed to xfs_create:

[0]kdb> vn 0xebfe2c80
--> Inode @ 0xebfe2ca0
 i_ino = 19328 i_count = 2 i_dev = 0x805 i_size 4096
 i_mode = 0x41ff         i_nlink = 4  i_rdev = 0x0 i_state = 0x7
 i_hash.nxt = 0xc19912a8 i_hash.prv = 0xc19912a8
 i_list.nxt = 0xee6a7c64 i_list.prv = 0xebf5b128
 i_dentry.nxt = 0xec5d9830 i_dentry.prv = 0xec5d9830
 i_sb = 0xee6a7c00 i_op = 0xc044e880 i_data = 0xebfe2d54 nrpages = 0
  vnode ptr 0xebfe2c80
--> Vnode @ 0xebfe2c80
vnode: 0xebfe2c80 type VDIR
   v_inode 0xebfe2ca0 v_bh->bh_first 0xec1855c8 pobj 0xec1855b0
   ops xfs_vnodeops flag = 0x0 <> 
   v_trace 0xec445a44
--> Vntrace @ 0xebfe2c80/0xec445a44
entry to xfs_access i_count = 2
  cpu = 1 pid = 3371 flag = 0x0 <>   ra = linvfs_permission+0x1c
entry to xfs_access i_count = 2
  cpu = 1 pid = 3371 flag = 0x0 <>   ra = linvfs_permission+0x1c
entry to xfs_lookup i_count = 2
  cpu = 1 pid = 3371 flag = 0x0 <>   ra = linvfs_lookup+0x42
entry to xfs_dir_lookup_int i_count = 2
  cpu = 1 pid = 3371 flag = 0x0 <>   ra = xfs_lookup+0x5b
[0]more>  
Only 'q' or 'Q' are processed at more prompt, input ignored
entry to xfs_access i_count = 2
  cpu = 0 pid = 3375 flag = 0x0 <>   ra = linvfs_permission+0x1c
entry to xfs_access i_count = 2
  cpu = 0 pid = 3375 flag = 0x0 <>   ra = linvfs_permission+0x1c
entry to xfs_access i_count = 2
  cpu = 0 pid = 3375 flag = 0x0 <>   ra = linvfs_permission+0x1c
entry to xfs_access i_count = 2
  cpu = 0 pid = 3375 flag = 0x0 <>   ra = linvfs_permission+0x1c
entry to xfs_create i_count = 2
  cpu = 0 pid = 3375 flag = 0x0 <>   ra = linvfs_mknod+0x12f
entry to xfs_access i_count = 2
  cpu = 1 pid = 3373 flag = 0x0 <>   ra = linvfs_permission+0x1c
entry to xfs_access i_count = 2
  cpu = 0 pid = 3368 flag = 0x0 <>   ra = linvfs_permission+0x1c
entry to xfs_access i_count = 2
  cpu = 0 pid = 3376 flag = 0x0 <>   ra = linvfs_permission+0x1c
entry to xfs_access i_count = 2
  cpu = 1 pid = 3365 flag = 0x0 <>   ra = linvfs_permission+0x1c
entry to xfs_access i_count = 2
  cpu = 1 pid = 3370 flag = 0x0 <>   ra = linvfs_permission+0x1c
entry to xfs_access i_count = 2
  cpu = 1 pid = 3374 flag = 0x0 <>   ra = linvfs_permission+0x1c
[0]more>  
Only 'q' or 'Q' are processed at more prompt, input ignored
entry to xfs_access i_count = 2
  cpu = 0 pid = 3369 flag = 0x0 <>   ra = linvfs_permission+0x1c


Not sure this is the correct vnode though seems to be VDIR....
Comment 8 Greg Dickie 2004-09-10 13:34:27 CDT
output from xtp on failed transaction:

[1]kdb> xtp 0xef0d7300
tp 0xef0d7300 type CREATE mount 0xeab14800
flags xtp 0x7 <dirty sb_dirty perm_log_res > 
callback 0xef0d7304 forw 0x00000000 back 0x00000000
log res 157880 block res 40 block res used 0
rt res 0 rt res used 0
ticket 0xe8310288 lsn [0:0] commit_lsn [0:0]
callback 0x00000000 callarg 0x00000000
icount delta 0 ifree delta -1
blocks delta 0 res blocks delta 0
rt delta 0 res rt delta 0
ag freeblks delta 0 ag flist delta 0 ag btree delta 0
dblocks delta 0 agcount delta 0 imaxpct delta 0
rextsize delta 0 rbmblocks delta 0
rblocks delta 0 rextents delta 0 rextslog delta 0
dqinfo 0x00000000
log items:

chunk 0 index 0 item 0xe851b364 size 0
flags lic 0x1 <dirty > 
type buf mountp 0xeab14800 flags log 0x1 <in ail > 
ail forw 0xe851b2d0 ail back 0xe92bd678 lsn [10:3995] desc ef0d73c0 ops 0xc0440200
iodonefunc &0xc01cba90
[1]more> 
buf 0xeaf43e80 recur 0 refcount 1 flags:dirty logged 
size 2 blkno 0x2 len 0x1 map size 1 map 0xe851b3b4
blf flags: 

chunk 0 index 1 item 0xe851b2d0 size 0
flags lic 0x1 <dirty > 
type buf mountp 0xeab14800 flags log 0x1 <in ail > 
ail forw 0xe8b6cbf0 ail back 0xe851b364 lsn [10:3995] desc ef0d73c8 ops 0xc0440200
iodonefunc &0xc01cba90
buf 0xeaf5b680 recur 0 refcount 1 flags:dirty logged 
size 2 blkno 0x18 len 0x8 map size 2 map 0xe851b320
blf flags: 
log busy free 31, list:
Chunk 0 at 0xef0d744c next 0x00000000 free 0x7fffffff unused 0
  00: ag 0 idx 0
  01: ag 0 idx 0
  02: ag 0 idx 0
  03: ag 0 idx 0
  04: ag 0 idx 0
  05: ag 0 idx 0
  06: ag 0 idx 0
  07: ag 0 idx 0
  08: ag 0 idx 0
[1]more> 
  09: ag 0 idx 0
  10: ag 0 idx 0
  11: ag 0 idx 0
  12: ag 0 idx 0
  13: ag 0 idx 0
  14: ag 0 idx 0
  15: ag 0 idx 0
  16: ag 0 idx 0
  17: ag 0 idx 0
  18: ag 0 idx 0
  19: ag 0 idx 0
  20: ag 0 idx 0
  21: ag 0 idx 0
  22: ag 0 idx 0
  23: ag 0 idx 0
  24: ag 0 idx 0
  25: ag 0 idx 0
  26: ag 0 idx 0
  27: ag 0 idx 0
  28: ag 0 idx 0
  29: ag 0 idx 0
  30: ag 0 idx 0
Comment 9 Stephane Harnois 2004-09-10 15:19:07 CDT
reproduced the bug on the cvs tree from september 10 at 11:35.  Time to failure
running the ci_wrap over localhost nfs mount was 30 minutes using 4 mount moints.  
Comment 10 Russell Cattelan 2004-09-13 08:42:57 CDT

*** This bug has been marked as a duplicate of 309 ***
Comment 11 Greg Dickie 2004-09-13 14:31:54 CDT
Created attachment 139 [details]
Patch to cover up race in xfs_iget

This patch appears to address the original problem that we hit with EIO getting
returned from xfs_iget after is_bad_inode. We have a similar patch for retrying
to address the "ambiguous vns" but since Russell thinks its the same problem as
309 I won't post it here to cause confusion.