To: xfs-master@xxxxxxxxxxx
Subject: [Bug 359] New: apparent race condition with NFS causes xfs_forced_shutdown
From: bugzilla-daemon@xxxxxxxxxxx
Date: Thu, 2 Sep 2004 06:32:53 -0700
Sender: linux-xfs-bounce@xxxxxxxxxxx
http://oss.sgi.com/bugzilla/show_bug.cgi?id=359

           Summary: apparent race condition with NFS causes
                    xfs_forced_shutdown
           Product: Linux XFS
           Version: 1.2.x
          Platform: IA32
        OS/Version: Linux
            Status: NEW
          Severity: major
          Priority: High
         Component: XFS kernel code
        AssignedTo: xfs-master@xxxxxxxxxxx
        ReportedBy: greg@xxxxxxxxx


A customer site was experiencing periodic xfs_force_shutdown events
("Corruption of in-memory data detected"). After eliminating hardware as a
possibility, we started looking at software causes.

The configuration is as follows: an SMP Dell PE4600 server (2 x 2.4GHz CPUs)
connected to Adaptec Sanbloc RAIDs. The kernel is a patched 2.4.20 with
contemporary XFS (1.2-ish).

After some investigation, our customer found that two NFS clients moving the
same filename around cause the crash.
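
The report does not record the exact operations the two clients performed. A
minimal sketch of a workload in that spirit follows, assuming each client
repeatedly creates a file and renames it onto and off one contested name in
the same exported directory. Every name here (shuffle_name, run_local_demo,
"shared_name", the iteration counts) is hypothetical, and this script is not
known to reproduce the shutdown.

```python
import os
import threading


def shuffle_name(directory, worker_id, iterations):
    """Repeatedly move a file onto and off a name shared with other workers.

    Returns the number of rounds in which an operation failed, typically
    because another worker moved or replaced the contested name first.
    """
    shared = os.path.join(directory, "shared_name")  # hypothetical contested name
    private = os.path.join(directory, f"private_{worker_id}")
    lost_races = 0
    for _ in range(iterations):
        try:
            # Create a private file, move it onto the contested name,
            # then try to move the contested name back out of the way.
            with open(private, "w") as f:
                f.write(str(worker_id))
            os.rename(private, shared)
            os.rename(shared, private)
            os.unlink(private)
        except OSError:
            # Another worker got there first (or the server returned an
            # error); count it and carry on with the next round.
            lost_races += 1
    return lost_races


def run_local_demo(directory, workers=2, iterations=1000):
    """Run several workers as threads against one directory (local stand-in)."""
    results = [0] * workers

    def target(i):
        results[i] = shuffle_name(directory, i, iterations)

    threads = [threading.Thread(target=target, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

To approximate the reported setup, shuffle_name would be run on each of two
NFS client machines against the same mounted directory, with a different
worker_id on each. run_local_demo only exercises the logic on a single host.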

After sprinkling some printks in the kernel, it appears that the is_bad_inode
check in xfs_iget is failing and returning EIO.

An example of one of these backtraces:

Sep  2 08:13:23 sh15 kernel: xfs_force_shutdown(lvm(58,0),0x8) called from line 1051 of file xfs_trans.c.  Return address = 0xc01ff9d9
Sep  2 08:13:23 sh15 kernel: XFS: Transforming an alert into a BUG.
Sep  2 08:13:23 sh15 kernel: Filesystem "lvm(58,0)": Corruption of in-memory data detected.  Shutting down filesystem: lvm(58,0)
Sep  2 08:13:23 sh15 kernel: kernel BUG at debug.c:126!
Sep  2 08:13:23 sh15 kernel: invalid operand: 0000
Sep  2 08:13:23 sh15 kernel: dvsdriver esm e1000 tg3 e100 bonding usb-ohci usbcore lvm-mod mptscsih mptctl isense mptbase rtc
Sep  2 08:13:23 sh15 kernel: CPU:    0
Sep  2 08:13:23 sh15 kernel: EIP:    0010:[<c0215d15>]    Tainted: P
Sep  2 08:13:23 sh15 kernel: EFLAGS: 00010246
Sep  2 08:13:23 sh15 kernel: EIP is at icmn_err+0x85/0x95 [kernel]
Sep  2 08:13:23 sh15 kernel: eax: 00000067   ebx: 00000000   ecx: 00000001   edx: c0445414
Sep  2 08:13:23 sh15 kernel: esi: c037d161   edi: c03511b0   ebp: ed62bce4   esp: ed62bcd4
Sep  2 08:13:23 sh15 kernel: ds: 0018   es: 0018   ss: 0018
Sep  2 08:13:23 sh15 kernel: Process nfsd (pid: 1326, stackpage=ed62b000)
Sep  2 08:13:23 sh15 kernel: Stack: 00000293 ea423580 0000005e c035e320 ed62bd1c c01e6254 00000000 ea423580
Sep  2 08:13:23 sh15 kernel:        ed62bd58 ea423580 c035076e eee0be80 c035e320 0000005e 00000001 c035e320
Sep  2 08:13:23 sh15 kernel:        00000000 00000008 ed62bd3c c01e62e1 00000000 eee5b400 c035e320 ed62bd58
Sep  2 08:13:23 sh15 kernel: Call Trace:
Sep  2 08:13:23 sh15 kernel:  [<c01e6254>] xfs_fs_vcmn_err+0x54/0x70 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c01e62e1>] xfs_cmn_err+0x51/0x60 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c02099a0>] xfs_do_force_shutdown+0xc0/0xe0 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c01ff9d9>] xfs_trans_cancel+0x59/0xd0 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0205efb>] xfs_create+0x57b/0x620 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0211cff>] linvfs_mknod+0x12f/0x260 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0211e46>] linvfs_create+0x16/0x20 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c014936e>] vfs_create+0x11e/0x180 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0198932>] nfsd_create_v3+0x292/0x400 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c019d904>] nfsd3_proc_create+0x144/0x160 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0193f58>] nfsd_dispatch+0xb8/0x17c [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c032329b>] svc_process+0x2cb/0x560 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0193d39>] nfsd+0x239/0x3a0 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0193b00>] nfsd+0x0/0x3a0 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0107b96>] kernel_thread+0x26/0x40 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0193b00>] nfsd+0x0/0x3a0 [kernel]
Sep  2 08:13:23 sh15 kernel:
Sep  2 08:13:23 sh15 kernel: Code: 0f 0b 7e 00 b4 11 35 c0 8d 65 f4 5b 5e 5f 5d c3 80 3d 84 d4
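
Read bottom to top, the call trace shows the NFSv3 create path
(nfsd3_proc_create, nfsd_create_v3, vfs_create, linvfs_create, linvfs_mknod,
xfs_create) ending in xfs_trans_cancel, which triggers the shutdown. For
comparing several such backtraces, a small helper along these lines can pull
out the frames. It is only a sketch: the regular expression assumes the
2.4-style "[<address>] symbol+0xoffset/0xsize" layout seen in this log, and
parse_call_trace is a hypothetical name.

```python
import re

# Matches frames such as: [<c01ff9d9>] xfs_trans_cancel+0x59/0xd0
FRAME_RE = re.compile(
    r"\[<([0-9a-f]{8})>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_call_trace(log_text):
    """Return (symbol, address, offset, size) tuples in the order logged."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        addr, symbol, offset, size = match.groups()
        frames.append((symbol, int(addr, 16), int(offset, 16), int(size, 16)))
    return frames


sample = """\
Sep  2 08:13:23 sh15 kernel:  [<c01ff9d9>] xfs_trans_cancel+0x59/0xd0 [kernel]
Sep  2 08:13:23 sh15 kernel:  [<c0205efb>] xfs_create+0x57b/0x620 [kernel]
"""

for symbol, addr, offset, size in parse_call_trace(sample):
    print(f"{symbol} at {addr:#010x} (+{offset:#x} of {size:#x})")
```

Because the pattern requires "symbol+0x" after the bracketed address, the
"EIP: 0010:[<c0215d15>]    Tainted: P" line is not mistaken for a frame.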


