
RE: crash in linvfs_dentry_to_fh

To: "'linux-xfs@xxxxxxxxxxx'" <linux-xfs@xxxxxxxxxxx>
Subject: RE: crash in linvfs_dentry_to_fh
From: "HABBINGA,ERIK (HP-Loveland,ex1)" <erik.habbinga@xxxxxx>
Date: Mon, 31 Mar 2003 10:21:14 -0800
Sender: linux-xfs-bounce@xxxxxxxxxxx
Steve,

The kernel I'm running does have a patch to drop the BKL around calls to
xfs_permission, xfs_lookup, and xfs_readdir, based on the philosophy of a
post of yours from Feb 2002:

http://lists.insecure.org/linux-kernel/2002/Feb/4308.html

permission, lookup, and readdir were the biggest hitters of BKL usage when
profiling the SPEC SFS NFS test last year.  I don't know if not having the
BKL in the lookup code is the problem here, as we crash out of the lookup
code in fh_compose.  I don't drop the BKL before entering
linvfs_dentry_to_fh.  The stack trace from the oops isn't quite correct: the
call tree doesn't go through lookup_hash before getting to
linvfs_dentry_to_fh.  It should go:

nfsd_lookup -> fh_compose -> linvfs_dentry_to_fh

I'll instrument linvfs_dentry_to_fh and xfs_inactive as you suggested and
let you know the results.  Let me know if you think dropping the BKL around
permission, lookup, and readdir might be causing this problem.
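
For readers unfamiliar with the pattern, here is a minimal user-space sketch of what "dropping the BKL around a call" means: the caller arrives holding the lock, releases it for the duration of the call, and takes it again before returning. Every name below (mock_lock_kernel, mock_unlock_kernel, mock_xfs_lookup, mock_lookup_without_bkl) is a hypothetical stand-in, and the lock is a plain single-threaded flag rather than a real lock. This is not the actual patch, only an illustration of its shape.

```c
#include <stddef.h>

/* Hypothetical single-threaded stand-in for the big kernel lock. */
static int mock_bkl_held = 0;

static void mock_lock_kernel(void)   { mock_bkl_held = 1; }
static void mock_unlock_kernel(void) { mock_bkl_held = 0; }

/* Records whether the lock was held while the lookup body ran. */
static int mock_lookup_saw_bkl = -1;

/* Hypothetical lookup body: succeeds for any non-NULL name. */
static int mock_xfs_lookup(const char *name)
{
    mock_lookup_saw_bkl = mock_bkl_held;
    return name != NULL ? 0 : -1;
}

/*
 * Entered with the lock held. The lock is released across the lookup
 * and taken again before returning, so the caller sees no change in
 * lock state, but the lookup itself runs without the lock.
 */
static int mock_lookup_without_bkl(const char *name)
{
    int error;

    mock_unlock_kernel();
    error = mock_xfs_lookup(name);
    mock_lock_kernel();
    return error;
}
```

The point of the sketch is the window between the unlock and the relock: anything the lookup touches in that window is no longer serialized by the big lock and has to be protected some other way.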

Erik


> Well, looking at how NFS uses the call, and what we do inside it, I
> would say there is a chance the inode is being torn down on another 
> cpu at the same time. Except there is a reference on the dentry here
> and it does not call this function for a negative dentry. This
> does indeed look like a partially initialized or torn down
> inode. The xfs_inactive crash looks a little similar. If you can
> instrument up linvfs_dentry_to_fh and dump the vnode contents
> when this happens it might show us something. So adding an explicit
> check for the pointer being null would be the thing to do.
> Not sure I can suggest a similar thing to try in the inactive
> path.
> 
> Steve
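
The explicit NULL check and vnode dump suggested above might look roughly like the following user-space sketch. The structure layouts and all mock_* names are hypothetical stand-ins, not the real XFS definitions, and fprintf to stderr stands in for a kernel log call. The idea is only to show checking each pointer on the way to the ops table and dumping what is known about the vnode before bailing out.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the vnode and its behavior descriptor. */
struct mock_ops {
    int (*fid2)(void *fidp);
};

struct mock_bhv_desc {
    void *bd_pdata;
    void *bd_vobj;
    struct mock_ops *bd_ops;
};

struct mock_vnode {
    struct mock_bhv_desc *bh_first;  /* NULL in the crash being chased */
    unsigned long v_flag;
    int v_count;
};

/*
 * Returns 0 if every pointer needed to reach the ops table is set.
 * Otherwise dumps what is known about the vnode and returns -1, so
 * the caller can fail the request instead of dereferencing NULL.
 */
static int mock_check_vnode(const struct mock_vnode *vp, const char *caller)
{
    if (vp == NULL) {
        fprintf(stderr, "%s: NULL vnode\n", caller);
        return -1;
    }
    if (vp->bh_first == NULL) {
        fprintf(stderr, "%s: vnode %p has NULL behavior head, "
                "flag 0x%lx, count %d\n",
                caller, (const void *)vp, vp->v_flag, vp->v_count);
        return -1;
    }
    if (vp->bh_first->bd_ops == NULL) {
        fprintf(stderr, "%s: vnode %p has NULL ops table\n",
                caller, (const void *)vp);
        return -1;
    }
    return 0;
}
```

In the real function a check like this would sit just before the VOP_FID2 call, with the dump extended to whatever vnode fields are most useful for telling a half-initialized inode from a torn-down one.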


>  -----Original Message-----
> From:         HABBINGA,ERIK (HP-Loveland,ex1)  
> Sent: Friday, March 28, 2003 3:28 PM
> To:   'linux-xfs@xxxxxxxxxxx'
> Subject:      crash in linvfs_dentry_to_fh
> 
> I get the following crash in linvfs_dentry_to_fh after 
> pushing a server very hard with the SPEC SFS NFS test.  It's 
> a similar crash to the xfs_inactive crash I mentioned earlier 
> in the week.  Not always repeatable, but we've seen it a few times:
> 
> ksymoops 2.4.5 on i686 2.4.18-14.  Options used
>      -V (default)
>      -K (specified)
>      -L (specified)
>      -O (specified)
>      -m System.map (specified)
> 
> Unable to handle kernel NULL pointer dereference at virtual address 00000008
>  printing eip:
> 801d9578
> *pde = 72da5001
> Oops: 0000
> CPU:    0
> EIP:    0010:[<801d9578>]    Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010202
> eax: 00000000   ebx: 0000000d   ecx: f508f100   edx: 80336b20
> esi: f2121ec4   edi: f4cb64a4   ebp: f2121f04   esp: f2121eb4
> ds: 0018   es: 0018   ss: 0018
> Process nfsd (pid: 4074, stackpage=f2121000)
> Stack: f4cb6494 f2093000 f4cb64a4 94d29220 fffffff4 94d29220 f508f0e0 8014236d
>        8016b515 94d29220 f4cb64a4 f2121f04 00000001 94d295a0 94d295a0 0000000d
>        f4cb6404 80336b20 f2121f04 f508f100 0000000d 8016bf65 f4cb6494 f2093000
> Call Trace: [<8014236d>]  [<8016b515>]  [<8016bf65>]  [<802c2a24>]  [<80171e58>]
>   [<80168eb3>]  [<802c2635>]  [<80168c67>]  [<80105694>]
> 
> Code: 8b 50 08 56 8b 41 f4 50 8b 42 50 ff d0 8b 44 24 20 89 07 8b
> 
> 
> >>EIP; 801d9578 <linvfs_dentry_to_fh+2c/ac>   <=====
> 
> >>ecx; f508f100 <END_OF_CODE+74c8d37c/????>
> >>edx; 80336b20 <linvfs_sops+0/60>
> >>esi; f2121ec4 <END_OF_CODE+71d20140/????>
> >>edi; f4cb64a4 <END_OF_CODE+748b4720/????>
> >>ebp; f2121f04 <END_OF_CODE+71d20180/????>
> >>esp; f2121eb4 <END_OF_CODE+71d20130/????>
> 
> Trace; 8014236d <lookup_hash+ad/100>
> Trace; 8016b515 <fh_compose+265/310>
> Trace; 8016bf65 <nfsd_lookup+439/46c>
> Trace; 802c2a24 <svc_sock_enqueue+184/1f8>
> Trace; 80171e58 <nfsd3_proc_lookup+d4/e0>
> Trace; 80168eb3 <nfsd_dispatch+cf/196>
> Trace; 802c2635 <svc_process+29d/4f4>
> Trace; 80168c67 <nfsd+227/3a4>
> Trace; 80105694 <kernel_thread+28/38>
> Code;  801d9578 <linvfs_dentry_to_fh+2c/ac>
> 00000000 <_EIP>:
> Code;  801d9578 <linvfs_dentry_to_fh+2c/ac>   <=====
>    0:   8b 50 08                  mov    0x8(%eax),%edx   <=====
> Code;  801d957b <linvfs_dentry_to_fh+2f/ac>
>    3:   56                        push   %esi
> Code;  801d957c <linvfs_dentry_to_fh+30/ac>
>    4:   8b 41 f4                  mov    0xfffffff4(%ecx),%eax
> Code;  801d957f <linvfs_dentry_to_fh+33/ac>
>    7:   50                        push   %eax
> Code;  801d9580 <linvfs_dentry_to_fh+34/ac>
>    8:   8b 42 50                  mov    0x50(%edx),%eax
> Code;  801d9583 <linvfs_dentry_to_fh+37/ac>
>    b:   ff d0                     call   *%eax
> Code;  801d9585 <linvfs_dentry_to_fh+39/ac>
>    d:   8b 44 24 20               mov    0x20(%esp,1),%eax
> Code;  801d9589 <linvfs_dentry_to_fh+3d/ac>
>   11:   89 07                     mov    %eax,(%edi)
> Code;  801d958b <linvfs_dentry_to_fh+3f/ac>
>   13:   8b 00                     mov    (%eax),%eax
> 
> The code in question is dereferencing the vp->v_bh pointer to 
> get the vp->v_bh.bh_first->bd_ops (also known as vp->v_fops) 
> pointer in preparation for getting the vop_fid2 function 
> pointer.  vp->v_bh is NULL, which causes the crash.
> 
> Disassembly of linvfs_dentry_to_fh:
> /src/kernel/linux/fs/xfs/linux/xfs_super.c:724
> 
>         VOP_FID2(vp, (struct fid *)&fid, error);
> 801d9571:       8b 41 f4                mov    0xfffffff4(%ecx),%eax
> 801d9574:       8d 74 24 10             lea    0x10(%esp,1),%esi
> 801d9578:       8b 50 08                mov    0x8(%eax),%edx
> 
> We're running 2.4.20 with XFS CVS from March 17th.  
> 
> Erik Habbinga
> Hewlett Packard
> 
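
The fault address in the oops follows directly from the faulting instruction: eax is 00000000 in the register dump, and `mov 0x8(%eax),%edx` reads from eax + 8, which is 00000008. The short sketch below reproduces that arithmetic. The struct is a hypothetical 32-bit layout with the ops pointer in the third 4-byte slot, chosen only so that its offset matches the 0x8 displacement in the disassembly. It is not copied from the XFS headers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical i386 layout: three 4-byte pointer slots, with the ops
 * pointer third. uint32_t is used in place of real pointers so the
 * offsets are the same on a 64-bit build host.
 */
struct mock_bhv_desc32 {
    uint32_t bd_pdata;
    uint32_t bd_vobj;
    uint32_t bd_ops;
};

/* Address touched when reading a field at field_offset from base. */
static uint32_t fault_address(uint32_t base, size_t field_offset)
{
    return base + (uint32_t)field_offset;
}
```

With a base of 0 and offsetof(struct mock_bhv_desc32, bd_ops) as the offset, fault_address gives 0x00000008, the same value the kernel reported. A small nonzero fault address like this is the usual signature of a field read through a NULL structure pointer.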

