> Hi Steve,
>
> Looking over LK, I've found an interesting thread I'm concerned
> with (NFS related Oops in 2.4.[39]-xfs).
>
> Last weekend our main file server oopsed during the weekly run of
> amanda (doing xfsdump):
>
> embeh [09:39am] (0.12) ~> uname -a
> Linux embeh.sif.it 2.4.9-xfs-1 #1 Tue Aug 28 11:18:44 CEST 2001 i686
>
> compiled with kgcc.
>
> That's the oops:
>
> Oct 13 00:50:38 embeh kernel: Unable to handle kernel NULL pointer dereference at virtual add
> Oct 13 00:50:38 embeh kernel: 00000000
> Oct 13 00:50:38 embeh kernel: *pde = 00000000
> Oct 13 00:50:38 embeh kernel: Oops: 0000
> Oct 13 00:50:38 embeh kernel: CPU: 0
> Oct 13 00:50:38 embeh kernel: EIP: 0010:[agp_frontend_cleanup+0/96]
> Oct 13 00:50:38 embeh kernel: EIP: 0010:[<00000000>]
> Using defaults from ksymoops -t elf32-i386 -a i386
> Oct 13 00:50:38 embeh kernel: EFLAGS: 00010286
> Oct 13 00:50:38 embeh kernel: eax: 00000000 ebx: de666540 ecx: de666f1c edx: c04c6700
> Oct 13 00:50:38 embeh kernel: esi: de666ec0 edi: de666540 ebp: de666540 esp: ddcfbe58
> Oct 13 00:50:38 embeh kernel: ds: 0018 es: 0018 ss: 0018
> Oct 13 00:50:38 embeh kernel: Process nfsd (pid: 734, stackpage=ddcfb000)
> Oct 13 00:50:38 embeh kernel: Stack: c0187284 d42b8cc0 de666ec0 00000002 decf3c00 c0187696 de666540 00000002
> Oct 13 00:50:38 embeh kernel: ddd06400 ddd06804 ddd49800 00000000 decf3de8 ddd06400 c01879d2 decf3c00
> Oct 13 00:50:38 embeh kernel: ddd06814 00000002 00000001 00000001 00000007 00000007 ccffad28 ddd06400
> Oct 13 00:50:38 embeh kernel: Call Trace: [nfsd_findparent+52/224] [find_fh_dentry+534/816] [
> Oct 13 00:50:38 embeh kernel: Call Trace: [<c0187284>] [<c0187696>] [<c01879d2>] [<c0188332>] [<c010ec49>]
> Oct 13 00:50:38 embeh kernel: [<c018e34e>] [<c0190130>] [<c0185863>] [<c0329328>] [<c01856
> Oct 13 00:50:38 embeh kernel: Code: Bad EIP value.
>
> >>EIP; 00000000 Before first symbol
> Trace; c0187284 <nfsd_findparent+34/e0>
> Trace; c0187696 <find_fh_dentry+216/330>
> Trace; c01879d2 <fh_verify+222/450>
> Trace; c0188332 <nfsd_lookup+92/4c0>
> Trace; c010ec49 <schedule+2d9/420>
> Trace; c018e34e <nfsd3_proc_lookup+13e/150>
> Trace; c0190130 <nfs3svc_decode_diropargs+a0/110>
> Trace; c0185863 <nfsd_dispatch+d3/170>
> Trace; c0329328 <svc_process+2a8/540>
>
>
> Athlon 1.33 Ghz, LVM over RAID5, userland and kernel from CVS as of
> 2001-08-28, xfsdump-1.1.3-0
>
> hope this helps,

Well, it lands the ball pretty much in our court I guess, so yes, it
helps. I am starting to form a theory about what is going on here, but
there is certainly a problem that lets xfsdump and an NFS server
stamp on each other if NFS happens to go for an inode which
xfsdump recently accessed.

Steve
>
> -m
>