
Re: nfsd & xfsdump strike again




We've also been having problems with xfsdump on our NFS server (1 GHz Athlon, 
software RAID, 2.4.14 w/ XFS 1.0.2, RedHat 7.1).  We can successfully dump 
smaller (< 1 GB) filesystems with xfsdump-1.1.7, but we always get a 
kernel oops when dumping larger filesystems (~15 GB):

Feb 14 12:10:58 winnie kernel: Unable to handle kernel NULL pointer 
dereference at virtual address 00000000
Feb 14 12:10:58 winnie kernel:  printing eip:
Feb 14 12:10:58 winnie kernel: 00000000
Feb 14 12:10:58 winnie kernel: *pde = 00000000
Feb 14 12:10:58 winnie kernel: Oops: 0000
Feb 14 12:10:58 winnie kernel: CPU:    0
Feb 14 12:10:58 winnie kernel: EIP:    0010:[agp_frontend_cleanup+0/96]    Not 
tainted
Feb 14 12:10:58 winnie kernel: EIP:    0010:[<00000000>]    Not tainted
Feb 14 12:10:58 winnie kernel: EFLAGS: 00010282
Feb 14 12:10:58 winnie kernel: eax: 00000000   ebx: c3bc7640   ecx: c3bc789c   
edx: c03b4960
Feb 14 12:10:58 winnie kernel: esi: c3bc7840   edi: c3bc7640   ebp: c3bc7640   
esp: c7045ec4
Feb 14 12:10:58 winnie kernel: ds: 0018   es: 0018   ss: 0018
Feb 14 12:10:58 winnie kernel: Process nfsd (pid: 617, stackpage=c7045000)
Feb 14 12:10:58 winnie kernel: Stack: c016c714 c2194540 c3bc7840 c704d604 
c7dea800 c016cb66 c3bc7640 c704d604
Feb 14 12:10:58 winnie kernel:        00000002 c70d9800 11270000 c036a270 
c7dea9f0 00000002 c016ce89 c7dea800
Feb 14 12:10:58 winnie kernel:        c704d614 00000002 00000001 00000001 
c704d604 c704d490 c704d694 c704d600
Feb 14 12:10:58 winnie kernel: Call Trace: [nfsd_findparent+52/224] 
[find_fh_dentry+502/784] [fh_verify+521/1008] [nfsd3_proc_getattr+149/160] 
[nfsd_dispatch+211/416]
Feb 14 12:10:58 winnie kernel: Call Trace: [<c016c714>] [<c016cb66>] 
[<c016ce89>] [<c01734f5>] [<c016acf3>]
Feb 14 12:10:58 winnie kernel:    [svc_process+664/1328] [nfsd+369/768] 
[kernel_thread+35/48]
Feb 14 12:10:58 winnie kernel:    [<c02e6078>] [<c016aa91>] [<c0105523>] 
Feb 14 12:10:58 winnie kernel: 
Feb 14 12:10:58 winnie kernel: Code:  Bad EIP value.

The oops appears at a different point in each dump, so the exact timing is not 
reproducible.  The previous messages on this topic indicate that the problem is 
nfsd and xfsdump 'stomping' on each other's resources, which would suggest that 
the timing is random but that, for large drives, an oops is probably inevitable.
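
To illustrate why a random collision could still look inevitable on a large 
dump, here is a rough sketch.  It assumes (purely for illustration, nothing here 
is measured) that each gigabyte dumped carries a small, independent chance of 
hitting the collision; the function name and the per-gigabyte probability are 
hypothetical.

```python
def prob_at_least_one_oops(p_per_gb: float, size_gb: int) -> float:
    """Chance of at least one collision while dumping size_gb gigabytes.

    Assumes each gigabyte is an independent trial with collision
    probability p_per_gb (a hypothetical figure, not a measurement).
    """
    if not 0.0 <= p_per_gb <= 1.0:
        raise ValueError("p_per_gb must be between 0 and 1")
    if size_gb < 0:
        raise ValueError("size_gb must be non-negative")
    # P(no collision in any gigabyte) = (1 - p)^n, so take the complement.
    return 1.0 - (1.0 - p_per_gb) ** size_gb


if __name__ == "__main__":
    p = 0.1  # hypothetical per-gigabyte collision probability
    for size in (1, 15):
        print(f"{size:>3} GB: {prob_at_least_one_oops(p, size):.3f}")
```

Under that assumption the chance of at least one oops grows quickly with 
filesystem size, which would match small dumps usually succeeding and large 
ones failing at unpredictable points.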

We've had this same problem with all the versions of the kernel (2.4.2, 2.4.5, 
2.4.14), xfs (1.0.0, 1.0.1, 1.0.2), and xfsdump that we've had installed.

Any suggestions or ideas?

Thanks,
Scott
--
Scott McMillan - smcmilla@northwestern.edu 
Institute for Environmental Catalysis
Department of Chemical Engineering, Northwestern University
http://broadbelt.chem-eng.northwestern.edu/~scott/