We've also been having problems with xfsdump on our NFS server (1 GHz Athlon,
software RAID, kernel 2.4.14 with XFS 1.0.2, Red Hat 7.1). We can successfully
dump smaller (< 1 GB) filesystems with xfsdump-1.1.7, but we always get a
kernel oops when dumping larger filesystems (~15 GB):
Feb 14 12:10:58 winnie kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Feb 14 12:10:58 winnie kernel: printing eip:
Feb 14 12:10:58 winnie kernel: 00000000
Feb 14 12:10:58 winnie kernel: *pde = 00000000
Feb 14 12:10:58 winnie kernel: Oops: 0000
Feb 14 12:10:58 winnie kernel: CPU: 0
Feb 14 12:10:58 winnie kernel: EIP: 0010:[agp_frontend_cleanup+0/96] Not tainted
Feb 14 12:10:58 winnie kernel: EIP: 0010:[<00000000>] Not tainted
Feb 14 12:10:58 winnie kernel: EFLAGS: 00010282
Feb 14 12:10:58 winnie kernel: eax: 00000000 ebx: c3bc7640 ecx: c3bc789c edx: c03b4960
Feb 14 12:10:58 winnie kernel: esi: c3bc7840 edi: c3bc7640 ebp: c3bc7640 esp: c7045ec4
Feb 14 12:10:58 winnie kernel: ds: 0018 es: 0018 ss: 0018
Feb 14 12:10:58 winnie kernel: Process nfsd (pid: 617, stackpage=c7045000)
Feb 14 12:10:58 winnie kernel: Stack: c016c714 c2194540 c3bc7840 c704d604 c7dea800 c016cb66 c3bc7640 c704d604
Feb 14 12:10:58 winnie kernel: 00000002 c70d9800 11270000 c036a270 c7dea9f0 00000002 c016ce89 c7dea800
Feb 14 12:10:58 winnie kernel: c704d614 00000002 00000001 00000001 c704d604 c704d490 c704d694 c704d600
Feb 14 12:10:58 winnie kernel: Call Trace: [nfsd_findparent+52/224] [find_fh_dentry+502/784] [fh_verify+521/1008] [nfsd3_proc_getattr+149/160] [nfsd_dispatch+211/416]
Feb 14 12:10:58 winnie kernel: Call Trace: [<c016c714>] [<c016cb66>] [<c016ce89>] [<c01734f5>] [<c016acf3>]
Feb 14 12:10:58 winnie kernel: [svc_process+664/1328] [nfsd+369/768] [kernel_thread+35/48]
Feb 14 12:10:58 winnie kernel: [<c02e6078>] [<c016aa91>] [<c0105523>]
Feb 14 12:10:58 winnie kernel:
Feb 14 12:10:58 winnie kernel: Code: Bad EIP value.
The oops appears at different points during the dump, so the exact point of
failure is not reproducible. The previous messages on this topic indicate that
it's a problem with nfsd and xfsdump 'stomping' on each other's resources, which
would suggest that the timing is random but that, for large drives, a failure
is probably inevitable.
We've had this same problem with all the versions of the kernel (2.4.2, 2.4.5,
2.4.14), xfs (1.0.0, 1.0.1, 1.0.2), and xfsdump that we've had installed.
Any suggestions or ideas?
Thanks,
Scott
--
Scott McMillan - smcmilla@xxxxxxxxxxxxxxxx
Institute for Environmental Catalysis
Department of Chemical Engineering, Northwestern University
http://broadbelt.chem-eng.northwestern.edu/~scott/