All,
Thanks to everyone who commented on the status of my
subscription; I'd forgotten about the bounce-removal option. Anyway,
I've had some problems with XFS and NFS on a few of our systems here
(running the 2.4.9 RH + XFS kernel from release 1.0.1 of XFS). Here's
the dmesg output from today's crash:
<snip, snip>
Unable to handle kernel paging request at virtual address 00003000
printing eip:
c01e196a
*pde = 36330067
*pte = 00000000
Oops: 0002
CPU: 1
EIP: 0010:[<c01e196a>] Not tainted
EFLAGS: 00013206
eax: 00800000 ebx: 04000000 ecx: 007ff400 edx: 02000000
esi: fae06000 edi: 00003000 ebp: 02000000 esp: f65ff7b8
ds: 0018 es: 0018 ss: 0018
Process nfsd (pid: 1111, stackpage=f65ff000)
Stack: d052ee38 fae03000 02000010 04000000 00000000 c01b9111 fae03000 04000000
02000000 00000001 00000001 d052ee38 00000030 f65ffb74 02000010 c018fcc3
d052edec 00000001 00000000 d052edec f65ffa14 00000030 f65ffb74 00000000
Call Trace: [<c01b9111>] xfs_iext_realloc [kernel] 0xf1
[<c018fcc3>] xfs_bmap_insert_exlist [kernel] 0x27
[<c018b931>] xfs_bmap_add_extent_hole_delay [kernel] 0x491
[<c01884c3>] xfs_bmap_add_extent [kernel] 0x157
[<c01c601f>] xfs_mod_incore_sb [kernel] 0x2b
[<c0192300>] xfs_bmapi [kernel] 0xac8
[<c0191838>] xfs_bmapi [kernel] 0x0
[<c0190499>] xfs_bmap_search_extents [kernel] 0x4d
[<c01deae3>] xfs_iomap_write_delay [kernel] 0x647
[<c0191838>] xfs_bmapi [kernel] 0x0
[<c025a2e6>] ip_rcv [kernel] 0x3e6
[<c01bafa0>] xfs_size_fn [kernel] 0x0
[<c01bafa0>] xfs_size_fn [kernel] 0x0
[<c01bafa0>] xfs_size_fn [kernel] 0x0
[<c01de150>] xfs_iomap_read [kernel] 0x150
[<c025a32c>] ip_local_deliver_finish [kernel] 0x0
[<c024f98f>] nf_hook_slow [kernel] 0x11f
[<c0199f23>] xfs_bmbt_get_state [kernel] 0x33
[<c019036c>] xfs_bmap_do_search_extents [kernel] 0x2e4
[<c01de22d>] xfs_iomap_write [kernel] 0xb1
[<c01dd2d6>] xfs_bmap [kernel] 0x136
[<c01dbb53>] linvfs_pb_bmap [kernel] 0x7b
[<c0172811>] _pagebuf_file_write [kernel] 0xf5
[<c0172a13>] pagebuf_generic_file_write [kernel] 0xbf
[<c01dbad8>] linvfs_pb_bmap [kernel] 0x0
[<c01dcf3f>] xfs_write [kernel] 0x2df
[<c01dbad8>] linvfs_pb_bmap [kernel] 0x0
[<c01d8398>] linvfs_write [kernel] 0x2f4
[<f8b8e2e8>] nfsd_write [nfsd] 0x140
[<c027d446>] inet_sendmsg [kernel] 0x3a
[<c0117b17>] reschedule_idle [kernel] 0x23f
[<f8b93533>] nfsd3_proc_write [nfsd] 0x12b
[<f8b9b360>] nfsd_procedures3 [nfsd] 0xe0
[<f8b8a5e3>] nfsd_dispatch [nfsd] 0xd3
[<f8b9b360>] nfsd_procedures3 [nfsd] 0xe0
[<f8b61c38>] svc_process_Rsmp_b86ccc64 [sunrpc] 0x2ac
[<f8b9b220>] nfsd_svcstats [nfsd] 0x0
[<f8b9acf8>] nfsd_version3 [nfsd] 0x0
[<f8b8a349>] nfsd [nfsd] 0x1b9
[<c0105813>] kernel_thread [kernel] 0x23
Code: f3 a5 f6 c2 02 74 02 66 a5 f6 c2 01 74 01 a4 55 8b 4c 24 1c
</end>
And here's the output from yesterday's crash:
<snip,snip>
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c01e196a
*pde = 00000000
Oops: 0002
CPU: 1
EIP: 0010:[<c01e196a>] Not tainted
EFLAGS: 00010206
eax: 00800000 ebx: 04000000 ecx: 00800000 edx: 02000000
esi: faf05000 edi: 00000000 ebp: 02000000 esp: f666b7b8
ds: 0018 es: 0018 ss: 0018
Process nfsd (pid: 877, stackpage=f666b000)
Stack: e912c300 faf05000 02000010 04000000 00000000 c01b9111 faf05000 04000000
02000000 00000001 00000001 e912c300 00000030 f666bb74 02000010 c018fcc3
e912c2b4 00000001 00000000 e912c2b4 f666ba14 00000030 f666bb74 00000000
Call Trace: [<c01b9111>] xfs_iext_realloc [kernel] 0xf1
[<c018fcc3>] xfs_bmap_insert_exlist [kernel] 0x27
[<c018b931>] xfs_bmap_add_extent_hole_delay [kernel] 0x491
[<c01884c3>] xfs_bmap_add_extent [kernel] 0x157
[<c01c601f>] xfs_mod_incore_sb [kernel] 0x2b
[<c0192300>] xfs_bmapi [kernel] 0xac8
[<c0191838>] xfs_bmapi [kernel] 0x0
[<c0190499>] xfs_bmap_search_extents [kernel] 0x4d
[<c01deae3>] xfs_iomap_write_delay [kernel] 0x647
[<c0191838>] xfs_bmapi [kernel] 0x0
[<c01bafa0>] xfs_size_fn [kernel] 0x0
[<c01bafa0>] xfs_size_fn [kernel] 0x0
[<c01bafa0>] xfs_size_fn [kernel] 0x0
[<c01de150>] xfs_iomap_read [kernel] 0x150
[<f8b402ca>] svc_udp_data_ready [sunrpc] 0x5e
[<c0134690>] reclaim_page [kernel] 0x364
[<c01de22d>] xfs_iomap_write [kernel] 0xb1
[<c01dd2d6>] xfs_bmap [kernel] 0x136
[<c01dbb53>] linvfs_pb_bmap [kernel] 0x7b
[<c0172811>] _pagebuf_file_write [kernel] 0xf5
[<c0172a13>] pagebuf_generic_file_write [kernel] 0xbf
[<c01dbad8>] linvfs_pb_bmap [kernel] 0x0
[<c01dcf3f>] xfs_write [kernel] 0x2df
[<c01dbad8>] linvfs_pb_bmap [kernel] 0x0
[<c01d8398>] linvfs_write [kernel] 0x2f4
[<f8b7c2e8>] nfsd_write [nfsd] 0x140
[<c027d446>] inet_sendmsg [kernel] 0x3a
[<c0117b17>] reschedule_idle [kernel] 0x23f
[<f8b81533>] nfsd3_proc_write [nfsd] 0x12b
[<f8b89360>] nfsd_procedures3 [nfsd] 0xe0
[<f8b785e3>] nfsd_dispatch [nfsd] 0xd3
[<f8b89360>] nfsd_procedures3 [nfsd] 0xe0
[<f8b3fc38>] svc_process_Rsmp_b86ccc64 [sunrpc] 0x2ac
[<f8b89220>] nfsd_svcstats [nfsd] 0x0
[<f8b88cf8>] nfsd_version3 [nfsd] 0x0
[<f8b78349>] nfsd [nfsd] 0x1b9
[<c0105813>] kernel_thread [kernel] 0x23
Code: f3 a5 f6 c2 02 74 02 66 a5 f6 c2 01 74 01 a4 55 8b 4c 24 1c
</end>
The NULL pointer at the top might be from our tape drive, but I'm not
sure about that. We've also had problems in the past with the 2.4.14
XFS kernel (from the XFS 1.0.2 release, but used with the 1.0.1 release
RPMs, which might be a problem in itself); unfortunately, we don't have
any output from that crash.
The first crash is from a machine using disk local to the server (i.e.
it sits on a ServeRAID controller within the box), with approximately 16
clients accessing the NFS partitions. The second crash is from a machine
using SAN disk, NFS mounted out to 32 clients. Any thoughts, comments, or
suggestions? Thanks in advance!
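
Both oopses report the same EIP (c01e196a) and start with the same frames (xfs_iext_realloc, xfs_bmap_insert_exlist, ...), so comparing the two call traces frame by frame may help narrow things down. Below is a rough Python sketch of one way to do that. It is an illustration only: the helper names parse_frames and common_prefix are made up for this example, and the regular expression assumes the "[<address>] symbol [module] offset" layout seen in the dumps above. The sample input is a three-line excerpt from the middle of each trace.

```python
import re

# Matches one call-trace entry in the layout seen in the dumps above, e.g.
#   [<c01b9111>] xfs_iext_realloc [kernel] 0xf1
# This layout is an assumption taken from this report, not a general oops parser.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]{8})>\]\s+(\S+)\s+\[(\w+)\]\s+(0x[0-9a-f]+)"
)


def parse_frames(oops_text):
    """Return (symbol, module, offset) tuples in the order they appear."""
    return [
        (symbol, module, offset)
        for _address, symbol, module, offset in FRAME_RE.findall(oops_text)
    ]


def common_prefix(frames_a, frames_b):
    """Return the leading frames shared by both traces."""
    shared = []
    for frame_a, frame_b in zip(frames_a, frames_b):
        if frame_a != frame_b:
            break
        shared.append(frame_a)
    return shared


if __name__ == "__main__":
    # Three consecutive entries copied from each trace above.
    today = """
    [<c01deae3>] xfs_iomap_write_delay [kernel] 0x647
    [<c0191838>] xfs_bmapi [kernel] 0x0
    [<c025a2e6>] ip_rcv [kernel] 0x3e6
    """
    yesterday = """
    [<c01deae3>] xfs_iomap_write_delay [kernel] 0x647
    [<c0191838>] xfs_bmapi [kernel] 0x0
    [<c01bafa0>] xfs_size_fn [kernel] 0x0
    """
    for symbol, module, offset in common_prefix(
        parse_frames(today), parse_frames(yesterday)
    ):
        print(symbol, module, offset)
```

Addresses are deliberately left out of the comparison, since the nfsd and sunrpc modules are loaded at different addresses on the two machines while the symbols and offsets match.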
Regards,
Derek R.
--
Linux Technician
713-817-1197 (cell)
713-781-4000 x2267 (office)
"Feedback appreciated,
blame shared equally."