On Tue, 2002-03-19 at 15:14, Ian D. Hardy wrote:
>
> Bad news, I'm afraid: I've just had what looks like another similar crash.
> XFS 2.4.18 CVS tree as of 13th March + Steve Lord's vnode.patch - server had
> been up ~6 days 4 hours (which is slightly longer than average, but we have
> had ~14 days before).
>
> The ksymoops output is as follows:
> invalid operand: 0000
> CPU: 1
> EIP: 0010:[<c0131d00>] Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010202
> eax: 00000001 ebx: c16b3d80 ecx: c16b3d80 edx: 00000000
> esi: 00000000 edi: 00000000 ebp: 00000000 esp: f6d61cb4
> ds: 0018 es: 0018 ss: 0018
> Process nfsd (pid: 607, stackpage=f6d61000)
> Stack: c16b3d80 00000000 00000000 e1e618e8 c01286a3 c16b3d80 c0128834 c16b3d80
> c16b3d80 c01323e8 c0128a1c 00000000 f6d61d2c 00000000 e1e618e8 c16b3d80
> efb57648 f6d60000 00000000 00000001 f6d61d2c 00000000 c0128acd 00000000
> Call Trace: [<c01286a3>] [<c0128834>] [<c01323e8>] [<c0128a1c>] [<c0128acd>]
> [<c01f1372>] [<c01f3fed>] [<c01cf79b>] [<c01e6ced>] [<c01e6410>] [<c026d014>]
> [<c0265b06>] [<c01f6a6f>] [<c01e6410>] [<c014f4dc>] [<f8d2e973>] [<f8d33f7b>]
> [<f8d3b4a0>] [<f8d2b5d3>] [<f8d3b4a0>] [<f8cf6f89>] [<f8d3b400>] [<f8d3aed8>]
> [<f8d2b349>] [<c01057eb>]
> Code: 0f 0b 8b 43 18 a8 40 74 02 0f 0b 8b 43 18 a8 80 74 02 0f 0b
>
> >>EIP; c0131d00 <__free_pages_ok+50/20c> <=====
> Trace; c01286a2 <remove_inode_page+22/30>
> Trace; c0128834 <truncate_complete_page+44/4c>
> Trace; c01323e8 <__free_pages+1c/20>
> Trace; c0128a1c <truncate_list_pages+1e0/22c>
> Trace; c0128acc <truncate_inode_pages+64/9c>
> Trace; c01f1372 <pagebuf_inval+1a/20>
> Trace; c01f3fec <fs_tosspages+28/30>
> Trace; c01cf79a <xfs_itruncate_start+8e/98>
> Trace; c01e6cec <xfs_setattr+8dc/f7c>
> Trace; c01e6410 <xfs_setattr+0/f7c>
> Trace; c026d014 <qdisc_restart+14/178>
> Trace; c0265b06 <dev_queue_xmit+136/308>
> Trace; c01f6a6e <linvfs_setattr+142/168>
> Trace; c01e6410 <xfs_setattr+0/f7c>
> Trace; c014f4dc <notify_change+7c/2a4>
> Trace; f8d2e972 <[nfsd]nfsd_setattr+3ea/524>
> Trace; f8d33f7a <[nfsd]nfsd3_proc_setattr+b6/c4>
> Trace; f8d3b4a0 <[nfsd]nfsd_procedures3+40/2c0>
> Trace; f8d2b5d2 <[nfsd]nfsd_dispatch+d2/19a>
> Trace; f8d3b4a0 <[nfsd]nfsd_procedures3+40/2c0>
> Trace; f8cf6f88 <[sunrpc]svc_process+28c/51c>
> Trace; f8d3b400 <[nfsd]nfsd_svcstats+0/40>
> Trace; f8d3aed8 <[nfsd]nfsd_version3+0/10>
> Trace; f8d2b348 <[nfsd]nfsd+1b8/370>
> Trace; c01057ea <kernel_thread+22/30>
> Code; c0131d00 <__free_pages_ok+50/20c>
> 00000000 <_EIP>:
> Code; c0131d00 <__free_pages_ok+50/20c> <=====
> 0: 0f 0b ud2a <=====
> Code; c0131d02 <__free_pages_ok+52/20c>
> 2: 8b 43 18 mov 0x18(%ebx),%eax
> Code; c0131d04 <__free_pages_ok+54/20c>
> 5: a8 40 test $0x40,%al
> Code; c0131d06 <__free_pages_ok+56/20c>
> 7: 74 02 je b <_EIP+0xb> c0131d0a <__free_pages_ok+5a/20c>
> Code; c0131d08 <__free_pages_ok+58/20c>
> 9: 0f 0b ud2a
> Code; c0131d0a <__free_pages_ok+5a/20c>
> b: 8b 43 18 mov 0x18(%ebx),%eax
> Code; c0131d0e <__free_pages_ok+5e/20c>
> e: a8 80 test $0x80,%al
> Code; c0131d10 <__free_pages_ok+60/20c>
> 10: 74 02 je 14 <_EIP+0x14> c0131d14 <__free_pages_ok+64/20c>
> Code; c0131d12 <__free_pages_ok+62/20c>
> 12: 0f 0b ud2a
OK, this is something else. I just had a brainwave about your old oops,
though: I think you have a really fragmented file out there. Can you run
this on your filesystem:
xfs_db -r /dev/xxx (the filesystem can stay mounted)
xfs_db: frag -f
and send me the output?
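
For anyone reading the resulting numbers, here is a minimal sketch of how the
fragmentation figure can be interpreted. It assumes frag reports an "actual"
and an "ideal" extent count and that the factor is the percentage of extents
in excess of the ideal; the output shape matched by the regex, the sample
line, and the function names are assumptions for illustration, not taken
from this thread.

```python
import re

# Assumed shape of the summary line printed by frag; adjust the pattern
# if your xfs_db prints something different.
FRAG_LINE = re.compile(r"actual\s+(\d+),\s*ideal\s+(\d+)")


def fragmentation_factor(actual: int, ideal: int) -> float:
    """Return the percentage of extents beyond the ideal count."""
    if actual <= 0:
        # No extents counted, so there is nothing to call fragmented.
        return 0.0
    return (actual - ideal) * 100.0 / actual


def parse_frag_line(line: str):
    """Extract (actual, ideal, factor) from one line of frag output."""
    match = FRAG_LINE.search(line)
    if match is None:
        raise ValueError("no actual/ideal extent counts found in: %r" % line)
    actual, ideal = int(match.group(1)), int(match.group(2))
    return actual, ideal, fragmentation_factor(actual, ideal)


if __name__ == "__main__":
    # Hypothetical sample line, not output from the filesystem in this thread.
    print(parse_frag_line("actual 200, ideal 50, fragmentation factor 75.00%"))
```

A factor near zero means files are stored in close to the minimum number of
extents; a high value on the file data alone (the -f option) would fit the
"really fragmented file" theory.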
Someone reported a problem in the memory allocation code the other day,
and maybe, just maybe, you are hitting it.
Steve
>
> ---
>
> Hope the above provides some useful hints in debugging this.
>
> Regards and many thanks for looking at this.
>
> Ian Hardy
>
--
Steve Lord voice: +1-651-683-3511
Principal Engineer, Filesystem Software email: lord@xxxxxxx