Steve,
>
> On Thu, 2002-02-14 at 04:40, Ian D. Hardy wrote:
> > Steve+
> >
> > I enabled the 'CONFIG_DEBUG_SLAB' option in the kernel (taking a recent
> > CVS of the 2.4.17 XFS from 12th Feb) and have had the following Oops (which I
> > hope means more to you than to me!).
> >
> >
> > Feb 14 00:41:31 blue00 kernel: kfree: bad ptr f8f3d000h.
> > Feb 14 00:41:31 blue00 kernel: invalid operand: 0000
> > Feb 14 00:41:32 blue00 kernel: CPU: 1
> > Feb 14 00:41:32 blue00 kernel: EIP: 0010:[kmem_cache_free+54/128] Not tainted
> > Feb 14 00:41:32 blue00 kernel: EFLAGS: 00010086
> > Feb 14 00:41:32 blue00 kernel: eax: 0000001d ebx: 00e3cf40 ecx: 0000002e edx: 00000000
> > Feb 14 00:41:32 blue00 kernel: esi: d42520e4 edi: f8f3d000 ebp: 00000000 esp: f7ee1e30
> > Feb 14 00:41:32 blue00 kernel: ds: 0018 es: 0018 ss: 0018
> > Feb 14 00:41:32 blue00 kernel: Process kswapd (pid: 5, stackpage=f7ee1000)
> > Feb 14 00:41:32 blue00 kernel: Stack: c02b3322 f8f3d000 d4252130 d42520e4 00000000 00000000 00000286 c74bfecc
> > Feb 14 00:41:32 blue00 kernel: c01f6f86 f8f3d000 c01cabfe f8f3d000 00014460 d42520e4 d42520e4 00000000
> > Feb 14 00:41:32 blue00 kernel: c01cac6f d42520e4 00000000 d42520e4 c01c78ae d42520e4 00000001 c01e649a
> > Feb 14 00:41:32 blue00 kernel: Call Trace: [change_termios+118/400] [xlog_recover_do_efi_trans+158/192] [xlog_recover_do_efd_trans+79/256] [xlog_regrant_write_log_space+94/784] [linvfs_follow_link+10/240]
> > Feb 14 00:41:32 blue00 kernel: Code: 0f 0b 83 c4 08 8b 15 8c 85 3f c0 8b 2c 1a 89 7c 24 14 b8 00
> > Using defaults from ksymoops -t elf32-i386 -a i386
> >
> > Code; 00000000 Before first symbol
> > 00000000 <_EIP>:
> > Code; 00000000 Before first symbol
> > 0: 0f 0b ud2a
> > Code; 00000002 Before first symbol
> > 2: 83 c4 08 add $0x8,%esp
> > Code; 00000004 Before first symbol
> > 5: 8b 15 8c 85 3f c0 mov 0xc03f858c,%edx
> > Code; 0000000a Before first symbol
> > b: 8b 2c 1a mov (%edx,%ebx,1),%ebp
> > Code; 0000000e Before first symbol
> > e: 89 7c 24 14 mov %edi,0x14(%esp,1)
> > Code; 00000012 Before first symbol
> > 12: b8 00 00 00 00 mov $0x0,%eax
> >
> >
> > Ian
> >
>
> Well, it has bits of interesting stack in there. Were you running on
> the filesystem at the time, or attempting to mount it?
>
> Also, do you have some very large complex directories, or very large
> files out there?
>
> Steve
>
At the time of the above crash the system had been running for ~14 hours
with all filesystems mounted. There is a single XFS filesystem (/scratch),
a number of 'ext2' filesystems (/, /boot, /usr, /var) and a 'reiserfs'
filesystem (/usr/local). The XFS filesystem is ~560 Gbytes in size and is
~90% full; there are a small number of ~2 Gbyte files on it. The
majority of the I/O activity is NFS access to/from the XFS filesystem
(/scratch). There was also a Legato Networker session doing a
full backup of /scratch at the time of the crash. The last few crashes
have been during full backups, though this is not always the case; I'm
inclined to think that it is simply more likely to fail during a backup
due to the increased filesystem activity.
There's nothing strange that I can see in the directory structure.
/scratch is implemented on a GForce RI hardware RAID 5 unit that is
connected to the server via a QLogic QLA2200 FC HBA.
Ian
--
////////////////////////////////////////////////////////////////////////////
Ian Hardy
Research Services
Computing Services email: idh@xxxxxxxxxxx
Southampton University i.d.hardy@xxxxxxxxxxx
Southampton S017 1BJ, UK.
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\