
Re: Oopses in kfree



Steve,
> 
> On Thu, 2002-02-14 at 04:40, Ian D. Hardy wrote:
> > Steve+
> > 
> > I enabled the 'CONFIG_DEBUG_SLAB' option in the kernel (taking a recent
> > CVS of the 2.4.17 XFS from 12th Feb) and have had the following Oops (which I
> > hope means more to you than to me!).
> > 
> > 
> > Feb 14 00:41:31 blue00 kernel: kfree: bad ptr f8f3d000h.
> > Feb 14 00:41:31 blue00 kernel: invalid operand: 0000 
> > Feb 14 00:41:32 blue00 kernel: CPU:    1 
> > Feb 14 00:41:32 blue00 kernel: EIP:    0010:[kmem_cache_free+54/128]    Not
> > tainted 
> > Feb 14 00:41:32 blue00 kernel: EFLAGS: 00010086 
> > Feb 14 00:41:32 blue00 kernel: eax: 0000001d   ebx: 00e3cf40   ecx: 0000002e  
> > edx: 00000000 
> > Feb 14 00:41:32 blue00 kernel: esi: d42520e4   edi: f8f3d000   ebp: 00000000  
> > esp: f7ee1e30 
> > Feb 14 00:41:32 blue00 kernel: ds: 0018   es: 0018   ss: 0018 
> > Feb 14 00:41:32 blue00 kernel: Process kswapd (pid: 5, stackpage=f7ee1000) 
> > Feb 14 00:41:32 blue00 kernel: Stack: c02b3322 f8f3d000 d4252130 d42520e4
> > 00000000 00000000 00000286 c74bfecc  
> > Feb 14 00:41:32 blue00 kernel:        c01f6f86 f8f3d000 c01cabfe f8f3d000
> > 00014460 d42520e4 d42520e4 00000000  
> > Feb 14 00:41:32 blue00 kernel:        c01cac6f d42520e4 00000000 d42520e4
> > c01c78ae d42520e4 00000001 c01e649a  
> > Feb 14 00:41:32 blue00 kernel: Call Trace: [change_termios+118/400]
> > [xlog_recover_do_efi_trans+158/192] [xlog_recover_do_efd_trans+79/256]
> > [xlog_regrant_write_log_space+94/784] [linvfs_follow_link+10/240]  
> > Feb 14 00:41:32 blue00 kernel: Code: 0f 0b 83 c4 08 8b 15 8c 85 3f c0 8b 2c 1a
> > 89 7c 24 14 b8 00  
> > Using defaults from ksymoops -t elf32-i386 -a i386
> > 
> > Code;  00000000 Before first symbol
> > 00000000 <_EIP>:
> > Code;  00000000 Before first symbol
> >    0:   0f 0b                     ud2a   
> > Code;  00000002 Before first symbol
> >    2:   83 c4 08                  add    $0x8,%esp
> > Code;  00000004 Before first symbol
> >    5:   8b 15 8c 85 3f c0         mov    0xc03f858c,%edx
> > Code;  0000000a Before first symbol
> >    b:   8b 2c 1a                  mov    (%edx,%ebx,1),%ebp
> > Code;  0000000e Before first symbol
> >    e:   89 7c 24 14               mov    %edi,0x14(%esp,1)
> > Code;  00000012 Before first symbol
> >   12:   b8 00 00 00 00            mov    $0x0,%eax
> > 
> > 
> > Ian
> > 
> 
> Well, it has bits of interesting stack in there. Were you running on
> the filesystem at the time, or attempting to mount it?
> 
> Also, do you have some very large complex directories, or very large
> files out there?
> 
> Steve
> 

At the time of the above crash the system had been running for ~14 hours
with all filesystems mounted. There is a single XFS filesystem (/scratch),
a number of 'ext2' filesystems (/, /boot, /usr, /var) and a 'reiserfs'
filesystem (/usr/local). The XFS filesystem is ~560 Gbytes in size and is
~90% full; it holds a small number of ~2 Gbyte files. The majority of the
I/O activity is NFS access to/from the XFS filesystem (/scratch). A Legato
Networker session was also doing a full backup of /scratch at the time of
the crash. The last few crashes have been during full backups, though this
is not always the case; I'm inclined to think that it is simply more likely
to fail during a backup because of the increased filesystem activity.

I can see nothing strange in the directory structure.

(/scratch is implemented on a GForce RI hardware RAID 5 unit that is
connected to the server via a QLogic QLA2200 FC HBA.)

Ian

-- 

////////////////////////////////////////////////////////////////////////////
Ian Hardy                                   
Research Services                          
Computing Services                          email: idh@soton.ac.uk    
Southampton University                             i.d.hardy@soton.ac.uk
Southampton  S017 1BJ, UK.
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\