
RE: 2.4.9 is bad

To: "'jacob berkman'" <jacob@xxxxxxxxxx>, Eric Sandeen <sandeen@xxxxxxx>
Subject: RE: 2.4.9 is bad
From: "Mostek, Jim" <JMostek@xxxxxxxxxxx>
Date: Thu, 4 Oct 2001 13:06:31 -0500
Cc: Florin Andrei <florin@xxxxxxx>, linux-xfs <linux-xfs@xxxxxxxxxxx>
Sender: owner-linux-xfs@xxxxxxxxxxx

We've been seeing some corruption with 2.4.9 without XFS.

In the last one I looked at closely, a thread had just allocated
an inode via alloc_inode, which gets it from the inode_cachep kmem_cache_t.
The inode sits within a page that is full of characters that were
written by syslog to /var/log/messages.

I chased the inode_cachep slab_t structures, and there is one next pointer
that points to the start of a page (the next page after the one the inode
is in). This is wrong, as these pointers are offset from the start of a page.
I followed all the prev pointers; all the slab_t's are correct,
and I can see where the bad next pointer is. For this problem, many
fields in the inode are OK but the dentry list is bad. It oopsed in d_instantiate.
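
The check described above can be sketched as a small script run over
pointer values read out of a crash dump. This is only an illustration,
not a kernel tool: the links table, the find_bad_links name, and the
4 KiB page size are all assumptions made for the example. It walks the
prev pointers (the ones that looked intact) and flags any next pointer
that lands exactly on a page boundary or whose target does not point
back.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def is_page_aligned(addr):
    """True if addr is exactly the start of a page."""
    return addr % PAGE_SIZE == 0


def find_bad_links(links, head):
    """Walk prev pointers from head and report suspicious next pointers.

    links is a hypothetical table mapping a node address to its
    (next, prev) pair, as copied by hand from a dump.
    Returns a list of (node_address, bad_next_pointer) pairs.
    """
    bad = []
    seen = set()
    node = head
    while node in links and node not in seen:
        seen.add(node)
        nxt, prev = links[node]
        # Suspicious if next sits on a page boundary, or if the node
        # it names does not have a prev pointer leading back here.
        if is_page_aligned(nxt) or links.get(nxt, (None, None))[1] != node:
            bad.append((node, nxt))
        node = prev  # follow the chain that is believed to be intact
    return bad
```

With a three-node circular list whose middle node has a next pointer
overwritten with a page-aligned address, only that node is reported.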

We have had a few scattered oopses over a few releases (2.4.7, 2.4.8, and
now 2.4.9). This is the first one I really chased down in detail, far enough
to see that something looked wrong in the inode_cachep. I'm wondering
if there isn't a bug somewhere in the way the slabs are freed (if all elements
are no longer available) racing with a corresponding allocate. Or, maybe someone
freed an inode twice or ...
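
To illustrate why a double free is a plausible suspect, here is a toy
sketch. It is not the kernel's slab code: ToyCache and
demo_double_free are hypothetical names, and the free list is a plain
LIFO stack with no sanity checks. Releasing the same object twice puts
it on the free list twice, so two later allocations hand out the same
object to two different owners.

```python
class ToyCache:
    """Toy fixed-size object cache with a LIFO free list (illustration only)."""

    def __init__(self, nobjs):
        # Objects are represented by their index.
        self.free = list(range(nobjs))

    def alloc(self):
        return self.free.pop()

    def release(self, obj):
        # Deliberately no double-free check.
        self.free.append(obj)


def demo_double_free():
    cache = ToyCache(4)
    obj = cache.alloc()
    cache.release(obj)
    cache.release(obj)       # the bug: same object freed twice
    first = cache.alloc()
    second = cache.alloc()   # same object handed out again
    return first, second
```

Both callers then write into the same memory, which from the outside
looks like random corruption of an otherwise valid object.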

Anyway, just chiming in that we are seeing memory corruption on 2.4.9, too.


-----Original Message-----
From: jacob berkman [mailto:jacob@xxxxxxxxxx]
Sent: Thursday, October 04, 2001 12:30 PM
To: Eric Sandeen
Cc: Florin Andrei; linux-xfs
Subject: Re: 2.4.9 is bad

On Mon, 2001-10-01 at 18:10, Eric Sandeen wrote:
> Florin Andrei wrote:
> >
> > Looks like there are some serious problems with 2.4.9
> > This is what i get from a system running XFS-1.0.1 on linux-2.4.9, RAID
> > hardware (DAC960):
> >
> > xfs_force_shutdown(dac960(48,4),0x8) called from line 4072 of file
> > xfs_bmap.c.  Return address = 0xc01b8b9c
> > Corruption of in-memory data detected.  Shutting down filesystem:
> > dac960(48,4)
> > Please umount the filesystem, and rectify the problem(s)
> If anyone else is experiencing these "Corruption of in-memory data detected"
> errors, please let me know

like i had said in an earlier mail (2 or 3 weeks ago), i had also gotten
this (on my root inode it appears) and have subsequently lost that
partition to lost+found.

