
Re: "Corruption of in-memory data"



On Mon, 20 May 2002, Sidik Isani wrote:

> Hello -
>
>   Thanks again for your help several months ago with xfs_growfs!
>   Now we have a new problem . . .
>   I was trying to resolve the slow performance issue of some of our
>   RAID5+XFS (2.4.16 kernel) by upgrading to 2.4.18 and XFS-1.1, and
>   reformatting with an external log, as suggested in the FAQ.

2.4.18 also has much better RAID5 performance with an internal log.
I also believe that the fixes that went into the CVS tree for
multiple block size support make RAID5 work better.

I don't think anyone has benchmarked this yet, but the performance
difference between an internal and an external log is probably not as
large anymore.

>    During resyncing, one of the disks failed and the raid 5 went into
>   degraded mode (no other disks had errors).  After a clean reboot, still
>   running in degraded mode, (shouldn't matter to XFS, but I thought I'd
>   mention it) everything seemed OK until I tried to remove a directory:
>
> May 19 19:03:51 ike kernel: xfs_force_shutdown(md(9,0),0x8) called from line 1039 of file xfs_trans.c.  Return address = 0xc01ed751
> May 19 19:03:51 ike kernel: Corruption of in-memory data detected.  Shutting down filesystem: md(9,0)
> May 19 19:03:51 ike kernel: Please umount the filesystem, and rectify the problem(s)
> [System froze!]

I think something went wrong the moment the RAID5 went into degraded
mode. This shouldn't happen, but you are wise not to repair the fs
automatically.
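
If it helps to confirm which arrays are running degraded before touching
the filesystem, /proc/mdstat can be checked. A minimal sketch, assuming
the usual "[n/m] [UU_]" status line in /proc/mdstat; the function name
and the sample text are made up for illustration and are not output from
your machine:

```python
import re

# Illustrative sample only; real contents come from /proc/mdstat.
SAMPLE_MDSTAT = """\
Personalities : [raid5]
md0 : active raid5 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      53761152 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
md1 : active raid1 sdf1[1] sde1[0]
      1048512 blocks [2/2] [UU]
"""

def degraded_arrays(mdstat_text):
    """Return names of md arrays whose status line shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            # Start of a new array entry, e.g. "md0 : active raid5 ..."
            current = header.group(1)
            continue
        # Status looks like "[4/3] [UUU_]": configured/active counts,
        # then one 'U' per working member and '_' per missing one.
        status = re.search(r"\[(\d+)/(\d+)\]\s*\[([U_]+)\]", line)
        if current and status:
            configured = int(status.group(1))
            active = int(status.group(2))
            if active < configured or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_MDSTAT))
```

On a live system you would pass in the text read from /proc/mdstat
instead of the sample string.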

>   to replay the log, and ran xfs_repair -n.  The output is included below.
>   I'm thinking of trying again later today, maybe with an internal
>   log again (which should be usable now, with 2.4.18, right?)  But the
>   crash above worries me.  Please let me know if there are any other
>   tests I should run on the crashed filesystem before starting over.

As stated above, RAID5 with an internal log should be OK. I still have a
RAID5 system with an internal log that did get a bit faster with the
change to 2.4.18.

Since it looks like filesystem corruption, it is best to be careful
from here on. Making a backup now would be a good idea.

Cheers
Seth