Re: silent corruption after kernel panic?

To: "Assarsson, Emil" <Emil.Assarsson@xxxxxxxxxxxxxxxx>
Subject: Re: silent corruption after kernel panic?
From: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date: Mon, 19 Sep 2011 10:27:29 -0400
Cc: "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>
In-reply-to: <2BF070A7A2375D46BA1B6087F8D5DCB68BEA722B40@xxxxxxxxxxxxxxxxxxxxxxx>
References: <2BF070A7A2375D46BA1B6087F8D5DCB68BEA722B40@xxxxxxxxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, Sep 19, 2011 at 02:28:23PM +0200, Assarsson, Emil wrote:
> Hi,
> 
> We are running a 20TB XFS filesystem on top of LVM2 and SAN storage (HP
> Open-V) with multipathd. Ubuntu Lucid. The disk write cache is enabled
> and we use mount options rw.


> Sep 16 06:40:34 seldlnx034 kernel: [54607.977261] XFS internal error
> XFS_WANT_CORRUPTED_RETURN at line 381 of
> file /build/buildd/linux-2.6.32/fs/xfs/xfs_alloc.c.  Caller
> 0xffffffffa01eed36
> Sep 16 06:40:34 seldlnx034 kernel: [54607.996676]  [<ffffffffa0215383>]
> xfs_error_report+0x43/0x50 [xfs]
> Sep 16 06:40:34 seldlnx034 kernel: [54607.996689]

This (corrupted allocation btrees) is a typical indication of missing
cache flushes.

Given that before ~2.6.35 LVM/device mapper was not able to pass through
cache flush requests, that is your most likely culprit.  A repair will
rebuild the free space btrees.  After that, make sure to keep the write
caches disabled all the way down the stack.
