Re: file corruption issue

To: Patrick Shirkey <pshirkey@xxxxxxxxxxxxxxxxx>
Subject: Re: file corruption issue
From: Ben Myers <bpm@xxxxxxx>
Date: Mon, 14 May 2012 09:29:48 -0500
Cc: xfs@xxxxxxxxxxx
In-reply-to: <59946.110.174.53.110.1336959906.squirrel@xxxxxxxxxxxxxxxxx>
References: <51509.110.174.53.110.1336699622.squirrel@xxxxxxxxxxxxxxxxx> <20120511165012.GC16099@xxxxxxx> <59946.110.174.53.110.1336959906.squirrel@xxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)

Hey Patrick,

On Mon, May 14, 2012 at 03:45:06AM +0200, Patrick Shirkey wrote:
> 
> On Fri, May 11, 2012 6:50 pm, Ben Myers wrote:
> > On Fri, May 11, 2012 at 03:27:02AM +0200, Patrick Shirkey wrote:
> >> I have some HP machines running centos:
> >>
> >> kernel 2.6.32-042stab049.6
> >> AMD Opteron(tm) Processor 6180 SE
> >> RAM:   528 GB
> >> RAID bus controller: Hewlett-Packard Company Smart Array G6 controllers
> >>
> >> We have experienced some kernel crashes due to a kernel bug with
> >> interleaving ram on this hardware which require hard reset of the
> >> machines.
> >>
> >> After reboot we are finding that there is severe file corruption on the
> >> xfs file system where TBs of readonly databases are getting partially or
> >> fully truncated.
> >>
> >> Has anyone come across this or similar?
> >
> > This rings a bell for me but I can't be certain.  Could you provide a
> > metadump?
> >
> 
> The machines are live so we have already restored the data several times.
> Will a metadump from the existing file system be useful or do you need it
> post crash?

Well... one of each would be best.  It might be helpful to compare the block
map from before the crash with the block map after the crash for one of the
read-only corrupted databases.
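
As a rough sketch of that comparison, assuming the block map of one database
file was saved as text before and after the crash (for instance the output of
`xfs_bmap -v`), something like the following Python would show which extent
lines changed. The file names and the function name here are made up for
illustration, and the diff is purely textual; it does not interpret the
extent fields.

```python
import difflib


def diff_block_maps(before: str, after: str) -> list[str]:
    """Return unified-diff lines between two saved block-map listings.

    Both arguments are the full text of a saved block map. The comparison
    is line-based, so any change to an extent line shows up as a removed
    line ("-") and/or an added line ("+").
    """
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="bmap-before.txt",  # hypothetical label, not a real path
            tofile="bmap-after.txt",     # hypothetical label, not a real path
            lineterm="",
        )
    )


def load_and_diff(before_path: str, after_path: str) -> list[str]:
    """Read two saved listings from disk and diff them."""
    with open(before_path) as f_before, open(after_path) as f_after:
        return diff_block_maps(f_before.read(), f_after.read())
```

An empty result means the two listings are identical; a truncated file would
be expected to show extent lines present before the crash and missing after.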

Regards,
        Ben