Re: XFS crashes on VMs

To: Shrinand Javadekar <shrinand@xxxxxxxxxxxxxx>
Subject: Re: XFS crashes on VMs
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Wed, 27 May 2015 19:53:18 -0500
Cc: "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <556666B2.8060301@xxxxxxxxxxx>
References: <CABppvi4FZKTsu22uk6nOSaShJOUyWg7cO6h-i4YekF4MLKH8RQ@xxxxxxxxxxxxxx> <8BF495B8-F444-415A-B7BF-5E1961C75817@xxxxxxxxxxx> <CABppvi5ODWyjMw0pouAQ__SQ-aGb48PEyTCPU7Dt65Z1DhKmVA@xxxxxxxxxxxxxx> <556666B2.8060301@xxxxxxxxxxx>
And did anything else "interesting" happen prior to the detection?

> On May 27, 2015, at 7:52 PM, Eric Sandeen <sandeen@xxxxxxxxxxx> wrote:
> 
> You'll need to try to narrow down how it happened.
> 
> The hexdumps in the logs show what data was in the buffer; in one case it was 
> ascii, and was definitely not xfs metadata.
> 
> Either:
> 
> a) xfs wrote the wrong metadata - almost impossible, because we verify the 
> data on write in the same way as we do on read
> 
> b) xfs read the wrong block due to other metadata corruption.
> 
> c) something corrupted the storage after it was written
> 
> d) the storage returned the wrong data on a read request ...
> 
> e) ???
> 
> Did you save the xfs_repair output?  That might offer more clues.
> 
> Unless you can reproduce it, it'll be hard to come up with a definitive root 
> cause... can you try?
> 
> -Eric
> 
> 
>> On 5/27/15 7:03 PM, Shrinand Javadekar wrote:
>> Thanks Eric,
>> 
>> We ran xfs_repair and were able to get it back into a running state.
>> This is fine for test & dev, but in production it won't be
>> acceptable. What other data do we need to get to the bottom of this?
>> 
>>> On Wed, May 27, 2015 at 4:27 PM, Eric Sandeen <sandeen@xxxxxxxxxxx> wrote:
>>> That's not a crash. That is xfs detecting on disk corruption which likely 
>>> happened at some time prior. You should unmount and run xfs_repair, 
>>> possibly with -n first if you would like to do a dry run to see what it 
>>> might do.  If you get fresh corruption after a full repair, then that 
>>> becomes more interesting. It's possible that you have a problem with the 
>>> underlying block layer or it's possible that it is an xfs bug -  but I 
>>> think this is not something that we have seen before.
>>> 
>>> Eric
>>> 
>>>> On May 27, 2015, at 6:06 PM, Shrinand Javadekar <shrinand@xxxxxxxxxxxxxx> 
>>>> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> I am running Openstack Swift in a VM with XFS as the underlying
>>>> filesystem. This is generating a metadata heavy workload on XFS.
>>>> Essentially, it is creating a new directory and a new file (256KB) in
>>>> that directory. This file has extended attributes of size 243 bytes.
>>>> 
>>>> I am seeing the following two crashes of the machine:
>>>> 
>>>> http://pastie.org/pastes/10210974/text?key=xdmfvaocvawnyfmkb06zg
>>>> 
>>>> AND
>>>> 
>>>> http://pastie.org/pastes/10210975/text?key=rkiljsdaucrk7frprzgqq
>>>> 
>>>> I have only seen these when running in a VM. We have run several tests
>>>> on physical servers but have never seen these problems.
>>>> 
>>>> Are there any known issues with XFS running on VMs?
>>>> 
>>>> Thanks in advance.
>>>> -Shri
>>>> 
>>>> _______________________________________________
>>>> xfs mailing list
>>>> xfs@xxxxxxxxxxx
>>>> http://oss.sgi.com/mailman/listinfo/xfs

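The quoted message notes that the hexdumps in the logs show what was in the buffer, and that in one case it was ASCII rather than xfs metadata. A minimal Python sketch of that kind of check is below. The function names, the 0.95 threshold, and the magic table are assumptions made for illustration: the table is only a small subset of XFS on-disk magic values, and this is a rough heuristic for eyeballing a dumped buffer, not how the kernel verifiers work.

```python
import string

# Illustrative subset of XFS on-disk magic values (not exhaustive).
XFS_MAGICS = {
    b"XFSB": "superblock",
    b"XAGF": "AG free space header",
    b"XAGI": "AG inode header",
    b"ABTB": "free space btree (by block)",
    b"ABTC": "free space btree (by count)",
    b"IABT": "inode btree",
    b"BMAP": "bmap btree",
}

# Byte values that count as printable ASCII (includes tab, CR, LF).
PRINTABLE = set(string.printable.encode("ascii"))


def printable_ratio(buf: bytes) -> float:
    """Fraction of bytes in buf that are printable ASCII."""
    if not buf:
        return 0.0
    return sum(b in PRINTABLE for b in buf) / len(buf)


def classify(buf: bytes, threshold: float = 0.95) -> str:
    """Rough guess at what a dumped metadata buffer contains.

    threshold is a hypothetical cutoff chosen for this sketch.
    """
    kind = XFS_MAGICS.get(bytes(buf[:4]))
    if kind is not None:
        return "xfs metadata: " + kind
    if printable_ratio(buf) >= threshold:
        return "looks like ASCII text"
    return "unrecognized"
```

For example, `classify(b"XFSB" + bytes(508))` reports a superblock, while a buffer full of log or HTTP text reports ASCII. A buffer that is mostly text where metadata was expected points toward options (b) through (d) in the quoted list, but it does not say which one.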