XFS crashes on VMs
Shrinand Javadekar
shrinand at maginatics.com
Fri Jun 19 13:34:36 CDT 2015
I hit this problem again and captured the output of all the steps
while repairing the filesystem. Here's the crash:
http://pastie.org/private/prift1xjcc38s0jcvehvew
And the output of the xfs_repair steps (also attached if needed):
http://pastie.org/private/gvq3aiisudfhy69ezagw
Hope this can provide some insights.
-Shri
On Thu, May 28, 2015 at 11:08 AM, Shrinand Javadekar
<shrinand at maginatics.com> wrote:
> We'll try and reproduce this and capture the output of xfs_repair when
> it happens next. Will keep an eye on what else was happening in the
> infrastructure when it happens.
>
> FWIW, we've seen this in a local VMware environment as well as when we
> were running on Amazon EC2 instances. So it doesn't seem hypervisor
> specific.
>
> On Wed, May 27, 2015 at 5:53 PM, Eric Sandeen <sandeen at sandeen.net> wrote:
>> And did anything else "interesting" happen prior to the detection?
>>
>>> On May 27, 2015, at 7:52 PM, Eric Sandeen <sandeen at sandeen.net> wrote:
>>>
>>> You'll need to try to narrow down how it happened.
>>>
>>> The hexdumps in the logs show what data was in the buffer; in one case it was ascii, and was definitely not xfs metadata.
>>>
>>> Either:
>>>
>>> a) xfs wrote the wrong metadata - almost impossible, because we verify the data on write in the same way as we do on read
>>>
>>> b) xfs read the wrong block due to other metadata corruption.
>>>
>>> c) something corrupted the storage after it was written
>>>
>>> d) the storage returned the wrong data on a read request ...
>>>
>>> e) ???
>>>
>>> Did you save the xfs_repair output? That might offer more clues.
>>>
>>> Unless you can reproduce it, it'll be hard to come up with a definitive root cause... can you try?
>>>
>>> -Eric
>>>
>>>
>>>> On 5/27/15 7:03 PM, Shrinand Javadekar wrote:
>>>> Thanks Eric,
>>>>
>>>> We ran xfs_repair and were able to get it back into a running state.
>>>> This is fine for test & dev, but in production it won't be
>>>> acceptable. What other data do we need to get to the bottom of this?
>>>>
>>>>> On Wed, May 27, 2015 at 4:27 PM, Eric Sandeen <sandeen at sandeen.net> wrote:
>>>>> That's not a crash. That is xfs detecting on-disk corruption which likely happened at some time prior. You should unmount and run xfs_repair, possibly with -n first if you would like to do a dry run to see what it might do. If you get fresh corruption after a full repair, then that becomes more interesting. It's possible that you have a problem with the underlying block layer or it's possible that it is an xfs bug - but I think this is not something that we have seen before.
>>>>>
>>>>> Eric
>>>>>
>>>>>> On May 27, 2015, at 6:06 PM, Shrinand Javadekar <shrinand at maginatics.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am running OpenStack Swift in a VM with XFS as the underlying
>>>>>> filesystem. This is generating a metadata-heavy workload on XFS.
>>>>>> Essentially, it is creating a new directory and a new file (256KB) in
>>>>>> that directory. This file has extended attributes of size 243 bytes.
>>>>>>
>>>>>> I am seeing the following two crashes of the machine:
>>>>>>
>>>>>> http://pastie.org/pastes/10210974/text?key=xdmfvaocvawnyfmkb06zg
>>>>>>
>>>>>> AND
>>>>>>
>>>>>> http://pastie.org/pastes/10210975/text?key=rkiljsdaucrk7frprzgqq
>>>>>>
>>>>>> I have only seen these when running in a VM. We have run several tests
>>>>>> on physical servers but have never seen these problems.
>>>>>>
>>>>>> Are there any known issues with XFS running on VMs?
>>>>>>
>>>>>> Thanks in advance.
>>>>>> -Shri
>>>>>>
>>>>>> _______________________________________________
>>>>>> xfs mailing list
>>>>>> xfs at oss.sgi.com
>>>>>> http://oss.sgi.com/mailman/listinfo/xfs
>>>>
>>>
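For anyone trying to reproduce this, the workload described in the original report (a new directory, a new 256KB file in that directory, and 243 bytes of extended attributes on the file) could be approximated roughly as below. This is only a minimal sketch and not Swift's actual I/O pattern: the attribute name, the directory naming scheme, the random payload, and the fsync after each write are all assumptions made for illustration.

```python
import os
import tempfile

DATA_SIZE = 256 * 1024             # 256KB file, as described in the report
XATTR_SIZE = 243                   # xattr payload size, as described in the report
XATTR_NAME = "user.test.metadata"  # hypothetical attribute name (assumption)


def create_object(root, index):
    """Create one new directory containing one new file with an xattr."""
    obj_dir = os.path.join(root, f"obj-{index:08d}")  # hypothetical naming
    os.mkdir(obj_dir)
    path = os.path.join(obj_dir, "data")
    with open(path, "wb") as f:
        f.write(os.urandom(DATA_SIZE))
        f.flush()
        os.fsync(f.fileno())  # assumption: force the data out to storage
    try:
        os.setxattr(path, XATTR_NAME, b"x" * XATTR_SIZE)
    except (AttributeError, OSError):
        # Not every platform or filesystem supports user xattrs; this
        # sketch simply skips the attribute there instead of failing.
        pass
    return path


def run(root, count):
    """Create `count` directory/file pairs under `root`; return the file paths."""
    return [create_object(root, i) for i in range(count)]


if __name__ == "__main__":
    # For a real test, replace the scratch directory with a path on the
    # XFS mount under test and raise the count substantially.
    with tempfile.TemporaryDirectory() as scratch:
        paths = run(scratch, 10)
        print(len(paths), "objects created under", scratch)
```

Running this against a directory on the affected mount, then unmounting and checking with xfs_repair -n, is one way to see whether fresh corruption shows up after a full repair.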
-------------- next part --------------
A non-text attachment was scrubbed...
Name: xfs_crash
Type: application/octet-stream
Size: 2073 bytes
Desc: not available
URL: <http://oss.sgi.com/pipermail/xfs/attachments/20150619/1c3aafe5/attachment.obj>