
Re: Failing XFS filesystem underlying Ceph OSDs

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: Failing XFS filesystem underlying Ceph OSDs
From: Alex Gorbachev <ag@xxxxxxxxxxxxxxxxxxx>
Date: Sat, 4 Jul 2015 10:46:24 -0400
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20150703235141.GQ7943@dastard>
References: <CADb9451tB71D3XCqcOkDxzpzbdEHqwj7XCZUpL8yg1DzYbpwBw@xxxxxxxxxxxxxx> <20150703235141.GQ7943@dastard>
Hello Dave, thank you for the response. I got some recommendations on the ceph-users list that essentially pointed to the problem with vm.swappiness=0 and its new behavior, described here: https://www.percona.com/blog/2014/04/28/oom-relation-vm-swappiness0-new-kernel/

Basically, setting it to 0 creates these OOM conditions because nothing ever gets swapped out. So I changed these settings right away:

sysctl vm.swappiness=20 (can probably be 1 as per article)

sysctl vm.min_free_kbytes=262144
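
The two sysctl commands above only change the running kernel, so the values are lost on reboot. A minimal sketch of how they might be kept, assuming a sysctl.conf-style drop-in file is acceptable on these nodes. The helper names render_vm_sysctl and show_current_vm are hypothetical, made up for illustration, and the numbers are just the ones from this message:

```shell
#!/bin/sh
# Sketch only: print the two settings in sysctl.conf syntax so they can be
# written to a configuration file and reapplied after a reboot.
# render_vm_sysctl is a hypothetical helper name, not part of any tool.
render_vm_sysctl() {
    swappiness="$1"
    min_free_kbytes="$2"
    printf 'vm.swappiness = %s\n' "$swappiness"
    printf 'vm.min_free_kbytes = %s\n' "$min_free_kbytes"
}

# Print the values the running kernel currently uses. The /proc/sys/vm
# files are readable without root on Linux; missing files are skipped.
show_current_vm() {
    for key in swappiness min_free_kbytes; do
        if [ -r "/proc/sys/vm/$key" ]; then
            printf 'vm.%s = %s\n' "$key" "$(cat "/proc/sys/vm/$key")"
        fi
    done
}

# Values taken from the commands above.
render_vm_sysctl 20 262144
```

As root, the output could be redirected into a file under /etc/sysctl.d/ (the exact file name is a local choice) and loaded with sysctl -p followed by that file's path. show_current_vm can then be used to confirm what is active.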


So far no issues, but I need to wait a week to see if anything shows up. Thank you for reviewing the error codes.


Alex


On Fri, Jul 3, 2015 at 7:51 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
On Fri, Jul 03, 2015 at 05:07:29AM -0400, Alex Gorbachev wrote:
> Hello, we are seeing this and similar errors on multiple Supermicro nodes
> running Ceph. OS is Ubuntu 14.04.2 with kernel 4.1
>
> Thank you for any info and troubleshooting advice.

Nothing to suggest that this is an XFS problem. Memory reclaim
triggered by network stack memory pressure is causing inode
eviction. While removing the page cache it's falling over in
the generic truncate code doing a radix tree lookup. That's all
generic code - XFS never touches the page cache radix tree directly.

I haven't seen this before - is this a new problem since you
upgraded your kernel to 4.1? Is it repeatable? If yes to both, then
a bisect may be in order to isolate the problematic commit...

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
