
Re: [xfs_check Out of memory: ]

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [xfs_check Out of memory: ]
From: Arkadiusz Miśkiewicz <arekm@xxxxxxxx>
Date: Sun, 29 Dec 2013 12:57:13 +0100
Cc: xfs@xxxxxxxxxxx, "Stor??" <289471341@xxxxxx>, Jeff Liu <jeff.liu@xxxxxxxxxx>
In-reply-to: <20131229095033.GL20579@dastard>
References: <tencent_3F12563342ED1D4E049D1123@xxxxxx> <201312280020.39244.arekm@xxxxxxxx> <20131229095033.GL20579@dastard>
User-agent: KMail/1.13.7 (Linux/3.12.6-dirty; KDE/4.12.0; x86_64; ; )
On Sunday 29 of December 2013, Dave Chinner wrote:
> On Sat, Dec 28, 2013 at 12:20:39AM +0100, Arkadiusz Miśkiewicz wrote:
> > On Friday 27 of December 2013, Dave Chinner wrote:
> > > On Fri, Dec 27, 2013 at 09:07:22AM +0100, Arkadiusz Miśkiewicz wrote:
> > > > On Friday 27 of December 2013, Jeff Liu wrote:
> > > > > On 12/27 2013 14:48 PM, Stor?? wrote:
[...]
> > > > This reminds me a question...
> > > > 
> > > > Could xfs_repair store its temporary data (some of that data, the
> > > > biggest part) on disk instead of in memory?
> > > 
> > > Where on disk?
> > 
> > In a directory/file that I'll tell it to use (since I usually have a few
> > xfs filesystems on a single server and so far only one at a time breaks).
> 
> How is that any different from just adding swap space to the server?

It's different in that it allows other services to keep working while repair is 
in progress. If swap gets eaten, the entire server goes down on its knees. 
Keeping things on disk would mean that other services work uninterrupted and 
repair gets slow (but works).

> > Could xfs_repair tell the kernel that this data should always end up on
> > swap first (allowing other programs/daemons to use regular memory),
> > perhaps? (I don't know of a kernel interface that would allow that,
> > though.) That would be a somewhat half-baked solution.
> 
> It's up to the kernel to manage what gets swapped and what doesn't.

I was hoping for some interface like fadvise's FADV_DONTNEED, but there is no 
similar thing for malloc'ed memory, I guess.

> I suppose you could use control groups to constrict the RAM
> xfs_repair uses, but how to configure such policy is way outside my
> area of expertise.

Hmm, I'll have to try that; maybe it would work. Like setting up a cgroup with 
an 8GB RAM limit and 40GB of swap. Other services would have their RAM 
available. Good hint.
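A sketch of that setup using the modern cgroup v2 interface (2013-era kernels used cgroup v1 knobs such as memory.limit_in_bytes and memory.memsw.limit_in_bytes instead); the group name "repair" and the device path are placeholders, and this needs root:

```shell
# Create a cgroup capped at 8GB of RAM but allowed unlimited swap,
# so xfs_repair spills to swap while other services keep their RAM.
mkdir /sys/fs/cgroup/repair
echo $((8 * 1024 * 1024 * 1024)) > /sys/fs/cgroup/repair/memory.max
echo max > /sys/fs/cgroup/repair/memory.swap.max

# Move the current shell into the group, then run repair from it;
# children inherit the membership.
echo $$ > /sys/fs/cgroup/repair/cgroup.procs
xfs_repair /dev/sdX1    # placeholder device
```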

> > Right. So the only "easy" task left is finding someone who understands
> > the code and can write such an interface. Anyone?
> > 
> > IMO RAM usage is a real problem for xfs_repair, and there has to be some
> > upstream solution other than the "buy more" (and waste more) approach.
> 
> I think you are forgetting that developer time is *expensive* and
> *scarce*.

I'm aware of that, and I'm not expecting any developer to implement this 
(unless some developer hits the same problems and has hardware constraints ;)

> This is essentially a solved problem: An SSD in a USB3
> enclosure as a temporary swap device is by far the most cost
> effective way to make repair scale to arbitrary amounts of metadata.
> It certainly scales far better than developer time and testing
> resources...

Ok.

I'm not saying that everyone should now start adding an "on disk" db to 
xfs_repair. I just think that that solution would work regardless of hardware, 
and would make it possible to repair huge filesystems (with tons of metadata) 
even on low-memory machines (without having to change hardware).

Whether there is interest among developers in implementing this (obviously 
not) is another matter, and shouldn't affect discussing the approach.

What is more interesting to me is talking about possible problems with the on-
disk approach, not looking for a solution to my particular case.

> Cheers,
> 
> Dave.

PS. I'll go with 2x64GB or 2x128GB SSDs in RAID1 as swap space for my case.

-- 
Arkadiusz Miśkiewicz, arekm / maven.pl
