Re: 2.4.20pre5aa2

To: Andrea Arcangeli <andrea@xxxxxxx>
Subject: Re: 2.4.20pre5aa2
From: Samuel Flory <sflory@xxxxxxxxxxxx>
Date: Thu, 12 Sep 2002 18:18:51 -0700
Cc: Austin Gonyou <austin@xxxxxxxxxxxxxxx>, Christian Guggenberger <christian.guggenberger@xxxxxxxxxxxxxxxxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, linux-xfs@xxxxxxxxxxx
References: <20020911201602.A13655@pc9391.uni-regensburg.de> <1031768655.24629.23.camel@UberGeek.coremetrics.com> <20020911184111.GY17868@dualathlon.random> <3D81235B.6080809@rackable.com> <20020913002316.GG11605@dualathlon.random>
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020826
Andrea Arcangeli wrote:

On Thu, Sep 12, 2002 at 04:29:31PM -0700, Samuel Flory wrote:


Your patch seems to solve only some of the xfs issues for me. Before the patch my system hung when booting. This only occurred when I had xfs compiled into the kernel. After patching, things seemed fine, but during "dbench 32" the system locked up. Upon rebooting and attempting to mount the filesystem I got this:
XFS mounting filesystem md(9,2)
Starting XFS recovery on filesystem: md(9,2) (dev: 9/2)
kernel BUG at page_buf.c:578!
<and so on>


PS- The results of ksymoops are attached.



That seems to be a bug in xfs: it calls BUG() if vmap fails. It must not BUG(); it must return -ENOMEM to userspace instead, or it can try to recollect and release some of the other vmalloced entries. Most probably you ran into an address space shortage, not a real RAM shortage, so to work around it you can recompile with CONFIG_2G and it will probably work. Dropping the gap page in vmalloc may also help work around it (there's no config option for it, though). It could also be a vmap leak, maybe a missing vfree; just an idea.
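The error handling suggested above can be sketched in user space. This is only an illustration, not the xfs code: fake_vmap, fake_vunmap, reclaim_idle and map_buffer are hypothetical names, and the "window" is a simple page counter standing in for the vmalloc address range. The point is that a failed mapping first tries to release an idle mapping and retry, and otherwise reports -ENOMEM to the caller instead of aborting.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define SIM_PAGE_SIZE 4096UL

/* Pages left in the simulated mapping window (hypothetical stand-in
 * for the kernel's limited vmalloc address range). */
size_t window_free = 8;

/* One cached, currently idle mapping that may be dropped under pressure. */
void *idle_map = NULL;
size_t idle_pages = 0;

/* Hypothetical vmap-like call: returns NULL when the window is too small. */
void *fake_vmap(size_t pages)
{
    assert(pages > 0);
    if (pages > window_free)
        return NULL;
    void *addr = malloc(pages * SIM_PAGE_SIZE);
    if (addr)
        window_free -= pages;
    return addr;
}

/* Release a mapping and give its pages back to the window. */
void fake_vunmap(void *addr, size_t pages)
{
    free(addr);
    window_free += pages;
}

/* Drop the idle mapping, if any. Returns 1 if something was released. */
int reclaim_idle(void)
{
    if (!idle_map)
        return 0;
    fake_vunmap(idle_map, idle_pages);
    idle_map = NULL;
    idle_pages = 0;
    return 1;
}

/* Map a buffer. On failure, release the idle mapping and retry once;
 * if that does not help, return -ENOMEM rather than aborting. */
int map_buffer(size_t pages, void **out)
{
    void *addr = fake_vmap(pages);
    if (!addr && reclaim_idle())
        addr = fake_vmap(pages);
    if (!addr)
        return -ENOMEM;
    *out = addr;
    return 0;
}
```

A caller checks the return value of map_buffer and propagates the negative errno upward, so a full window turns into a failed operation rather than a crash.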




The system has 4G of RAM and 4G of swap, so real memory is not an issue. The system is intended to be an NFS server. As a result, NFS performance is my only real concern. I should really use CONFIG_3GB, as I'm not doing much in user space other than a tftp and dhcp server.
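To make the address space versus RAM distinction concrete, here is a rough back-of-the-envelope sketch. The 128 MB window and 4 KB page size are assumptions for a typical 32-bit x86 setup, not values taken from this system, and vm_area_span and mappings_that_fit are hypothetical helper names. It ignores fragmentation and alignment, and only shows that the number of mappings that fit depends on the window size and the per-mapping gap page, not on how much RAM is installed.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_PAGE_SIZE 4096UL                      /* assumed page size */
#define SIM_WINDOW    (128UL * 1024UL * 1024UL)   /* assumed mapping window */

/* Address space consumed by one mapping: the payload rounded up to
 * whole pages, plus one extra gap page when gap_page is nonzero. */
unsigned long vm_area_span(unsigned long bytes, int gap_page)
{
    assert(bytes > 0);
    unsigned long pages = (bytes + SIM_PAGE_SIZE - 1) / SIM_PAGE_SIZE;
    return (pages + (gap_page ? 1UL : 0UL)) * SIM_PAGE_SIZE;
}

/* Upper bound on how many equal-sized mappings fit in the window. */
unsigned long mappings_that_fit(unsigned long window, unsigned long bytes,
                                int gap_page)
{
    return window / vm_area_span(bytes, gap_page);
}
```

Under these assumptions, 16 KB mappings fit 6553 times with the gap page and 8192 times without it, and for single-page mappings the gap page halves the count. Adding RAM changes neither number.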


In any case, the system isn't in production, so I can leave it as is until Monday.

