
Re: Issue with 2.6.23 and drbd 8.0.7

To: Laurent CARON <lcaron@xxxxxxxxx>
Subject: Re: Issue with 2.6.23 and drbd 8.0.7
From: David Chinner <dgc@xxxxxxx>
Date: Tue, 18 Dec 2007 10:37:59 +1100
Cc: David Chinner <dgc@xxxxxxx>, xfs@xxxxxxxxxxx
In-reply-to: <4766F58C.8040000@xxxxxxxxx>
References: <20071217143655.chiehahh@xxxxxxxxxxxxxxxxx> <20071217220354.GU4396912@xxxxxxx> <4766F58C.8040000@xxxxxxxxx>
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.4.2.1i
On Mon, Dec 17, 2007 at 11:17:48PM +0100, Laurent CARON wrote:
> David Chinner wrote:
> > The symptoms you see are the machine running out of memory and the OOM
> > killer being invoked. There's nothing XFS-specific here - you'd do
> > better to post to lkml about this.
> 
> So, I was wrong .... :$
> 
> > Hmmm - you appear to have a highmem based box and have run out of
> > low memory for the kernel. So while having ~9.5GB of free high
> > memory (that the kernel can't directly use), you're out of low
> > memory that the kernel can use and hence it is going OOM.  The
> > output of /proc/slabinfo or watching slabtop will tell you where
> > most of this memory is going.
> 
> Please find attached the output from /proc/slabinfo from both servers,
> as well as output from slabtop from server 1.
> 
> > 
> > FWIW, I suggest upgrading to a 64 bit machine ;)
> 
> I'm currently migrating those 2 servers to two 64-bit setups ;)
> 
> Thanks for your advice.
> 
> Laurent

> slabinfo - version: 2.1 (statistics)
> # name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail> : globalstat <listallocs> <maxobjs> <grown> <reaped> <error> <maxfreeable> <nodeallocs> <remotefrees> <alienoverflow> : cpustat <allochit> <allocmiss> <freehit> <freemiss>
> xfs_inode         227129 245574    408    9    1 : tunables   32   16    8 : slabdata  27286  27286
> xfs_vnode         227106 243130    392   10    1 : tunables   32   16    8 : slabdata  24313  24313
> radix_tree_node    88310  88356    312   12    1 : tunables   32   16    8 : slabdata   7363   7363
> dentry            170738 215280    160   24    1 : tunables   32   16    8 : slabdata   8970   8970
> buffer_head       150095 460752     80   48    1 : tunables   32   16    8 : slabdata   9599   9599

> slabinfo - version: 2.1 (statistics)
> xfs_inode         386493 386505    408    9    1 : tunables   32   16    8 : slabdata  42945  42945
> xfs_vnode         386491 386510    392   10    1 : tunables   32   16    8 : slabdata  38651  38651
> radix_tree_node    56266  56292    312   12    1 : tunables   32   16    8 : slabdata   4691   4691
> dentry            425976 425976    160   24    1 : tunables   32   16    8 : slabdata  17749  17749
> buffer_head       794845 794976     80   48    1 : tunables   32   16    8 : slabdata  16562  16562

>  Active / Total Objects (% used)    : 1031308 / 1501486 (68.7%)
>  Active / Total Slabs (% used)      : 87577 / 87659 (99.9%)
>  Active / Total Caches (% used)     : 116 / 179 (64.8%)
>  Active / Total Size (% used)       : 275759.16K / 331390.36K (83.2%)
>  Minimum / Average / Maximum Object : 0.04K / 0.22K / 4096.00K
> 
>   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
> 460752 150236  32%    0.08K   9599       48     38396K buffer_head
> 244413 225674  92%    0.40K  27157        9    108628K xfs_inode
> 242010 225657  93%    0.38K  24201       10     96804K xfs_vnode
> 215280 171465  79%    0.16K   8970       24     35880K dentry
>  88368  88272  99%    0.30K   7364       12     29456K radix_tree_node
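
A rough tally of the big caches above (num_slabs x pagesperslab x 4 KiB,
assuming 4 KiB pages and ignoring the smaller caches):

server 1: xfs_inode ~107MB + xfs_vnode ~95MB + buffer_head ~38MB
          + dentry ~35MB + radix_tree_node ~29MB            ~= 304MB
server 2: xfs_inode ~168MB + xfs_vnode ~151MB + dentry ~69MB
          + buffer_head ~65MB + radix_tree_node ~18MB       ~= 471MB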

Hmmm - no real surprises there, but the numbers are well below the
~960MB low memory limit. I suspect something runs at around 2.55am
that does a filesystem traversal, blowing out the memory usage of
these slab caches until you run out of lowmem...
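
If you want to catch whatever runs at that hour, logging the biggest
slab consumers every few minutes should show the spike. A minimal,
untested sketch in Python (the field offsets assume the standard
slabinfo 2.1 layout quoted above, and a 4 KiB page size):

#!/usr/bin/env python
# Sketch: periodically print the top slab caches by memory footprint,
# parsing the /proc/slabinfo 2.1 layout shown above.  Assumes 4 KiB pages.
import time

PAGE_SIZE = 4096

def top_slabs(n=10):
    caches = []
    with open("/proc/slabinfo") as f:
        for line in f:
            if line.startswith("slabinfo") or line.startswith("#"):
                continue                      # skip the two header lines
            fields = line.split()
            pages_per_slab = int(fields[5])   # <pagesperslab>
            num_slabs = int(fields[14])       # <num_slabs>, after "slabdata"
            caches.append((pages_per_slab * num_slabs * PAGE_SIZE, fields[0]))
    return sorted(caches, reverse=True)[:n]

while True:
    print(time.strftime("%Y-%m-%d %H:%M:%S"))
    for size, name in top_slabs():
        print("  %-20s %8.1f MB" % (name, size / 1048576.0))
    time.sleep(300)                           # sample every five minutes

Leaving that (or just slabtop) running across 2.55am should show which
cache blows up and roughly how fast.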

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

