Re: Kernel 2.6.30: Memory/XFS leak, OOM killer kills many processes

Date: Thu, 11 Jun 2009 16:35:08 -0500
On Jun 11, 2009, at 12:31 PM, Eric Sandeen wrote:

Justin Piszcz wrote:

On Thu, 11 Jun 2009, Justin Piszcz wrote:


I have a daily cron job that backs up my root filesystem using xfsdump; it
has remained unchanged for at least 7-10 kernel versions. After I migrated to
2.6.30, when the xfsdump ran at its scheduled time, nearly all of my
processes were killed due to an OOM situation. I can reproduce the situation.

Kernel: 2.6.30
Dist: Debian Testing
xfsdump: 2.2.48-1

Kernel does not exhibit this problem:

xfsdump: estimated dump size: 8694781376 bytes
xfsdump: creating dump session media file 0 (media 0, file 0)
xfsdump: dumping ino map
xfsdump: dumping directories
xfsdump: dumping non-directory files
xfsdump: ending media file
xfsdump: media file size 8294709848 bytes
xfsdump: dump size (non-dir files) : 8208863560 bytes
xfsdump: dump complete: 102 seconds elapsed
xfsdump: Dump Status: SUCCESS

XFS(?) bug in 2.6.30.

Any chance for a bisect run? :)

Well, Hedi (@sgi) pointed out the problem without a
bisect :)

commit 28e211700a81b0a934b6c7a4b8e7dda843634d2f
Author: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date:   Tue Feb 24 08:39:02 2009 -0500

  xfs: fix getbmap vs mmap deadlock

We do allocate memory for out:

out = kmem_zalloc(bmv->bmv_count * sizeof(struct getbmapx), KM_MAYFAIL);

but I am not seeing where it's being released.

If I am reading the code correctly, we need to handle the freeing
in out_unlock_iolock.

The following should fix it:

diff --git a/fs/xfs/xfs_bmap.c b/fs/xfs/xfs_bmap.c
index 4b0f6ef..7928b99 100644
--- a/fs/xfs/xfs_bmap.c
+++ b/fs/xfs/xfs_bmap.c
@@ -6086,6 +6086,7 @@ xfs_getbmap(

+       kmem_free(out);
       return error;
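
For readers who want the pattern in isolation, here is a minimal userspace
sketch of the same idea: a temporary buffer is allocated up front and must be
released on the shared exit path, whether the function succeeds or bails out
early. This is not the xfs_getbmap code. The names collect_records,
struct record, tracked_zalloc, tracked_free and live_allocs are all
hypothetical, and calloc/free stand in for kmem_zalloc/kmem_free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, standing in for struct getbmapx. */
struct record {
    long offset;
    long length;
};

/* Number of tracked allocations not yet freed (illustration only). */
static int live_allocs = 0;

/* Zeroed allocation that bumps the live counter on success. */
static void *tracked_zalloc(size_t size)
{
    void *p = calloc(1, size);

    if (p)
        live_allocs++;
    return p;
}

/* Free a tracked allocation and drop the live counter. */
static void tracked_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/*
 * Fill a temporary buffer, copy it to the caller's array, and free the
 * buffer at the single exit label so both the success path and the
 * error path release it. Returns 0 on success, -1 on error.
 */
static int collect_records(struct record *dst, size_t count, int fail_midway)
{
    struct record *out;
    int error = 0;
    size_t i;

    if (count == 0)
        return -1;
    assert(dst != NULL);

    out = tracked_zalloc(count * sizeof(*out));
    if (!out)
        return -1;

    for (i = 0; i < count; i++) {
        if (fail_midway && i == count / 2) {
            error = -1;
            goto out_free;      /* early exit still reaches the free */
        }
        out[i].offset = (long)i;
        out[i].length = 1;
    }

    memcpy(dst, out, count * sizeof(*out));

out_free:
    tracked_free(out);          /* the step a leaking version omits */
    return error;
}
```

Calling collect_records() with fail_midway set to 0 and then to 1 should leave
live_allocs at zero both times. Dropping the tracked_free() call would let the
counter grow by one per call, which is the kind of per-call leak that adds up
quickly when a dump touches many files.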


Or, just as a thought, watch slabtop while you run the dump?
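
If slabtop is not handy, roughly the same numbers can be read from
/proc/slabinfo (usually readable by root only). The sketch below is only an
illustration of how one data line could be parsed to see how many bytes a
cache holds in active objects. It assumes the usual column order of name,
active_objs, num_objs, objsize; struct slab_usage, parse_slabinfo_line and
slab_active_bytes are hypothetical names, not part of any existing tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* First four columns of a /proc/slabinfo data line (assumed layout). */
struct slab_usage {
    char name[64];
    unsigned long active_objs;
    unsigned long num_objs;
    unsigned long objsize;
};

/*
 * Parse one line of /proc/slabinfo into *u.
 * Returns 0 on success, -1 for header lines or lines that do not match.
 */
static int parse_slabinfo_line(const char *line, struct slab_usage *u)
{
    assert(line != NULL && u != NULL);

    /* Skip the "slabinfo - version" banner and the "# name ..." header. */
    if (line[0] == '#' || strncmp(line, "slabinfo", 8) == 0)
        return -1;

    if (sscanf(line, "%63s %lu %lu %lu",
               u->name, &u->active_objs, &u->num_objs, &u->objsize) != 4)
        return -1;

    return 0;
}

/* Bytes held by active objects in this cache. */
static unsigned long slab_active_bytes(const struct slab_usage *u)
{
    return u->active_objs * u->objsize;
}
```

Feeding each line of /proc/slabinfo through parse_slabinfo_line() before and
during the dump, and comparing slab_active_bytes() per cache, should show
which cache keeps growing.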

