Bug 290 - XFS Kernel Memory Leak
Status: RESOLVED INVALID
Product: XFS
Classification: Unclassified
Component: XFS kernel code
: unspecified
: Linux
: major
: ---
Assigned To: XFS power people
Reported: 2003-11-22 14:05 CST by Josh Berry
Modified: 2005-01-22 15:52 CST (History)
Description Josh Berry 2003-11-22 14:05:52 CST
This may (or may not) be related to bug 209, but here's what I have...

There is a memory leak in the opendir()/readdir()/closedir() code.  I've tested
this on the following kernels:

2.6.0-test9
2.6.0-test9-mm2
2.6.0-test9-mm5

I noticed this first when locate was updating its database every night... the
next morning nearly all available memory would be used and I'd have to reboot
the machine to reclaim it.

I wrote the following Perl script to test this:

----------------------------------------------
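#!/usr/bin/perl
use strict;
use warnings;

# Recursively walk the tree from the root, printing every entry.
open_dir("/");

sub open_dir {
    my $pwd = shift;

    # Skip directories we cannot read instead of failing silently.
    opendir(my $dh, $pwd) or return;
    my @dentries = readdir($dh);
    closedir($dh);

    foreach my $de (@dentries) {
        # Skip ".", ".." and any other dot entries.
        next if $de =~ /^\./;

        print "$pwd$de\n";

        # Recurse into real directories only; do not follow symlinks.
        if (! -l "$pwd$de" && -d "$pwd$de") {
            open_dir("$pwd$de/");
        }
    }
}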
----------------------------------------------

I ran this script on my machine on a fresh boot, and vmstat at the same time,
and came up with a very gradual loss of memory (it looks linear in nature to my
untrained eye).  You can get the entire vmstat log from
http://www.condordes.net/~condor/vmstat.log .

As a contrast, here's a vmstat row from near the beginning:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0      0 453992  11056  21288    0    0   776    50 1111   299  1  3 97  0

And here's one towards the end:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  0      0   3440 111464   7456    0    0    14    67 1010    42  1 98  1  0

During the run I also ran 'top' to make sure there weren't other processes
eating up RAM, so I'm fairly sure most of this lost memory is in kernel space.

If you have any additional tests you want me to run, please let me know.
Comment 1 Eric Sandeen 2003-11-22 19:05:37 CST
Any chance you can try this same test on a non-xfs box?
Just to narrow down whether it's xfs that's causing the problem...

Thanks,

-Eric
Comment 2 Josh Berry 2003-11-22 20:03:35 CST
Sadly (:p), all my machines are XFS.  I'll ask around and see if I can find
someone who is running 2.6 without XFS and is willing to test this.

I think I can also confirm this bug on 2.4.20-xfs-r3 (a Gentoo kernel), as my
server has now started exhibiting signs of memory leakage.

If I don't find anyone to test this weekend, I'll see about putting a
mini-install on my laptop without XFS (I have a spare 1GB partition to play with).
Comment 3 Eric Sandeen 2003-11-23 09:29:04 CST
No problem, I can test this on an ext3 box.  Just figured if you had
one handy you could give it a whirl.

Thanks,
-Eric
Comment 4 Eric Sandeen 2003-11-24 15:05:25 CST
Actually, what makes you certain that this is a leak, rather than normal
caching, etc?  My hunch is that the inode_cache is just getting filled up.
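
One way to check that hunch is to look at the slab caches directly. The following is only a rough sketch in Python: it assumes the /proc/slabinfo layout in which the first four columns are name, active_objs, num_objs and objsize, and the cache names and numbers in the sample text are made up for illustration, not taken from the reporter's machine.

```python
def slab_usage_kb(slabinfo_text):
    """Estimate memory held by each slab cache, in KB.

    Assumes each data line starts with: name active_objs num_objs objsize.
    The result is num_objs * objsize, which ignores per-slab overhead.
    """
    usage = {}
    for line in slabinfo_text.splitlines():
        # Skip blank lines and the two header lines.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        name, num_objs, objsize = fields[0], int(fields[2]), int(fields[3])
        usage[name] = num_objs * objsize // 1024
    return usage


# Hypothetical sample; real input would come from reading /proc/slabinfo.
SAMPLE = """\
slabinfo - version: 2.0
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>
xfs_inode     200000 200000 512  8 1
dentry_cache  300000 300000 128 30 1
"""

usage = slab_usage_kb(SAMPLE)
for name, kb in sorted(usage.items(), key=lambda item: -item[1]):
    print(f"{name:16s} {kb:8d} KB")
```

On a live system the same function could be fed the contents of /proc/slabinfo (root may be required to read it). If the inode and dentry caches grow by roughly the amount that disappears from "free" during the directory walk, that would point to caching rather than a leak.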
Comment 5 Josh Berry 2003-11-24 17:16:22 CST
Well, I think it's a leak largely because it eats non-buffer, non-cached RAM,
and when I try to load up programs that consume RAM (Mozilla or KDE, for
instance), it starts swapping, sometimes quite a lot.  On a fresh boot before
running this test, it doesn't need to swap at all.

Plus, in this test, most of the used RAM doesn't show up under buffers or
cache... it just sort of disappears from free.  If you take the vmstat output,
dump it into a spreadsheet and add up free + buffers + cache for each vmstat
line, you'd expect that sum to remain mostly constant: the Perl program uses a
fairly constant amount of RAM, there's nothing else running (single-user mode),
and since there's little change in userspace RAM usage, whatever is subtracted
from free should show up in buffers or cache, and vice versa.  Instead, the sum
steadily decreases.
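
As a rough sketch of that spreadsheet check, the Python below sums the free, buff and cache columns for the two vmstat rows quoted in the description. The column positions are an assumption taken from the vmstat header shown there, and the function name is just for illustration.

```python
def accounted_kb(vmstat_row):
    """Return free + buff + cache (in KB) for one vmstat data row."""
    fields = vmstat_row.split()
    # Column order assumed from the vmstat header in the description:
    # r b swpd free buff cache si so bi bo in cs us sy id wa
    free, buff, cache = (int(fields[i]) for i in (3, 4, 5))
    return free + buff + cache


# The two rows quoted in the original report.
early = " 0  0      0 453992  11056  21288    0    0   776    50 1111   299  1  3 97  0"
late = " 2  0      0   3440 111464   7456    0    0    14    67 1010    42  1 98  1  0"

early_total = accounted_kb(early)
late_total = accounted_kb(late)

# Prints: 486336 122360 363976
print(early_total, late_total, early_total - late_total)
```

For those two rows the sum drops from 486336 KB to 122360 KB, so about 355MB is no longer accounted for by free, buffers or cache.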

(This is on a machine with 512MB RAM, btw.)

I'm assuming here that the inode cache is included in the "cache" statistic. 
Sadly I know very little about kernel programming (or internals, for that
matter, beyond general ideas), so I could easily be wrong.
Comment 6 Christoph Hellwig 2004-01-01 17:02:56 CST
Can you retry with 2.6.1-rc1?  It has fixes to better reclaim inodes and dentries.
Comment 7 Christoph Hellwig 2004-06-15 10:00:30 CDT
No feedback for about half a year.  Please reopen if you can still reproduce it
with a current kernel.
Comment 8 Erik Anderson 2005-01-21 10:33:16 CST
I'm using kernels 2.6.7 and 2.6.8.1, and every night that updatedb runs, I go
from 450MB free to 120MB free with 568KB of swap used (using the 'free' command
to view memory statistics).

I tried the Perl script provided previously and saw the same memory usage, and
after I Ctrl-C'd it the memory was never reclaimed.  There is still a problem,
in my opinion.
Comment 9 Erik Anderson 2005-01-22 13:52:55 CST
Well, this isn't a bug specific to the XFS filesystem.  I changed my root
partition to ext3 and unmounted all XFS partitions, and running updatedb
showed the same problem: memory usage going through the roof.

http://bugs.gentoo.org/show_bug.cgi?id=36855

This gives some more insight into the topic, but I ran "emerge sync" (which
uses rsync), and the RAM it used up could _not_ be reclaimed.

I guess resolving this as INVALID makes sense, since it's not specific to XFS
as far as I can see.