Bugzilla – Bug 290
XFS Kernel Memory Leak
Last modified: 2005-01-22 15:52:55 CST
This may (or may not) be related to bug 209, but here's what I have...

There is a memory leak in the opendir()/readdir()/closedir() code. I've tested this on the following kernels:

2.6.0-test9
2.6.0-test9-mm2
2.6.0-test9-mm5

I noticed this first when locate was updating its database every night... the next morning nearly all available memory would be used and I'd have to reboot the machine to reclaim it. I wrote the following Perl script to test this:

----------------------------------------------
#!/usr/bin/perl

open_dir("/");

sub open_dir {
    my $pwd = shift;

    opendir(DIR, $pwd);
    my @dentries = readdir(DIR);
    closedir(DIR);

    foreach my $de (@dentries) {
        next if $de =~ /^\./;
        print "$pwd$de\n";
        if (! -l "$pwd$de" && -d "$pwd$de") {
            open_dir("$pwd$de/");
        }
    }
}
----------------------------------------------

I ran this script on my machine on a fresh boot, and vmstat at the same time, and came up with a very gradual loss of memory (it looks linear in nature to my untrained eye). You can get the entire vmstat log from http://www.condordes.net/~condor/vmstat.log .

As a contrast, here's a vmstat row from near the beginning:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0      0 453992  11056  21288    0    0   776    50 1111   299  1  3 97  0

And here's one towards the end:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  0      0   3440 111464   7456    0    0    14    67 1010    42  1 98  1  0

During the run I also ran 'top' to make sure there weren't other processes eating up RAM, so I'm fairly sure most of this lost memory is in kernel space.

If you have any additional tests you want me to run, please let me know.
Any chance you can try this same test on a non-xfs box? Just to narrow down whether it's xfs that's causing the problem... Thanks, -Eric
Sadly (:p), all my machines are XFS. I'll ask around and see if I can find someone who is running 2.6 without XFS and is willing to test this. I think I can also confirm this bug on 2.4.20-xfs-r3 (a Gentoo kernel), as my server has now started exhibiting signs of memory leakage. If I don't find anyone to test this weekend, I'll see about putting a mini-install on my laptop without XFS (I have a spare 1GB partition to play with).
No problem, I can test this on an ext3 box. Just figured if you had one handy you could give it a whirl. Thanks, -Eric
Actually, what makes you certain that this is a leak, rather than normal caching, etc? My hunch is that the inode_cache is just getting filled up.
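One way to check that hunch is to look at the inode and dentry slab caches directly. The following is a minimal sketch, not a definitive tool: it assumes /proc/slabinfo is readable (it may require root) and that each data line starts with the cache name followed by the active object count, the total object count, and the object size in bytes. The helper name parse_slabinfo is made up for this example, and the size printed is only a rough estimate that ignores per-slab overhead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse slabinfo-style text into name => [active_objs, num_objs, objsize].
# Assumption: data lines look like "name active total objsize ...".
sub parse_slabinfo {
    my ($text) = @_;
    my %caches;
    foreach my $line (split /\n/, $text) {
        next if $line =~ /^(?:#|slabinfo)/;    # skip version and header lines
        my ($name, $active, $total, $size) = split ' ', $line;
        next unless defined $size && $size =~ /^\d+$/;
        $caches{$name} = [$active, $total, $size];
    }
    return %caches;
}

# Read the live file if possible; otherwise there is simply nothing to report.
my $text = '';
if (open(my $fh, '<', '/proc/slabinfo')) {
    local $/;                                  # slurp the whole file at once
    $text = <$fh>;
    $text = '' unless defined $text;
    close($fh);
}

my %caches = parse_slabinfo($text);
foreach my $name (sort keys %caches) {
    next unless $name =~ /inode|dentry/;       # only inode and dentry caches
    my ($active, $total, $size) = @{ $caches{$name} };
    # Rough estimate: total objects times object size, ignoring slab overhead.
    printf "%-24s %10d objects %10d KB\n", $name, $total, $total * $size / 1024;
}
```

Running it before and after the directory walk and comparing the numbers should show whether the inode and dentry caches account for the memory that goes missing.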
Well, I think it's a leak largely because it eats non-buffer, non-cached RAM, and when I try to load up programs that consume RAM (Mozilla or KDE, for instance), it starts swapping, sometimes quite a lot. On a fresh boot before running this test, it doesn't need to swap at all. Plus, in this test, most of the used RAM doesn't show up under buffers or cache... it just sort of disappears from free.

If you take the vmstat output, dump it into a spreadsheet and add up the free, buffers and cache RAM for each vmstat line, you'd expect that sum to remain mostly constant. The Perl program uses a fairly constant amount of RAM and there's nothing else running (single-user mode), so userspace RAM usage barely changes, and whatever is taken from free should show up in buffers and cache, and vice versa. Instead the sum steadily decreases. (This is on a machine with 512MB RAM, btw.)

I'm assuming here that the inode cache is included in the "cache" statistic. Sadly I know very little about kernel programming (or internals, for that matter, beyond general ideas), so I could easily be wrong.
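The spreadsheet step can also be done with a few lines of Perl. This is a minimal sketch that assumes the 16-column vmstat layout shown in the original report (r b swpd free buff cache si so bi bo in cs us sy id wa); other vmstat versions may print a different number of columns. The helper name mem_total is made up for this example. Without arguments it uses the two rows quoted in the report; given a file name, it reads that log instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return free + buff + cache (in KB) for one vmstat data row, or undef for
# header lines. Assumes the 16-column layout shown in the report.
sub mem_total {
    my ($line) = @_;
    my @f = split ' ', $line;
    # Data rows are exactly 16 numeric fields; header rows are not.
    return undef unless @f == 16 && !grep { !/^\d+$/ } @f;
    return $f[3] + $f[4] + $f[5];
}

# The two rows quoted in the report; pass a log file name to use real data.
my @rows = (
    "0 0 0 453992 11056 21288 0 0 776 50 1111 299 1 3 97 0",
    "2 0 0 3440 111464 7456 0 0 14 67 1010 42 1 98 1 0",
);
if (@ARGV) {
    open(my $fh, '<', $ARGV[0]) or die "cannot open $ARGV[0]: $!";
    @rows = <$fh>;
    close($fh);
}

foreach my $row (@rows) {
    my $total = mem_total($row);
    print "$total\n" if defined $total;
}
```

For the two quoted rows this prints 486336 and 122360, so free + buff + cache dropped by 363976 KB between the start and the end of the run.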
Can you retry with 2.6.1-rc1? It has fixes to better reclaim inodes and dentries.
No feedback for about half a year. Please reopen if you can still reproduce it with a current kernel.
I'm using kernels 2.6.7 and 2.6.8.1, and every night when updatedb runs I go from 450MB free to 120MB free with 568KB of swap used (using the free command to view memory statistics). I tried the Perl script provided previously and saw the same memory usage, and after Ctrl-C'ing it the memory was never reclaimed. There is still a problem, in my opinion.
Well, this isn't a bug specific to the XFS filesystem. I changed my root partition to ext3 and unmounted all XFS partitions, and running updatedb revealed the same problem: memory usage exploding through the roof. http://bugs.gentoo.org/show_bug.cgi?id=36855 gives some more insight into the topic, but I ran "emerge sync" (which uses rsync) and the RAM used up was _not_ able to be reclaimed. I guess "Resolving this as INVALID" makes sense, because it's not specific to XFS as far as I can see.