2.6.39-rc4+: oom-killer busy killing tasks

Date: Sun, 1 May 2011 21:59:35 -0700 (PDT)
On Sun, 1 May 2011 at 18:01, Dave Chinner wrote:
> I really don't know why the xfs inode cache is not being trimmed. I
> really, really need to know if the XFS inode cache shrinker is
> getting blocked or not running - do you have those sysrq-w traces
> when near OOM I asked for a while back?

I tried to generate those via /proc/sysrq-trigger (I don't have an F13/Print
Screen key), but the OOM killer kicks in pretty fast - so fast that my
debug script, which tries to generate sysrq-w every second, was too late
and the machine was already dead:

   * messages-10.txt.gz
   * slabinfo-10.txt.bz2

  - du(1) started at 12:25:16 (and immediately listed
    as "blocked" task)
  - the last sysrq-w succeeded at 12:38:05, listing kswapd0
  - du invoked oom-killer at 12:38:06

I'll keep trying...
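
A tighter polling loop could be sketched roughly as below. This is not the
debug script mentioned above: the function names, the 0.2 second interval
and the COUNT limit are assumptions made for illustration. A fractional
sleep needs GNU sleep, and writing to /proc/sysrq-trigger needs root.

```shell
#!/bin/sh
# Sketch only: names and defaults are hypothetical, not the original script.
TRIGGER="${TRIGGER:-/proc/sysrq-trigger}"  # real procfs file, root only
INTERVAL="${INTERVAL:-0.2}"                # assumed; fractional needs GNU sleep
COUNT="${COUNT:-0}"                        # assumed; 0 = loop until killed

dump_blocked() {
    # 'w' asks the kernel to log all tasks in uninterruptible (blocked) state
    echo w > "$TRIGGER"
}

run_loop() {
    i=0
    while [ "$COUNT" -eq 0 ] || [ "$i" -lt "$COUNT" ]; do
        dump_blocked
        i=$((i + 1))
        sleep "$INTERVAL"
    done
}
```

Calling run_loop from a root shell just before starting du(1) would then
leave the traces in the kernel log. Whether five dumps per second is enough
to catch the moment before the OOM killer fires is untested.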

> scan only scanned 516 pages. I can't see it freeing many inodes
> (there's >600,000 of them in memory) based on such a low page scan
> number.

Not sure if this is related... this XFS filesystem I'm running du(1) on is
~1 TB in size, with 918K allocated inodes, if df(1) is correct:

# df -hi /mnt/backup/
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/mapper/wdc1         37M    918K     36M    3% /mnt/backup

> Maybe you should tweak /proc/sys/vm/vfs_cache_pressure to make it
> reclaim vfs structures more rapidly. It might help

/proc/sys/vm/vfs_cache_pressure is currently set to '100'. You mean I
should increase it? To... 150? 200? 1000?
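
Assuming "increase" is what is meant (100 is the default, and values above
it make the kernel prefer reclaiming dentries and inodes), trying a new
value could look like the sketch below. The helper names are hypothetical
and 200 is only one of the candidates above, not a recommendation.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; the knob path is the real one.
KNOB="${KNOB:-/proc/sys/vm/vfs_cache_pressure}"

show_pressure() {
    cat "$KNOB"
}

set_pressure() {
    # Needs root; takes effect immediately but does not survive a reboot
    echo "$1" > "$KNOB"
}
```

For example, "set_pressure 200; show_pressure" as root. The same change can
be made with "sysctl -w vm.vfs_cache_pressure=200".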

