Re: 2.6.39-rc4+: oom-killer busy killing tasks

Date: Mon, 2 May 2011 02:26:17 -0700 (PDT)
On Sun, 1 May 2011 at 18:01, Dave Chinner wrote:
> I really don't know why the xfs inode cache is not being trimmed. I
> really, really need to know if the XFS inode cache shrinker is
> getting blocked or not running - do you have those sysrq-w traces
> when near OOM I asked for a while back?

Here's another attempt at getting those:

  * messages-11.txt.gz & slabinfo-11.txt.bz2
    - oom-killer at 00:05:04
    - last sysrq-w to succeed at 00:05:03

  * messages-12.txt.gz & slabinfo-12.txt.bz2, along
    with meminfo-post-oom-12.txt & sysrq-w_post-oom-12.jpg could
    be more interesting:
    - last sysrq-w to succeed at 01:27:08
    - oom-killer at 01:27:11

   ...but after the OOM-killer had killed quite a few processes, MemFree
   showed 511236 kB free memory, yet ssh logins were still being killed.
   Finally I got a root shell on the box, issued sysrq-w again and even
   executed /bin/sync, which came back. But looking at the logs now,
   nothing went to the disk (/var/log resides on /, which is an ext4 fs).
   See sysrq-w_post-oom-12.jpg for a sysrq-w I took 2381s after boot time,
   or 01:32 - syslog stopped at 01:27.

I shall try again with netconsole logging or something...

HTH & thanks for looking into this,
