Re: [PATCH 0/3] OOM detection rework v4

From: Hugh Dickins
Date: Wed Feb 24 2016 - 22:47:21 EST

On Wed, 3 Feb 2016, Michal Hocko wrote:
> Hi,
> this thread went mostly quiet. Are all the main concerns clarified?
> Are there any new concerns? Are there any objections to targeting
> this for the next merge window?

Sorry to say at this late date, but I do have one concern: hopefully
you can tweak something somewhere, or point me to some tunable that
I can adjust (I've not studied the patches, sorry).

This rework makes it impossible to run my tmpfs swapping loads:
they're soon OOM-killed, where before they would run forever, so
swapping does not get the exercise on mmotm that it used to. (But
I'm not so arrogant as to expect you to optimize for my load!)

Maybe it's just that I'm using tmpfs, and there's code that's conscious
of file and anon, but doesn't cope properly with the awkward shmem case.

(Of course, tmpfs is and always has been a problem for OOM-killing,
given that it takes up memory, but none is freed by killing processes:
but although that is a tiresome problem, it's not what either of us is
attacking here.)

Taking many of the irrelevancies out of my load, here's something you
could try, first on v4.5-rc5 and then on mmotm.

Boot with mem=1G (or boot your usual way, and do something to occupy
most of the memory: I think /proc/sys/vm/nr_hugepages provides a great
way to gobble up most of the memory, though it's not how I've done it).
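Concretely, gobbling up memory via hugepages might look like the sketch
below. The numbers are only illustrative (2M hugepages assumed, on an 8G
machine, aiming to leave roughly 1G of ordinary memory free); adjust for
your machine's RAM and hugepage size. As noted, this is not how I set up
my own runs.

```shell
# Check how much memory the machine has, then reserve 2M hugepages
# until only about 1G of ordinary memory remains. Requires root.
grep MemTotal /proc/meminfo

# 3584 * 2M = 7G reserved, leaving ~1G free on an 8G machine
# (illustrative number: recompute for your RAM and hugepage size).
echo 3584 > /proc/sys/vm/nr_hugepages

# Confirm the reservation took effect.
grep HugePages /proc/meminfo
```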

Make sure you have swap: 2G is more than enough. Copy the v4.5-rc5
kernel source tree into a tmpfs: size=2G is more than enough.
make defconfig there, then make -j20.
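Spelled out, the reproduction steps above might look like this (sizes are
as in the text; the mount point and source directory name are arbitrary
choices):

```shell
# Confirm at least 2G of swap is available.
swapon --show

# Mount a 2G tmpfs and copy the v4.5-rc5 source tree into it.
mkdir -p /mnt/build
mount -t tmpfs -o size=2G tmpfs /mnt/build
cp -a linux-4.5-rc5 /mnt/build/
cd /mnt/build/linux-4.5-rc5

# Configure and build; adjust -j as discussed below.
make defconfig
make -j20
```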

On a v4.5-rc5 kernel that builds fine; on mmotm it is soon OOM-killed.

Except that you'll probably need to fiddle around with that -j20:
it holds for my laptop but not for my workstation. -j20 just happens
to be what I've used there for years, and is what I now see breaking
down (lowering to -j6 lets it proceed, and perhaps I could go a bit
higher, but then it doesn't exercise swap very much).

This OOM detection rework significantly lowers the number of jobs
which can be run in parallel without being OOM-killed. Which would
be welcome if it were choosing to abort in place of thrashing, but
the system was far from thrashing: j20 took a few seconds more than
j6, and even j30 didn't take 50% longer.

(I have /proc/sys/vm/swappiness 100, if that matters.)

I hope there's an easy answer to this: thanks!