Re: [PATCH 4.3-rc6] proc: fix oom_adj value read from /proc/<pid>/oom_adj

From: David Rientjes
Date: Wed Oct 21 2015 - 16:59:13 EST


On Tue, 20 Oct 2015, Hongjie Fang (方洪杰) wrote:

> The oom_adj's value reading through /proc/<pid>/oom_adj is different
> with the value written into /proc/<pid>/oom_adj.

/proc/pid/oom_adj is deprecated and has been for years. When writing to
/proc/pid/oom_adj, for legacy purposes, the value is converted to
/proc/pid/oom_score_adj. There is no exact way to do this since the
scales of the tunables are different (the former acted as a simple bit
shift on a badness score, the latter is a proportion of available memory).

You'll notice we never store the written oom_adj, and that's because after
the conversion to oom_score_adj is done, in units the oom killer actually
uses to make killing decisions, it is no longer interesting. Userspace
only needs to know what the effective policy is, and that may differ from
what was written because there is no 1:1 mapping between tunables with
different units.
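
To make the mismatch concrete, here is a rough sketch of the two
conversions in Python. It assumes the integer arithmetic used by the
oom_adj handlers in fs/proc/base.c around this kernel version and the
constants from include/uapi/linux/oom.h; the helper names are made up for
illustration, and c_div() only emulates C's truncating division.

```python
# Constants as assumed from include/uapi/linux/oom.h.
OOM_DISABLE = -17
OOM_ADJUST_MAX = 15
OOM_SCORE_ADJ_MAX = 1000


def c_div(a, b):
    """Integer division that truncates toward zero, as C does."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def adj_to_score_adj(oom_adj):
    """Write side: scale a legacy oom_adj value to oom_score_adj."""
    if oom_adj == OOM_ADJUST_MAX:
        return OOM_SCORE_ADJ_MAX
    return c_div(oom_adj * OOM_SCORE_ADJ_MAX, -OOM_DISABLE)


def score_adj_to_adj(oom_score_adj):
    """Read side: scale the stored oom_score_adj back to oom_adj."""
    if oom_score_adj == OOM_SCORE_ADJ_MAX:
        return OOM_ADJUST_MAX
    return c_div(oom_score_adj * -OOM_DISABLE, OOM_SCORE_ADJ_MAX)


def roundtrip(oom_adj):
    """What a read returns after oom_adj has been written."""
    return score_adj_to_adj(adj_to_score_adj(oom_adj))


if __name__ == "__main__":
    for adj in (-17, -10, 0, 10, 15):
        print(adj, adj_to_score_adj(adj), roundtrip(adj))
```

Under those assumptions, writing 10 stores an oom_score_adj of 588 and
reads back as 9, while the endpoints -17 and 15 survive the round trip.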

Rounding up positive oom_adj values and rounding down negative oom_adj
values, as your patch does, creates an inconsistency in how the mapping
has been done for years. It risks biasing the oom killer against current
users' processes more than they expect, so, as Eric also suggested, it is
not a safe change to make.
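
As an illustration of that risk, the sketch below compares the existing
truncating conversion with a hypothetical variant that rounds the
magnitude away from zero on the write side. This is not the patch under
review, only one possible reading of "round up positive, round down
negative"; the function names are invented and the constants are assumed
from include/uapi/linux/oom.h.

```python
# Constants as assumed from include/uapi/linux/oom.h.
OOM_DISABLE = -17
OOM_ADJUST_MAX = 15
OOM_SCORE_ADJ_MAX = 1000
DIVISOR = -OOM_DISABLE  # 17


def truncating(oom_adj):
    """Existing behaviour: the scaled value is truncated toward zero."""
    if oom_adj == OOM_ADJUST_MAX:
        return OOM_SCORE_ADJ_MAX
    scaled = abs(oom_adj) * OOM_SCORE_ADJ_MAX // DIVISOR
    return scaled if oom_adj >= 0 else -scaled


def away_from_zero(oom_adj):
    """Hypothetical variant: the scaled magnitude is rounded up."""
    if oom_adj == OOM_ADJUST_MAX:
        return OOM_SCORE_ADJ_MAX
    scaled = (abs(oom_adj) * OOM_SCORE_ADJ_MAX + DIVISOR - 1) // DIVISOR
    return scaled if oom_adj >= 0 else -scaled


if __name__ == "__main__":
    # List every oom_adj value whose stored oom_score_adj would change.
    for adj in range(OOM_DISABLE, OOM_ADJUST_MAX + 1):
        old, new = truncating(adj), away_from_zero(adj)
        if old != new:
            print(adj, old, new)
```

In this sketch an existing user writing 10 would end up with 589 instead
of 588, a slightly stronger bias toward being killed than the same write
produced before.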