Re: [PATCH] staging, android: remove lowmemory killer from the tree
From: peter enderborg
Date: Fri Feb 24 2017 - 10:40:57 EST
On 02/24/2017 04:03 PM, Michal Hocko wrote:
> On Fri 24-02-17 15:42:49, peter enderborg wrote:
>> On 02/24/2017 03:11 PM, Michal Hocko wrote:
>>> On Fri 24-02-17 14:16:34, peter enderborg wrote:
>>>> On 02/24/2017 01:28 PM, Michal Hocko wrote:
>>> [...]
>>>>> Yeah, I strongly believe that the chosen approach is completely wrong.
>>>>> Both in abusing the shrinker interface and abusing oom_score_adj as the
>>>>> only criterion for the oom victim selection.
>>>> No one is arguing that the shrinker is not problematic, and it would be
>>>> great if it were removed from lmk. oom_score_adj is the way user-space
>>>> tells the kernel what its priorities are, and android relies on that
>>>> heavily. It's a core part.
>>> Is there any documentation which describes how this is done?
>>>
>>>> I have never seen it used on
>>>> other linux systems, so what is the intended usage of oom_score_adj? Is
>>>> this really abuse?
>>> oom_score_adj is used to _adjust_ the calculated oom score. It is not a
>>> criterion on its own, well, except for the extreme sides of the range
>>> which are defined to enforce resp. disallow selecting the task. The
>>> global oom killer calculates the oom score as a function of the memory
>>> consumption. Your patch simply ignores the memory consumption (and uses
>>> pids to sort tasks with the same oom score which is just mind boggling)
>> How much it uses is of very little importance for android.
> But it is relevant for the global oom killer which is the main consumer of
> the oom_score_adj.
>
>> The scores
>> used are only for apps and their services. System-related processes are
>> not touched by the android lmk. The pid is only there as a unique key so
>> that lookups in the rbtree are fast. One idea was to use task_pid to
>> get a strict age of the process for round robin, but I skipped that
>> idea since it does not matter.
> Pid will not tell you anything about the age. Pids do wrap around.
>
>>> and that is what I call the abuse. The oom score calculation might
>>> change in future, of course, but all consumers of the oom_score_adj
>>> really have to agree on the base which is adjusted by this tunable
>>> otherwise you can see a lot of unexpected behavior.
>> Then can we just define a range that is strictly for user-space?
> This is already well defined. The whole range OOM_SCORE_ADJ_{MIN,MAX}
> is usable.
So we use them in userspace and kernel space but where is the abuse then?
>>> I would even argue that nobody outside of mm/oom_kill.c should really
>>> have any business with this tunable. You can of course tweak the value
>>> from the userspace and help to choose a better oom victim this way but
>>> that is it.
>> Why only help? If userspace can give an exact order to the kernel, that
>> must be a good thing; otherwise the kernel has to guess, and when
>> can that be better?
> Because userspace doesn't know who is the best victim in 99% cases.
If user-space does not tell the kernel what to do, the kernel has to guess.
Android user-space does tell it, and maybe others should too.
> Android might be different, although, I am a bit skeptical - especially
> after hearing quite some complaints about random applications being
> killed... If you do believe that you know better then, by all means,
> implement your custom user space LMK and choose the oom victim on a
> different basis but try to understand that the global OOM killer is the
> last resort measure to make the system usable again. There is a good
> reason why the kernel uses the current badness calculation. The previous
> implementation which considered the process age and other things was just
> too random to have an understandable behavior.
I think it makes sense that there is only one way to describe what is
important and what is not. That oom_kill is the last resort is itself a
problem for android. The android lowmemorykiller balances memory usage and
tries to be more proactive, and that is why shrinkers work so well.
> In any case playing nasty games with the oom killer tunables might and
> will lead, well, to unexpected behavior.
I don't follow. If we only use values OOM_SCORE_ADJ_{MIN,MAX} can
we then be "safe"?
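
To make the two selection policies in this thread concrete, here is a
simplified userspace model, not kernel code. It assumes the global oom
killer's badness is roughly "memory pages used plus oom_score_adj scaled
to total pages", and it leaves out details such as the bonus for
privileged tasks and memcg handling. The struct, the function names and
the ADJ_MIN/ADJ_MAX macros are made up for illustration; the adj-only
picker is only a rough stand-in for an lmk-style policy.

```c
#include <stddef.h>

#define ADJ_MIN (-1000)	/* stands in for OOM_SCORE_ADJ_MIN */
#define ADJ_MAX 1000	/* stands in for OOM_SCORE_ADJ_MAX */

/* Illustrative model of a task; not a kernel structure. */
struct task_model {
	int pid;
	long mem_pages;		/* rss + swap + page tables, in pages */
	int oom_score_adj;	/* ADJ_MIN .. ADJ_MAX */
};

/*
 * Assumed approximation of the global oom killer's score:
 * memory consumption, adjusted by oom_score_adj in units of
 * 1/1000 of total memory. ADJ_MIN disables selection.
 */
long badness_model(const struct task_model *t, long totalpages)
{
	long points;

	if (t->oom_score_adj == ADJ_MIN)
		return 0;

	points = t->mem_pages + (long)t->oom_score_adj * (totalpages / 1000);
	return points > 0 ? points : 1;
}

/* Pick the task with the highest adjusted badness. */
const struct task_model *pick_by_badness(const struct task_model *tasks,
					 size_t n, long totalpages)
{
	const struct task_model *victim = NULL;
	long best = 0;

	for (size_t i = 0; i < n; i++) {
		long points = badness_model(&tasks[i], totalpages);

		if (points > best) {
			best = points;
			victim = &tasks[i];
		}
	}
	return victim;
}

/* Pick purely by oom_score_adj, ignoring memory consumption. */
const struct task_model *pick_by_adj(const struct task_model *tasks, size_t n)
{
	const struct task_model *victim = NULL;

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].oom_score_adj == ADJ_MIN)
			continue;
		if (!victim || tasks[i].oom_score_adj > victim->oom_score_adj)
			victim = &tasks[i];
	}
	return victim;
}
```

With 100000 total pages, a task using 50000 pages at adj 0 and a task
using 1000 pages at adj 100, pick_by_badness() selects the first and
pick_by_adj() selects the second. That is the difference being argued
about here.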