Re: [PATCH] mm/oom_kill: count global and memory cgroup oom kills

From: Roman Guschin
Date: Mon May 22 2017 - 14:06:03 EST


2017-05-22 10:11 GMT+01:00 Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx>:
>
>
> On 19.05.2017 19:34, Roman Guschin wrote:
>>
>> 2017-05-19 15:22 GMT+01:00 Konstantin Khlebnikov
>> <khlebnikov@xxxxxxxxxxxxxx>:
>> From a user's point of view the difference between "oom" and "max"
>> becomes really vague here,
>> assuming that "max" is described almost in the same words:
>>
>> "The number of times the cgroup's memory usage was
>> about to go over the max boundary. If direct reclaim
>> fails to bring it down, the OOM killer is invoked."
>>
>> I wonder if it would be better to fix the existing "oom" value to show
>> what it is supposed to show, according to the docs,
>> rather than to introduce a new one?
>>
>
> Nope, they are different. I think we should rephrase the documentation somehow:
>
> low - count of reclaims below low level
> high - count of post-allocation reclaims above high level
> max - count of direct reclaims
> oom - count of failed direct reclaims
> oom_kill - count of oom killer invocations and killed processes

Definitely worth it.

Also, I would prefer to reserve "oom" for the number of OOM victims,
and to introduce something like "reclaim_failed" for the failed reclaims.
That would be consistent with the existing vmstat.
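
To make the counter semantics above a bit more concrete, here is a rough
sketch of how userspace might read such a flat "key value" events file.
The counter names follow the list quoted above; the sample values are
invented, and parse_events() is a hypothetical helper, not an existing
tool or kernel interface.

```python
def parse_events(text):
    """Parse flat 'key value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        counters[key] = int(value)
    return counters


# Hypothetical file contents: the names come from the list quoted above,
# the numbers are made up purely for illustration.
SAMPLE = """\
low 0
high 12
max 7
oom 2
oom_kill 1
"""

events = parse_events(SAMPLE)

# Under the quoted semantics, "oom" counts failed direct reclaims and
# "oom_kill" counts OOM killer invocations, so the two can differ.
print("failed direct reclaims:", events["oom"])
print("oom killer invocations:", events["oom_kill"])
```

With a "reclaim_failed"-style name, only the key looked up for the
failed reclaims would change; the parsing stays the same.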

Thanks!