Re: [PATCH v2] mm,oom: exclude oom_task_origin processes if they are OOM-unkillable.
From: David Rientjes
Date: Tue Feb 23 2016 - 17:33:11 EST
On Tue, 23 Feb 2016, Michal Hocko wrote:
> > oom_badness() ranges from 0 (don't kill) to 1000 (please kill). It
> > factors in the setting of /proc/self/oom_score_adj to change that value.
> > That is where OOM_SCORE_ADJ_MIN is enforced.
>
> The question is whether the current placement of OOM_SCORE_ADJ_MIN
> is appropriate. Wouldn't it make more sense to check it in oom_unkillable_task
> instead?
oom_unkillable_task() deals with the type of task it is (init or kthread)
or being ineligible due to the memcg and cpuset placement. We want to
exclude them from consideration and also suppress them from the task dump
in the kernel log. We don't want to suppress oom disabled processes; we
really want to know their rss, for example. It could be renamed
is_ineligible_task().
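
To illustrate the distinction, here is a minimal userspace sketch, not the
kernel code. struct toy_task and every toy_* helper are hypothetical names
invented for this example; only OOM_SCORE_ADJ_MIN (-1000) and the
is_ineligible_task() idea come from the discussion. It assumes that an
ineligible task is hidden from both victim selection and the task dump,
while an oom disabled task is only exempt from being killed.

```c
#include <stdbool.h>

#define OOM_SCORE_ADJ_MIN (-1000)

/* Hypothetical, heavily simplified stand-in for struct task_struct. */
struct toy_task {
    bool is_init;            /* global init */
    bool is_kthread;         /* kernel thread, no user memory */
    bool in_oom_domain;      /* overlaps the memcg/cpuset that hit OOM */
    int oom_score_adj;       /* value of /proc/<pid>/oom_score_adj */
    unsigned long rss_pages; /* what we want to see in the dump */
};

/* Ineligible by type or placement: never considered, never dumped. */
static bool toy_is_ineligible_task(const struct toy_task *t)
{
    return t->is_init || t->is_kthread || !t->in_oom_domain;
}

/* The task dump hides only ineligible tasks, so the rss of an
 * oom disabled task still shows up in the log. */
static bool toy_show_in_dump(const struct toy_task *t)
{
    return !toy_is_ineligible_task(t);
}

/* Victim selection additionally honours OOM_SCORE_ADJ_MIN. */
static bool toy_may_be_victim(const struct toy_task *t)
{
    return !toy_is_ineligible_task(t) &&
           t->oom_score_adj != OOM_SCORE_ADJ_MIN;
}
```

With this split, a task with oom_score_adj set to OOM_SCORE_ADJ_MIN passes
toy_show_in_dump() but fails toy_may_be_victim(), whereas a kthread fails
both.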
> Sure, checking oom_score_adj under task_lock inside oom_badness will
> prevent from races but the question I raised previously was whether we
> actually care about those races? When would it matter? Is it really
> likely that the update happen during the oom killing? And if yes what
> prevents from the update happening _after_ the check?
>
It's not necessary to take task_lock(), but find_lock_task_mm() is the
means we have to iterate threads to find any with memory attached. We
need that logic in oom_badness() to avoid racing with threads that have
entered exit_mm(). It's possible for a thread to have a non-NULL ->mm in
oom_scan_process_thread(), the thread enters exit_mm() without being
killed, and oom_badness() can still find it to be eligible because other
threads have not exited. We still want to issue a kill to this process,
and task_lock() protects the setting of task->mm to NULL: don't consider
it to be a race in setting oom_score_adj, consider it to be a race in
unmapping (but not freeing) memory in the exit path.
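
The thread iteration described above can be modelled roughly as follows.
This is a single-threaded userspace sketch under stated assumptions, not
the kernel implementation: struct toy_thread, struct toy_mm and all toy_*
functions are hypothetical, and the lock is a plain flag standing in for
task_lock().

```c
#include <stddef.h>

/* Hypothetical stand-in for struct mm_struct. */
struct toy_mm {
    unsigned long rss_pages;
};

/* Hypothetical stand-in for one thread of a thread group. */
struct toy_thread {
    struct toy_mm *mm; /* set to NULL by the exit path */
    int locked;        /* flag standing in for task_lock() */
};

static void toy_task_lock(struct toy_thread *t)   { t->locked = 1; }
static void toy_task_unlock(struct toy_thread *t) { t->locked = 0; }

/* Exit path: detach the mm under the lock (unmap, but not free). */
static void toy_exit_mm(struct toy_thread *t)
{
    toy_task_lock(t);
    t->mm = NULL;
    toy_task_unlock(t);
}

/*
 * Walk the threads of a group and return the first one that still has
 * memory attached, with its lock held so ->mm cannot go to NULL under
 * the caller. Returns NULL if every thread has passed exit_mm().
 */
static struct toy_thread *toy_find_lock_task_mm(struct toy_thread *threads,
                                                size_t n)
{
    for (size_t i = 0; i < n; i++) {
        toy_task_lock(&threads[i]);
        if (threads[i].mm)
            return &threads[i]; /* caller must toy_task_unlock() */
        toy_task_unlock(&threads[i]);
    }
    return NULL;
}
```

If the first thread of a two-thread group has gone through toy_exit_mm(),
the lookup still returns the second thread, so the process stays eligible
for a kill; only once every thread has detached its mm does the lookup
return NULL.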