Re: [v6 1/4] mm, oom: refactor the oom_kill_process() function
From: Michal Hocko
Date: Thu Aug 24 2017 - 07:15:54 EST
This patch fails to apply on top of the mmotm tree. It seems the only
reason is the missing
http://lkml.kernel.org/r/20170810075019.28998-2-mhocko@xxxxxxxxxx
On Wed 23-08-17 17:51:57, Roman Gushchin wrote:
> The oom_kill_process() function consists of two logical parts:
> the first one is responsible for considering task's children as
> a potential victim and printing the debug information.
> The second half is responsible for sending SIGKILL to all
> tasks sharing the mm struct with the given victim.
>
> This commit splits the oom_kill_process() function with
> an intention to re-use the the second half: __oom_kill_process().
Yes this makes some sense even without further changes.
> The cgroup-aware OOM killer will kill multiple tasks
> belonging to the victim cgroup. We don't need to print
> the debug information for the each task, as well as play
> with task selection (considering task's children),
> so we can't use the existing oom_kill_process().
>
> Signed-off-by: Roman Gushchin <guro@xxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxxxx>
> Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Cc: David Rientjes <rientjes@xxxxxxxxxx>
> Cc: Tejun Heo <tj@xxxxxxxxxx>
> Cc: kernel-team@xxxxxx
> Cc: cgroups@xxxxxxxxxxxxxxx
> Cc: linux-doc@xxxxxxxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Cc: linux-mm@xxxxxxxxx
I do agree with the patch there is just one thing to fix up.
> ---
> mm/oom_kill.c | 123 +++++++++++++++++++++++++++++++---------------------------
> 1 file changed, 65 insertions(+), 58 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 53b44425ef35..5c29a3dd591b 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -817,67 +817,12 @@ static bool task_will_free_mem(struct task_struct *task)
> return ret;
> }
>
> -static void oom_kill_process(struct oom_control *oc, const char *message)
> +static void __oom_kill_process(struct task_struct *victim)
> {
[...]
> p = find_lock_task_mm(victim);
> if (!p) {
> put_task_struct(victim);
The context doesn't tell us but there is return right after this.
p = find_lock_task_mm(victim);
if (!p) {
put_task_struct(victim);
return;
} else if (victim != p) {
get_task_struct(p);
put_task_struct(victim);
victim = p;
}
So we return with the reference dropped. Moreover we can change
the victim, drop the reference on old one...
> +static void oom_kill_process(struct oom_control *oc, const char *message)
> +{
[...]
> + __oom_kill_process(victim);
> + put_task_struct(victim);
while we drop it here again and won't drop the changed one. If we race
with the exiting task and there is no mm then we we double drop as well.
So I think that __oom_kill_process should really drop the reference for
all cases and oom_kill_process shouldn't care. Or if you absolutely
need a guarantee that the victim won't go away after __oom_kill_process
then you need to return the real victim and let the caller to deal with
put_task_struct.
--
Michal Hocko
SUSE Labs