Re: [PATCH] mm: oom: Fix race condition between oom_badness and do_exit of task

From: David Rientjes
Date: Wed Mar 07 2018 - 15:56:32 EST


On Wed, 7 Mar 2018, Gaurav Kohli wrote:

> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 6fd9773..5f4cc4b 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -114,9 +114,11 @@ struct task_struct *find_lock_task_mm(struct task_struct *p)
>
>  	for_each_thread(p, t) {
>  		task_lock(t);
> +		get_task_struct(t);
>  		if (likely(t->mm))
>  			goto found;
>  		task_unlock(t);
> +		put_task_struct(t);
>  	}
>  	t = NULL;
>  found:

We hold rcu_read_lock() here, so perhaps only do get_task_struct() once we
have a non-NULL t, just before doing rcu_read_unlock()?
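
To make the suggestion concrete, here is a minimal userspace sketch of that
shape. It is not kernel code: struct task_struct, the usage and locked
fields, the next_thread link (standing in for for_each_thread()), the
rcu_depth counter and all of the helper bodies are simplified stand-ins
assumed for illustration only. The one point it tries to show is that the
reference is taken a single time, for the thread that is actually returned,
while the RCU read-side section is still open.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for illustration, NOT the kernel definitions. */
struct mm_struct { int dummy; };

struct task_struct {
	int usage;                        /* stand-in for the task refcount */
	int locked;                       /* stand-in for task_lock() state */
	struct mm_struct *mm;
	struct task_struct *next_thread;  /* stand-in for the thread list */
};

static int rcu_depth;                     /* tracks the mock RCU section */

static inline void rcu_read_lock(void) { rcu_depth++; }
static inline void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }
static inline void task_lock(struct task_struct *t) { t->locked = 1; }
static inline void task_unlock(struct task_struct *t) { t->locked = 0; }
static inline void get_task_struct(struct task_struct *t) { t->usage++; }
static inline void put_task_struct(struct task_struct *t)
{
	assert(t->usage > 0);             /* catch unbalanced puts in the mock */
	t->usage--;
}

/*
 * Sketch of the suggested shape: no get/put inside the loop, one
 * get_task_struct() on the thread we return, before rcu_read_unlock().
 */
static inline struct task_struct *find_lock_task_mm(struct task_struct *p)
{
	struct task_struct *t;

	rcu_read_lock();
	for (t = p; t; t = t->next_thread) {  /* for_each_thread() stand-in */
		task_lock(t);
		if (t->mm)
			goto found;
		task_unlock(t);
	}
	t = NULL;
found:
	if (t)
		get_task_struct(t);
	rcu_read_unlock();
	return t;
}
```

With this shape a caller that gets a non-NULL t back owns both the task lock
and one reference, and has to release both.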

> @@ -191,6 +193,7 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
>  			test_bit(MMF_OOM_SKIP, &p->mm->flags) ||
>  			in_vfork(p)) {
>  		task_unlock(p);
> +		put_task_struct(p);
>  		return 0;
>  	}
>
> @@ -208,7 +211,7 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
>  	 */
>  	if (has_capability_noaudit(p, CAP_SYS_ADMIN))
>  		points -= (points * 3) / 100;
> -
> +	put_task_struct(p);
>  	/* Normalize to oom_score_adj units */
>  	adj *= totalpages / 1000;
>  	points += adj;

This fixes up oom_badness(), but there are other users of
find_lock_task_mm() in the oom killer as well as in other subsystems, and
each of them would presumably need a matching put_task_struct() too.
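
As a rough illustration of what that means for any one caller, here is a
small userspace mock. The function sample_total_vm() is hypothetical and
does not exist in the kernel, and the struct layouts and helper bodies are
simplified stand-ins, with find_lock_task_mm() reduced to the single-thread
case. It assumes a find_lock_task_mm() that returns the task both locked
and with an extra reference, and shows the unlock/put pair a caller would
then have to add on its success path.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for illustration, NOT the kernel definitions. */
struct mm_struct { unsigned long total_vm; };

struct task_struct {
	int usage;              /* stand-in for the task refcount */
	int locked;             /* stand-in for task_lock() state */
	struct mm_struct *mm;
};

static inline void task_unlock(struct task_struct *t) { t->locked = 0; }
static inline void put_task_struct(struct task_struct *t)
{
	assert(t->usage > 0);   /* catch unbalanced puts in the mock */
	t->usage--;
}

/* Mock: returns p locked and referenced if it has an mm, else NULL. */
static inline struct task_struct *find_lock_task_mm(struct task_struct *p)
{
	if (!p->mm)
		return NULL;
	p->locked = 1;
	p->usage++;
	return p;
}

/* Hypothetical caller: reads one field from the mm, then cleans up. */
unsigned long sample_total_vm(struct task_struct *p)
{
	struct task_struct *t = find_lock_task_mm(p);
	unsigned long pages;

	if (!t)
		return 0;       /* nothing was locked or referenced */

	pages = t->mm->total_vm;
	task_unlock(t);
	put_task_struct(t);     /* the extra step every caller would need */
	return pages;
}
```

A caller that keeps only the task_unlock() would leave the mock's usage
count one too high after every successful lookup.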