Re: [patch] oom: give current access to memory reserves if it has been killed

From: David Rientjes
Date: Thu Apr 01 2010 - 04:35:17 EST


On Wed, 31 Mar 2010, Oleg Nesterov wrote:

> Probably something like the patch below makes sense. Note that
> "skip kernel threads" logic is wrong too, we should check PF_KTHREAD.
> Probably it is better to check it in select_bad_process() instead,
> near is_global_init().
>

is_global_init() will be true for p->flags & PF_KTHREAD.

> The new helper, find_lock_task_mm(), should be used by
> oom_forkbomb_penalty() too.
>
> dump_tasks() doesn't need it, it does do_each_thread(). Cough,
> __out_of_memory() and out_of_memory() call it without tasklist.
> We are going to panic() anyway, but still.
>

Indeed, good observation.

> Oleg.
>
> --- x/mm/oom_kill.c
> +++ x/mm/oom_kill.c
> @@ -129,6 +129,19 @@ static unsigned long oom_forkbomb_penalt
> (child_rss / sysctl_oom_forkbomb_thres) : 0;
> }
>
> +static struct task_struct *find_lock_task_mm(struct task_struct *p)
> +{
> + struct task_struct *t = p;
> + do {
> + task_lock(t);
> + if (likely(t->mm && !(t->flags & PF_KTHREAD)))
> + return t;
> + task_unlock(t);
> + } while_each_thread(p, t);
> +
> + return NULL;
> +}
> +
> /**
> * oom_badness - heuristic function to determine which candidate task to kill
> * @p: task struct of which task we should calculate
> @@ -159,13 +172,9 @@ unsigned int oom_badness(struct task_str
> if (p->flags & PF_OOM_ORIGIN)
> return 1000;
>
> - task_lock(p);
> - mm = p->mm;
> - if (!mm) {
> - task_unlock(p);
> + p = find_lock_task_mm(p);
> + if (!p)
> return 0;
> - }
> -
> /*
> * The baseline for the badness score is the proportion of RAM that each
> * task's rss and swap space use.
> @@ -330,12 +339,6 @@ static struct task_struct *select_bad_pr
> *ppoints = 1000;
> }
>
> - /*
> - * skip kernel threads and tasks which have already released
> - * their mm.
> - */
> - if (!p->mm)
> - continue;
> if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
> continue;

You can't do this for the reason I cited in another email: oom_badness()
returning 0 does not exclude a task from being chosen by
select_bad_process(), which will use that task if nothing else has been
found yet. We must explicitly filter it from consideration by checking for
!p->mm.
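
To make the point concrete, here is a rough user-space model of the
selection loop. It is only a sketch, not the kernel code: struct
model_task, model_badness(), model_select() and the filter_no_mm flag are
hypothetical names invented for illustration, and the "take it if nothing
has been chosen yet" condition is my simplified reading of the behavior
described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct task_struct. */
struct model_task {
	const char *comm;	/* task name, for readability only */
	int has_mm;		/* 0 models p->mm == NULL */
	unsigned int rss;	/* stands in for the task's memory use */
};

/* Models oom_badness(): a task without an mm scores 0. */
static unsigned int model_badness(const struct model_task *p)
{
	if (!p->has_mm)
		return 0;
	return p->rss;
}

/*
 * Models the selection loop. When filter_no_mm is non-zero, tasks
 * without an mm are skipped explicitly before they are scored.
 * Returns the chosen task, or NULL if no task is eligible.
 */
static const struct model_task *model_select(const struct model_task *tasks,
					     size_t n, int filter_no_mm,
					     unsigned int *ppoints)
{
	const struct model_task *chosen = NULL;

	assert(ppoints != NULL);
	*ppoints = 0;

	for (size_t i = 0; i < n; i++) {
		const struct model_task *p = &tasks[i];
		unsigned int points;

		/* The explicit check this mail argues must stay. */
		if (filter_no_mm && !p->has_mm)
			continue;

		points = model_badness(p);
		/* A score of 0 still wins when nothing is chosen yet. */
		if (points > *ppoints || !chosen) {
			chosen = p;
			*ppoints = points;
		}
	}
	return chosen;
}
```

In this model, a task list holding a single mm-less task is selected with
0 points when filter_no_mm is 0, while model_select() returns NULL for the
same list when filter_no_mm is 1.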