Re: [PATCH] oom: fix the unsafe proc_oom_score()->badness() call

From: Oleg Nesterov
Date: Wed Mar 31 2010 - 16:20:11 EST


On 03/31, Oleg Nesterov wrote:
>
> On 03/30, David Rientjes wrote:
> >
> > On Tue, 30 Mar 2010, Oleg Nesterov wrote:
> >
> > > proc_oom_score(task) have a reference to task_struct, but that is all.
> > > If this task was already released before we take tasklist_lock
> > >
> > > - we can't use task->group_leader, it points to nowhere
> > >
> > > - it is not safe to call badness() even if this task is
> > > ->group_leader, has_intersects_mems_allowed() assumes
> > > it is safe to iterate over ->thread_group list.
> > >
> > > Add the pid_alive() check to ensure __unhash_process() was not called.
> > >
> > > Note: I think we shouldn't use ->group_leader, badness() should return
> > > the same result for any sub-thread. However this is not true currently,
> > > and I think that ->mm check and list_for_each_entry(p->children) in
> > > badness are not right.
> > >
> >
> > I think it would be better to just use task and not task->group_leader.
>
> Sure, agreed. I preserved ->group_leader just because I didn't understand
> why the current code doesn't use task. But note that pid_alive() is still
> needed.

Oh. No, with the current code in -mm pid_alive() is not needed if
we use task instead of task->group_leader. But once we fix
oom_forkbomb_penalty() it will be needed again.


But. Oh well. David, oom-badness-heuristic-rewrite.patch changed badness()
to consult p->signal->oom_score_adj. Until recently this was wrong when it
is called from proc_oom_score().

This means oom-badness-heuristic-rewrite.patch depends on
signals-make-task_struct-signal-immutable-refcountable.patch, or we
need the pid_alive() check again.



oom_badness() gets the new argument, long totalpages, and the callers
were updated. However, long uptime is not used any longer, probably
it make sense to kill this arg and simplify the callers? Unless you
are going to take run-time into account later.

So, I think -mm needs the patch below, but I have no idea how to
write the changelog ;)

Oleg.

--- x/fs/proc/base.c
+++ x/fs/proc/base.c
@@ -430,12 +430,13 @@ static const struct file_operations proc
/* The badness from the OOM killer */
static int proc_oom_score(struct task_struct *task, char *buffer)
{
- unsigned long points;
+ unsigned long points = 0;
struct timespec uptime;

do_posix_clock_monotonic_gettime(&uptime);
read_lock(&tasklist_lock);
- points = oom_badness(task->group_leader,
+ if (pid_alive(task))
+ points = oom_badness(task,
global_page_state(NR_INACTIVE_ANON) +
global_page_state(NR_ACTIVE_ANON) +
global_page_state(NR_INACTIVE_FILE) +

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/