Re: [PATCH] umh: fix UAF when the process is being killed

From: Luis Chamberlain
Date: Wed Dec 14 2022 - 15:04:30 EST


Peter, Ingo, Steven would like you're review.

On Tue, Dec 13, 2022 at 03:03:53PM -0800, Luis Chamberlain wrote:
> On Mon, Dec 12, 2022 at 09:38:31PM +0800, Schspa Shi wrote:
> > I'd like to upload a V2 patch with the new solution if you prefer the
> > following way.
> >
> > diff --git a/kernel/umh.c b/kernel/umh.c
> > index 850631518665..8023f11fcfc0 100644
> > --- a/kernel/umh.c
> > +++ b/kernel/umh.c
> > @@ -452,6 +452,11 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
> > /* umh_complete() will see NULL and free sub_info */
> > if (xchg(&sub_info->complete, NULL))
> > goto unlock;
> > + /*
> > + * kthreadd (or new kernel thread) will call complete()
> > + * shortly.
> > + */
> > + wait_for_completion(&done);
> > }
>
> Yes much better. Did you verify it fixes the splat found by the bots?

Wait, I'm not sure yet why this would fix it... I first started thinking
that this may be a good example of a Coccinelle SmPL rule, something like:

DECLARE_COMPLETION_ONSTACK(done);
foo *foo;
...
foo->completion = &done;
...
queue_work(system_unbound_wq, &foo->work);
....
ret = wait_for_completion_state(&done, state);
...
if (!ret)
S
...
+wait_for_completion(&done);

But that is pretty complex, and while it may be useful to know how many
patterns we have like this, it begs the question if generalizing this
inside the callers is best for -ERESTARTSYS condition is best. What
do folks think?

The rationale here is that if you queue stuff and give access to the
completion variable but its on-stack obviously you can end up with the
queued stuff complete() on a on-stack variable. The issue seems to
be that wait_for_completion_state() for -ERESTARTSYS still means
that the already scheduled queue'd work is *about* to run and
the process with the completion on-stack completed. So we race with
the end of the routine and the completion on-stack.

It makes me wonder if wait_for_completion() above really is doing
something more, if it is just helping with timing and is still error
prone.

The queued work will try the the completion as follows:

static void umh_complete(struct subprocess_info *sub_info)
{
struct completion *comp = xchg(&sub_info->complete, NULL);
/*
* See call_usermodehelper_exec(). If xchg() returns NULL
* we own sub_info, the UMH_KILLABLE caller has gone away
* or the caller used UMH_NO_WAIT.
*/
if (comp)
complete(comp);
else
call_usermodehelper_freeinfo(sub_info);
}

So the race is getting -ERESTARTSYS on the process with completion
on-stack and the above running complete(comp). Why would sprinkling
wait_for_completion(&done) *after* wait_for_completion_state(&done, state)
fix this UAF?
}
diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
index d57a5c1c1cd9..aa7031faca04 100644
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -205,8 +205,10 @@ int __sched wait_for_completion_interruptible(struct completion *x)
{
long t = wait_for_common(x, MAX_SCHEDULE_TIMEOUT, TASK_INTERRUPTIBLE);

- if (t == -ERESTARTSYS)
+ if (t == -ERESTARTSYS) {
+ wait_for_completion(x);
return t;
+ }
return 0;
}
EXPORT_SYMBOL(wait_for_completion_interruptible);
@@ -243,8 +245,10 @@ int __sched wait_for_completion_killable(struct completion *x)
{
long t = wait_for_common(x, MAX_SCHEDULE_TIMEOUT, TASK_KILLABLE);

- if (t == -ERESTARTSYS)
+ if (t == -ERESTARTSYS) {
+ wait_for_completion(x);
return t;
+ }
return 0;
}
EXPORT_SYMBOL(wait_for_completion_killable);
@@ -253,8 +257,10 @@ int __sched wait_for_completion_state(struct completion *x, unsigned int state)
{
long t = wait_for_common(x, MAX_SCHEDULE_TIMEOUT, state);

- if (t == -ERESTARTSYS)
+ if (t == -ERESTARTSYS) {
+ wait_for_completion(x);
return t;
+ }
return 0;
}
EXPORT_SYMBOL(wait_for_completion_state);