Re: [PATCH] umh: fix UAF when the process is being killed

From: Schspa Shi
Date: Mon Dec 12 2022 - 08:47:27 EST



Schspa Shi <schspa@xxxxxxxxx> writes:

> Luis Chamberlain <mcgrof@xxxxxxxxxx> writes:
>
>> On Mon, Dec 05, 2022 at 07:38:21PM +0800, Schspa Shi wrote:
>>>
>>> Schspa Shi <schspa@xxxxxxxxx> writes:
>>>
>>> > When the process is killed, wait_for_completion_state will return with
>>> > -ERESTARTSYS, and the completion variable in the stack will be freed.
>>
>> There is no free'ing here, it's just not availabel anymore, which is
>> different.
>>
>
> No, the whole thread stack will be freed when the process died. There
> will be some cases where it will be released. It will be more accurate
> to modify it to be unavailable.
>
>>> > If the user-mode thread is complete at the same time, there will be a UAF.
>>> >
>>> > Please refer to the following scenarios.
>>> > T1 T2
>>> > ------------------------------------------------------------------
>>> > call_usermodehelper_exec
>>> > call_usermodehelper_exec_async
>>> > << do something >>
>>> > umh_complete(sub_info);
>>> > comp = xchg(&sub_info->complete, NULL);
>>> > /* we got the completion */
>>> > << context switch >>
>
> The sub_info->complete will be set to NULL.
>
>>> >
>>> > << Being killed >>
>>> > retval = wait_for_completion_state(sub_info->complete, state);
>>> > if (!retval)
>>> > goto wait_done;
>>> >
>>> > if (wait & UMH_KILLABLE) {
>>> > /* umh_complete() will see NULL and free sub_info */
>>> > if (xchg(&sub_info->complete, NULL))
>>> > goto unlock;
>>> > << we can't got the completion >>
>>
>> I'm sorry I don't understand what you tried to say here, we can't got?
>>
>
> In this scenario, the sub_info->complete will be NULL, at the
> call_usermodehelper_exec_async, and we will go to the unlock branch here.
>
>>> > }
>>> > ....
>>> > unlock:
>>> > helper_unlock();
>>> > return retval;
>>> > }
>>> >
>>> > /**
>>> > * the completion variable in stack is end of life cycle.
>>> > * and maybe freed due to process is recycled.
>>> > */
>>> > --------UAF here----------
>>> > if (comp)
>>> > complete(comp);
>>> >
>>> > To fix it, we can put the completion variable in the subprocess_info
>>> > variable.
>>> >
>>> > Reported-by: syzbot+10d19d528d9755d9af22@xxxxxxxxxxxxxxxxxxxxxxxxx
>>> > Reported-by: syzbot+70d5d5d83d03db2c813d@xxxxxxxxxxxxxxxxxxxxxxxxx
>>> > Reported-by: syzbot+83cb0411d0fcf0a30fc1@xxxxxxxxxxxxxxxxxxxxxxxxx
>>> >
>>> > Signed-off-by: Schspa Shi <schspa@xxxxxxxxx>
>>> > ---
>>> > include/linux/umh.h | 1 +
>>> > kernel/umh.c | 6 +++---
>>> > 2 files changed, 4 insertions(+), 3 deletions(-)
>>> >
>>> > diff --git a/include/linux/umh.h b/include/linux/umh.h
>>> > index 5d1f6129b847..801f7efbc825 100644
>>> > --- a/include/linux/umh.h
>>> > +++ b/include/linux/umh.h
>>> > @@ -20,6 +20,7 @@ struct file;
>>> > struct subprocess_info {
>>> > struct work_struct work;
>>> > struct completion *complete;
>>> > + struct completion done;
>>> > const char *path;
>>> > char **argv;
>>> > char **envp;
>>> > diff --git a/kernel/umh.c b/kernel/umh.c
>>> > index 850631518665..3ed39956c777 100644
>>> > --- a/kernel/umh.c
>>> > +++ b/kernel/umh.c
>>> > @@ -380,6 +380,7 @@ struct subprocess_info *call_usermodehelper_setup(const char *path, char **argv,
>>> > sub_info->cleanup = cleanup;
>>> > sub_info->init = init;
>>> > sub_info->data = data;
>>> > + init_completion(&sub_info->done);
>>> > out:
>>> > return sub_info;
>>> > }
>>> > @@ -405,7 +406,6 @@ EXPORT_SYMBOL(call_usermodehelper_setup);
>>> > int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
>>> > {
>>> > unsigned int state = TASK_UNINTERRUPTIBLE;
>>> > - DECLARE_COMPLETION_ONSTACK(done);
>>> > int retval = 0;
>>> >
>>> > if (!sub_info->path) {
>>> > @@ -431,7 +431,7 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
>>> > * This makes it possible to use umh_complete to free
>>> > * the data structure in case of UMH_NO_WAIT.
>>> > */
>>> > - sub_info->complete = (wait == UMH_NO_WAIT) ? NULL : &done;
>>> > + sub_info->complete = (wait == UMH_NO_WAIT) ? NULL : &sub_info->done;
>>> > sub_info->wait = wait;
>>> >
>>> > queue_work(system_unbound_wq, &sub_info->work);
>>> > @@ -444,7 +444,7 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
>>> > if (wait & UMH_FREEZABLE)
>>> > state |= TASK_FREEZABLE;
>>> >
>>> > - retval = wait_for_completion_state(&done, state);
>>> > + retval = wait_for_completion_state(sub_info->complete, state);
>>> > if (!retval)
>>> > goto wait_done;
>>>
>>> Hi Luis Chamberlain:
>>>
>>> Could you help to review this patch? I'm not sure why we define the
>>> amount of completion here on the stack. But this UAF can be fixed by
>>> moving the completion variable to the heap.
>>
>> It would seem to me that if this is an issue other areas would have
>> similar races as well, so I was hard pressed about the approach / fix.
>>
>
> I think other modules will have similar bugs, but this is a limitation
> on the use of the DECLARE_COMPLETION_ONSTACK macro, and it has been
> specifically stated in the completion's documentation.
>
> There is the description from completion's documentation:
>
> Note that when using completion objects as local variables you must be
> acutely aware of the short life time of the function stack: the function
> must not return to a calling context until all activities (such as waiting
> threads) have ceased and the completion object is completely unused.
>
>> Wouldn't something like this be a bit more explicit about ensuring
>> we don't let the work item race beyond?
>>
>> diff --git a/kernel/umh.c b/kernel/umh.c
>> index 850631518665..55fc698115a7 100644
>> --- a/kernel/umh.c
>> +++ b/kernel/umh.c
>> @@ -447,6 +447,8 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
>> retval = wait_for_completion_state(&done, state);
>> if (!retval)
>> goto wait_done;
>> + else if (retval == -ERESTARTSYS)
>> + cancel_work_sync(&sub_info->work);
>>
>
> I think this modification will make UMH_KILLABLE useless, we have to
> wait for this task to complete, even if it is killed.
>
>> if (wait & UMH_KILLABLE) {
>> /* umh_complete() will see NULL and free sub_info */

Hi Luis Chamberlain:

I checked the code from __kthread_create_on_node, and we can fix this
with the following change too.

I'd like to upload a V2 patch with the new solution if you prefer the
following way.

diff --git a/kernel/umh.c b/kernel/umh.c
index 850631518665..8023f11fcfc0 100644
--- a/kernel/umh.c
+++ b/kernel/umh.c
@@ -452,6 +452,11 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
/* umh_complete() will see NULL and free sub_info */
if (xchg(&sub_info->complete, NULL))
goto unlock;
+ /*
+ * kthreadd (or new kernel thread) will call complete()
+ * shortly.
+ */
+ wait_for_completion(&done);
}

wait_done:

--
BRs
Schspa Shi