Re: [PATCH v2] umh: fix out of scope usage when the process is being killed

From: Luis Chamberlain
Date: Wed Dec 14 2022 - 15:06:06 EST


On Wed, Dec 14, 2022 at 09:46:56PM +0800, Schspa Shi wrote:
> When the process is killed, wait_for_completion_state will return with
> -ERESTARTSYS, and the completion variable in the stack will be unavailable,
> even freed. If the user-mode thread is complete at the same time, there
> will be a race to use an unavailable variable.
>
> Please refer to the following scenarios.
>            T1                                  T2
> ------------------------------------------------------------------
> call_usermodehelper_exec
>                                    call_usermodehelper_exec_async
>                                    << do something >>
>                                    umh_complete(sub_info);
>                                    comp = xchg(&sub_info->complete, NULL);
>                                    /* we got the completion */
>                                    << context switch >>
>
> << Being killed >>
> retval = wait_for_completion_state(sub_info->complete, state);
> if (!retval)
>         goto wait_done;
>
> if (wait & UMH_KILLABLE) {
>         /* umh_complete() will see NULL and free sub_info */
>         if (xchg(&sub_info->complete, NULL))
>                 goto unlock;
>         << we can't got the completion, because T2 take it already >>
> }
> ....
> return retval;
> }
>
> /**
>  * the completion variable in stack is end of life cycle.
>  * and maybe freed due to process is recycled.
>  */
>                                    -------- BUG here----------
>                                    if (comp)
>                                            complete(comp);
>
> To fix it, we can add an additional wait_for_completion to ensure the
> completion object is completely unused. And this is what
> kthread_create_on_node does to handle this race.
>
> Reported-by: syzbot+10d19d528d9755d9af22@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+70d5d5d83d03db2c813d@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+83cb0411d0fcf0a30fc1@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+c92c6a251d49ceceb625@xxxxxxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Schspa Shi <schspa@xxxxxxxxx>
> ---

Please fix the commit log a bit more with the context I provided, *if*
on the other thread the community agrees that the approach should be
compartmentalized. I am still not sure why this would fix the
UAF after thinking about it some more, and the issue would mean
there likely could be a generic fix / issue to consider.

So for now I'd like more review of this race and the proposed fix
as I mentioned in the follow-up thread in your v1 patch. Let's
follow up there and see how that discussion goes.

Luis