Re: [PATCH] fork/pid: Fix use-after-free in __task_pid_nr_ns
From: Qing Wang
Date: Tue Jan 06 2026 - 05:06:45 EST
On Tue, 06 Jan 2026 at 17:04, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> Sorry, this description very confusing to me... Is it Task B who does
> clone? Or another Task A does copy_process() ? Could you write a more
> clear changelog?
The "<---...---clone" arrow in the graph may have misled you. What I meant
was that Task A is cloned from Task B, i.e. Task B is the parent.
Here is the revised bug timeline, with an explanation:
                                        Task B
                                        perf_event_open()
Task A <------------------------------- clone()
copy_process()
  perf_event_init_task()
  ...
  one copy failed
  free_signal_struct()
                                        close(event_fd)
                                          perf_child_detach()
                                            __task_pid_nr_ns()
                                              access child task->signal
  perf_event_init_task()
1. Task B creates perf events with perf_event_open().
2. Task B clones Task A, and Task A gets the perf events copied from
   Task B during this clone().
3. In Task A's copy_process(), one of the copy steps (e.g. copy_mm) fails,
   so it jumps to the cleanup path and calls free_signal_struct().
4. Task B calls close(event_fd) and accesses Task A's signal after
   free_signal_struct() but before perf_event_init_task() runs in Task A.
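
To make the ordering in steps 2-4 concrete, here is a minimal user-space
sketch of the window. Everything in it is a hypothetical stand-in (the
model_* names are not kernel structures or functions), and "freeing" is
only a flag so the program stays well defined. It is an illustration of
the timeline under those assumptions, not the kernel code path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are not the kernel's structures. */
struct model_signal { bool freed; };
struct model_task   { struct model_signal *signal; };
struct model_event  { struct model_task *child; }; /* parent event -> child task */

/* Step 2: the clone copies the parent's event and links it to the child. */
static void model_inherit(struct model_event *ev, struct model_task *child)
{
	ev->child = child;
}

/* Step 3: a copy step fails and the cleanup path releases child->signal. */
static void model_free_signal(struct model_task *child)
{
	assert(child->signal != NULL);
	child->signal->freed = true; /* flag only; no memory is actually freed */
}

/* Later cleanup in Task A: the inherited event is unlinked from the child. */
static void model_detach(struct model_event *ev)
{
	ev->child = NULL;
}

/* Step 4: would close(event_fd) in Task B reach a released signal struct? */
static bool model_close_sees_freed_signal(const struct model_event *ev)
{
	return ev->child != NULL && ev->child->signal->freed;
}

/*
 * Replays the timeline. close_in_window selects whether step 4 runs between
 * model_free_signal() and model_detach(), or only after both have finished.
 */
bool model_replay(bool close_in_window)
{
	struct model_signal sig = { .freed = false };
	struct model_task child = { .signal = &sig };
	struct model_event ev = { .child = NULL };
	bool hit = false;

	model_inherit(&ev, &child);
	model_free_signal(&child);
	if (close_in_window)
		hit = model_close_sees_freed_signal(&ev);
	model_detach(&ev);
	if (!close_in_window)
		hit = model_close_sees_freed_signal(&ev);
	return hit;
}
```

In this model, model_replay(true) returns true because the event still
points at the child while its signal is already marked freed, and
model_replay(false) returns false because the link is gone by the time
the close runs.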
> At first glance this is racy. Can't task->signal be freed right after
> the check?
>
> And... Can't we make another fix? If copy_process() fails and does
> free_signal_struct(), the child has not been added to rcu protected
> lists and init_task_pid(child) was not called yet.
>
> So perhaps something like the patch below can work?
>
> Oleg.
> ---
>
> --- x/kernel/events/core.c
> +++ x/kernel/events/core.c
> @@ -1422,16 +1422,17 @@ unclone_ctx(struct perf_event_context *c
>  static u32 perf_event_pid_type(struct perf_event *event, struct task_struct *p,
>  				enum pid_type type)
>  {
> -	u32 nr;
> +	u32 nr = 0;
>  	/*
>  	 * only top level events have the pid namespace they were created in
>  	 */
>  	if (event->parent)
>  		event = event->parent;
>
> -	nr = __task_pid_nr_ns(p, type, event->ns);
> +	if (pid_alive(p))
> +		nr = __task_pid_nr_ns(p, type, event->ns);
>  	/* avoid -1 if it is idle thread or runs in another ns */
> -	if (!nr && !pid_alive(p))
> +	if (!nr)
>  		nr = -1;
>  	return nr;
>  }
I don't think this works, for the reason I explained in my previous reply
to Andrew:
A newly created task should not be visible to other CPUs while it is
still being created, but perf makes it visible. The perf subsystem copies
the parent's events to the child during copy_process(). Later, when the
parent closes its own perf event, it may traverse the child events and
access child_ctx->task->signal. This means that a child process that has
not yet been fully created can be referenced by other CPUs.