Re: [PATCH] fgraph: Convert ret_stack tasklist scanning to rcu

From: Steven Rostedt
Date: Sat Sep 19 2020 - 09:56:25 EST


On Sat, 19 Sep 2020 13:24:39 +0200
Oleg Nesterov <oleg@xxxxxxxxxx> wrote:

> On 09/18, Steven Rostedt wrote:
> >
> > On Mon, 7 Sep 2020 13:43:02 +0200
> > Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >
> > > Afaics, with or without this change alloc_retstack_tasklist() can race
> > > with copy_process() and miss the new child; ftrace_graph_init_task()
> > > can't help, ftrace_graph_active can be set right after the check and
> > > for_each_process_thread() can't see the new process yet.
> >
> > There's a call in copy_process(): ftrace_graph_init_task() that initializes
> > a new tasks ret_stack,
>
> Only if ftrace_graph_active != 0.
>
> register_ftrace_graph() can increment ftrace_graph_active and call
> alloc_retstack_tasklist() right after ftrace_graph_init_task() checks
> ftrace_graph_active.
>
> > and this loop will ignore it
>
> and this loop won't see it unless the forking process finishes copy_process()
> and does list_add_tail_rcu(&p->tasks, &init_task.tasks) which makes it
> visible to for_each_process(). Yes, this is very unlikely.

Ah, I see what you mean. Hmm, not sure the best way to fix this. It
would be very rare to trigger and the only thing that it would do is
not to trace the new task. But that could be frustrating if that
happens. I guess we could put the hook after it gets added to the list,
and use a cmpxchg to update the ret_stack, to make sure we don't race
on initialization and copy_process.

-- Steve