Re: [PATCH] fgraph: Convert ret_stack tasklist scanning to rcu

From: Steven Rostedt
Date: Fri Sep 18 2020 - 13:12:05 EST

[ Back from my PTO and still digging out emails ]

On Mon, 7 Sep 2020 13:43:02 +0200
Oleg Nesterov <oleg@xxxxxxxxxx> wrote:

> On 09/06, Davidlohr Bueso wrote:
> >
> > Here tasklist_lock does not protect anything other than the list
> > against concurrent fork/exit. And considering that the whole thing
> > is capped by FTRACE_RETSTACK_ALLOC_SIZE (32), it should not be a
> > problem to have a pontentially stale, yet stable, list. The task cannot
> > go away either, so we don't risk racing with ftrace_graph_exit_task()
> > which clears the retstack.
> I don't understand this code but I think you right, tasklist_lock buys
> nothing.

When I first wrote this code, I didn't want to take tasklist_lock, but
there was questions if rcu_read_lock() was enough. And since this code is
far from a fast path, I decided it was better to be safe than sorry, and
took the tasklist_lock as a paranoid measure.

> Afaics, with or without this change alloc_retstack_tasklist() can race
> with copy_process() and miss the new child; ftrace_graph_init_task()
> can't help, ftrace_graph_active can be set right after the check and
> for_each_process_thread() can't see the new process yet.

There's a call in copy_process(): ftrace_graph_init_task() that initializes
a new tasks ret_stack, and this loop will ignore it because it first checks
to see if the task has a ret_stack before adding one to it. And the child
gets one before being added to the list.

> This can't race with ftrace_graph_exit_task(), it is called after the
> full gp pass. But this function looks very confusing to me, I don't
> understand the barrier and the "NULL must become visible to IRQs before
> we free it" comment.

Probably not needed, but again, being very paranoid, as to not crash
anything. If this is called on a task that is running, and an interrupt
comes in after it is freed, but before the ret_stack variable is set to
NULL, then it will try to use it. I don't think this is possible, but it
may have been in the past.

> Looks like, ftrace_graph_exit_task() was called by the exiting task
> in the past? Indeed, see 65afa5e603d50 ("tracing/function-return-tracer:
> free the return stack on free_task()"). I think it makes sense to
> simplify this function now, it can simply do kfree(t->ret_stack) and
> nothing more.

Ah, yeah, then you are right. If it can't be called on a running task then
it can be simplified. Probably need a:


just to make sure this never happens.

> ACK, but ...
> > @@ -387,8 +387,8 @@ static int alloc_retstack_tasklist(struct ftrace_ret_stack **ret_stack_list)
> > }
> > }
> >
> > - read_lock(&tasklist_lock);
> then you should probably rename alloc_retstack_tasklist() ?

tasklist, process thead? Is there a difference?

Thanks for reviewing this!

-- Steve