Re: [for-next][PATCH 4/4] ftrace: Add comment to why rcu_dereference_sched() is open coded

From: Joel Fernandes
Date: Wed Feb 05 2020 - 11:08:28 EST


On Wed, Feb 05, 2020 at 10:49:45AM -0500, Steven Rostedt wrote:
> On Wed, 5 Feb 2020 10:42:12 -0500
> Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
>
> > On Wed, Feb 05, 2020 at 09:28:47AM -0500, Steven Rostedt wrote:
> > > On Wed, 5 Feb 2020 09:19:15 -0500
> > > Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> > >
> > > > Could you paste the stack here when RCU is not watching? In trace event code
> > > > IIRC we call rcu_enter_irqs_on() to have RCU temporarily watch, since that
> > > > code can be called from idle loop. Should we be doing the same here as well?
> > >
> > > Unfortunately I lost the stack trace. And the last time we tried to use
> > > rcu_enter_irqs_on() for ftrace, we couldn't find a way to do this
> > > properly. Ftrace is much more invasive than going into idle. The
> > > problem is that ftrace traces RCU itself, and calling
> > > "rcu_enter_irqs_on()" in pretty much any place in the RCU code caused
> > > lots of bugs ;-)
> > >
> > > This is why we have the schedule_on_each_cpu(ftrace_sync) hack.
> >
> > The "schedule a task on each CPU" trick works on !PREEMPT though, right?
>
> It works on both, as I care more about the PREEMPT=y case than
> the !PREEMPT, and the PREEMPT_RT which is even more preemptive than
> PREEMPT!
>
> >
> > Because it is possible in PREEMPT=y to get preempted in the middle of a
> > read-side critical section, switch to the worker thread executing the
> > ftrace_sync() and then switch back. But RCU still has to watch that CPU since
> > the read-side critical section was not completed.
> >
> > Or is there a subtlety here with ftrace that I missed?
> >
>
> Hence Amol's patch:
>
> > +	notrace_hash = rcu_dereference_protected(ftrace_graph_notrace_hash,
> > +						 !preemptible());
>
> It checks to make sure preemption is off. There is no chance of being
> preempted in the read side critical section.

Yes, this makes sense. Sorry for the noise. For "sched" RCU cases,
scheduling on each CPU would work regardless of PREEMPT configuration.

(I guess I was confusing this case with the non-sched RCU usages, such as
rcu_read_lock(), where scheduling a task on each CPU obviously would not work
with PREEMPT=y.)

By the way, would SRCU not work instead of the ftrace_sync() technique? Or is
the concern that SRCU cannot be used from NMI?

thanks,

- Joel