Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

From: Steven Rostedt
Date: Tue Aug 07 2018 - 22:29:03 EST


On Tue, 7 Aug 2018 19:13:32 -0700
Joel Fernandes <joelaf@xxxxxxxxxx> wrote:

> On Tue, Aug 7, 2018 at 6:55 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> > On Tue, 7 Aug 2018 18:17:42 -0700
> > Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
> >
> >> From 6986af946ceb04fc9ddc6d5b45fc559b6807e465 Mon Sep 17 00:00:00 2001
> >> From: "Joel Fernandes (Google)" <joel@xxxxxxxxxxxxxxxxx>
> >> Date: Sun, 5 Aug 2018 20:17:41 -0700
> >> Subject: [PATCH] tracepoint: Run tracepoints even after CPU is offline
> >>
> >> Commit f37755490fe9 ("tracepoints: Do not trace when cpu is offline")
> >> causes a problem for lockdep using tracepoint code. Once a CPU is
> >> offline, tracepoints donot get called, however this causes a big problem
> >> for lockdep probes that need to run so that IRQ annotations are marked
> >> correctly.
> >>
> >> A race is possible where while the CPU is going offline, an interrupt
> >> can come in and then a lockdep assert causes an annotation warning:
> >>
> >> [ 106.551354] IRQs not enabled as expected
> >> [ 106.551785] WARNING: CPU: 1 PID: 0 at kernel/time/tick-sched.c:982
> >> tick_nohz_idle_enter+0x99/0xb0
> >> [ 106.552964] Modules linked in:
> >> [ 106.553299] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W
> >>
> >> We need tracepoints to run as late as possible. This commit fixes the
> >> issue by removing the cpu_online check in tracepoint code that was
> >> introduced in the mentioned commit, however we now switch to using SRCU
> >> for all tracepoints and special handle calling tracepoints from NMI so
> >> that we don't run into issues that result from using sched-RCU when the
> >> CPUs are marked to be offline.
> >>
> >> Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and
> >> unify their usage")
> >> Reported-by: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
> >> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> >
> >
> > The above change log doesn't look like it matches the NMI patch.
> >
> > Can you resend with just the NMI changes? I already handled the cpu
> > offline ones.
>
> Ok, sent with "cpu offline" changes dropped, here it is:
> https://lore.kernel.org/patchwork/patch/972657/
>
> If you could add your Reported-by to it, that would be great as well.
>
> >
> > But I may still have concerns with this patch.
>
> Ok let me know what you think.
>

Not sure you saw this part of my reply:

> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void);
> } while ((++it_func_ptr)->func); \
> } \
> \
> - if (rcuidle) \
> - srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
> + srcu_read_unlock_notrace(ss, idx); \

Hmm, why do we have the two different srcu handles?

Thinking about this, if this can be called by the "thunk" code, then
there might be an issue. I think the "thunk" code can be called before
in_nmi() is set, so we don't know if we are in an NMI or not. I need to
look at that code to make sure. If in_nmi() still works, then we should
use the _nmi srcu handle (if that's required).

But I'm not sure using SRCU for all tracepoints is needed at this
moment. I'll have to look deeper into this tomorrow. But it's getting
close to the merge window, and this needs to be settled quick. Another
"partial revert" may be needed until this gets settled.


-- Steve