Re: [PATCH -printk] printk, tracing: fix console tracepoint
From: Paul E. McKenney
Date: Tue Jul 12 2022 - 07:50:03 EST
On Mon, Jul 11, 2022 at 07:57:01PM -0700, Paul E. McKenney wrote:
> On Mon, Jul 11, 2022 at 08:53:19PM -0400, Steven Rostedt wrote:
> > On Mon, 11 Jul 2022 17:21:28 -0700
> > "Paul E. McKenney" <paulmck@xxxxxxxxxx> wrote:
> > > On x86, both srcu_read_lock() and srcu_read_unlock() should be OK from
> > > NMI context, give or take their use of lockdep. Which is why we have
> > > srcu_read_lock_notrace() and srcu_read_unlock_notrace(), which do not
> > > use lockdep. Which __DO_TRACE() does in fact invoke. Ah, but you have
> > > this: "WARN_ON_ONCE(rcuidle && in_nmi())".
> > >
> > > Because all the world is not an x86.
> > But since NMIs are architecture specific, we could change that to:
> > WARN_ON_ONCE(!srcu_nmi_safe && rcuidle && in_nmi());
> > and add a srcu_nmi_safe constant or macro that is 1 on architectures that
> > srcu is safe in NMI and 0 otherwise.
> > Or do we care if a tracepoint happens in those architectures where it is
> > not safe. We could then just do:
> > if (!srcu_nmi_safe && rcuidle && in_nmi())
> > return;
> > and just skip tracepoints that are marked rcu_idle and happen within NMI.
> More generally, this is this_cpu_nmi_safe rather than just SRCU. Or could
> be, depending on what the architecture guys would like to guarantee.
> SRCU is just passing through the this_cpu*() NMI-safety property.
> And in addition to in_nmi(), there is the HAVE_NMI Kconfig option:
> $ git grep -w HAVE_NMI
> arch/Kconfig:config HAVE_NMI
> arch/Kconfig: depends on HAVE_NMI
> arch/arm/Kconfig: select HAVE_NMI
> arch/arm64/Kconfig: select HAVE_NMI
> arch/loongarch/Kconfig: select HAVE_NMI
> arch/mips/Kconfig: select HAVE_NMI
> arch/powerpc/Kconfig: select HAVE_NMI if PERF_EVENTS || (PPC64 && PPC_BOOK3S)
> arch/s390/Kconfig: select HAVE_NMI
> arch/sh/Kconfig: select HAVE_NMI
> arch/sparc/Kconfig: select HAVE_NMI
> arch/x86/Kconfig: select HAVE_NMI
> So if I understand correctly, arm, loongarch, mips, powerpc, sh, and
> sparc are the ones that have NMIs but don't have NMI-safe this_cpu*()
> operations. These are the ones that would need !srcu_nmi_safe.
> Or, longer term, I could make HAVE_NMI && !srcu_safe_nmi cause SRCU to use
> explicit atomics for srcu_read_lock_trace() and srcu_read_unlock_trace().
> I am assuming that any NMI handler executing srcu_read_lock_trace()
> also executes the matching srcu_read_unlock_trace(). (Silly me, I know!)
> This assumption means that srcu_read_lock() and srcu_read_unlock() can
> continue with their non-atomic this_cpu_inc() ways.
> But a quick fix that stopped the bleeding and allowed printk() to
> progress would be useful in the short term, regardless of whether or
> not in the longer term it makes sense to make srcu_read_lock_trace()
> and srcu_read_unlock_trace() NMI-safe.
Except that doesn't rcuidle && in_nmi() imply a misplaced trace event?
Isn't it still the case that you are not supposed to have trace events
in NMI handlers before RCU is watching or after it is no longer watching,
just as for entry/exit code in general? Once in the body of the handler,
rcuidle should be false and all should be well.
Or am I missing something here?