[PATCH v3 09/22] sched,rcu,tracing: Avoid tracing before in_nmi() is correct

From: Peter Zijlstra
Date: Wed Feb 19 2020 - 10:14:20 EST


If we call into a tracer before in_nmi() becomes true, the tracer can
no longer detect it is called from NMI context and behave correctly.

Therefore change nmi_{enter,exit}() to use __preempt_count_{add,sub}()
as the normal preempt_count_{add,sub}() have a (desired) function
trace entry.

This fixes a potential issue with current code; AFAICT when the
function-tracer has stack-tracing enabled __trace_stack() will
malfunction when it hits the preempt_count_add() function entry from
NMI context.

Suggested-by: Steven Rostedt (VMware) <rosted@xxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
include/linux/hardirq.h | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -65,6 +65,15 @@ extern void irq_exit(void);
#define arch_nmi_exit() do { } while (0)
#endif

+/*
+ * NMI vs Tracing
+ * --------------
+ *
+ * We must not land in a tracer until (or after) we've changed preempt_count
+ * such that in_nmi() becomes true. To that effect all NMI C entry points must
+ * be marked 'notrace' and call nmi_enter() as soon as possible.
+ */
+
#define nmi_enter() \
do { \
arch_nmi_enter(); \
@@ -72,7 +81,7 @@ extern void irq_exit(void);
lockdep_off(); \
ftrace_nmi_enter(); \
BUG_ON(in_nmi() == NMI_MASK); \
- preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET); \
+ __preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET); \
rcu_nmi_enter(); \
trace_hardirq_enter(); \
} while (0)
@@ -82,7 +91,7 @@ extern void irq_exit(void);
trace_hardirq_exit(); \
rcu_nmi_exit(); \
BUG_ON(!in_nmi()); \
- preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET); \
+ __preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET); \
ftrace_nmi_exit(); \
lockdep_on(); \
printk_nmi_exit(); \