[PATCH v4 12/27] sched,rcu,tracing: Avoid tracing before in_nmi() is correct

From: Peter Zijlstra
Date: Fri Feb 21 2020 - 08:51:30 EST


If we call into a tracer before in_nmi() becomes true, the tracer
cannot detect that it is being called from NMI context, and so cannot
behave correctly.

Therefore change nmi_{enter,exit}() to use __preempt_count_{add,sub}(),
because the normal preempt_count_{add,sub}() have a (desired) function
trace entry.

This fixes a potential issue with current code; AFAICT when the
function-tracer has stack-tracing enabled __trace_stack() will
malfunction when it hits the preempt_count_add() function entry from
NMI context.

Suggested-by: Steven Rostedt (VMware) <rostedt@xxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Reviewed-by: Steven Rostedt (VMware) <rostedt@xxxxxxxxxxx>
---
include/linux/hardirq.h | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
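
The ordering problem described in the changelog can be illustrated with a small user-space sketch. This is not kernel code: every sim_* name is hypothetical, the bit values are illustrative rather than the real NMI_OFFSET/NMI_MASK, and the "tracer" is only a counter that records whether in_nmi() was already true when it ran. It assumes that the function-trace hook fires on entry to the traced helper, before the count is updated.

```c
#include <stdbool.h>

/* Illustrative stand-ins for NMI_OFFSET / NMI_MASK (assumed values). */
#define SIM_NMI_OFFSET 0x100000u
#define SIM_NMI_MASK   0xf00000u

static unsigned int sim_preempt_count;
static int sim_tracer_calls;      /* how often the fake tracer ran */
static int sim_tracer_missed_nmi; /* ...while in_nmi() was still false */

/* Models in_nmi(): true once the NMI bits in the count are non-zero. */
static bool sim_in_nmi(void)
{
	return (sim_preempt_count & SIM_NMI_MASK) != 0;
}

/* Models a tracer that needs to know whether it runs in NMI context. */
static void sim_function_tracer(void)
{
	sim_tracer_calls++;
	if (!sim_in_nmi())
		sim_tracer_missed_nmi++;
}

/* Models __preempt_count_add(): updates the count, no trace entry. */
static void sim_raw_preempt_count_add(unsigned int val)
{
	sim_preempt_count += val;
}

/* Models preempt_count_add(): the trace entry fires before the update. */
static void sim_preempt_count_add(unsigned int val)
{
	sim_function_tracer();
	sim_raw_preempt_count_add(val);
}

static void sim_reset(void)
{
	sim_preempt_count = 0;
	sim_tracer_calls = 0;
	sim_tracer_missed_nmi = 0;
}

/* Old ordering: returns how often the tracer ran without seeing NMI. */
static int sim_nmi_enter_traced(void)
{
	sim_reset();
	sim_preempt_count_add(SIM_NMI_OFFSET);
	return sim_tracer_missed_nmi;
}

/* New ordering: the untraced helper gives the tracer no early entry. */
static int sim_nmi_enter_untraced(void)
{
	sim_reset();
	sim_raw_preempt_count_add(SIM_NMI_OFFSET);
	return sim_tracer_missed_nmi;
}
```

In this model the traced variant lets the tracer run once while sim_in_nmi() is still false, and the untraced variant never does. That is the window the patch is intended to close; the real function tracer and __trace_stack() are of course far more involved than this counter.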

--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -66,6 +66,15 @@ extern void irq_exit(void);
#endif

/*
+ * NMI vs Tracing
+ * --------------
+ *
+ * We must not land in a tracer until (or after) we've changed preempt_count
+ * such that in_nmi() becomes true. To that effect all NMI C entry points must
+ * be marked 'notrace' and call nmi_enter() as soon as possible.
+ */
+
+/*
* nmi_enter() can nest up to 15 times; see NMI_BITS.
*/
#define nmi_enter() \
@@ -75,7 +84,7 @@ extern void irq_exit(void);
lockdep_off(); \
ftrace_nmi_enter(); \
BUG_ON(in_nmi() == NMI_MASK); \
- preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET); \
+ __preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET); \
rcu_nmi_enter(); \
trace_hardirq_enter(); \
} while (0)
@@ -85,7 +94,7 @@ extern void irq_exit(void);
trace_hardirq_exit(); \
rcu_nmi_exit(); \
BUG_ON(!in_nmi()); \
- preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET); \
+ __preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET); \
ftrace_nmi_exit(); \
lockdep_on(); \
printk_nmi_exit(); \