[RFC][PATCH] tracing: Add trace_printk_ptr() to force non use of trace_bprintk()

From: Steven Rostedt
Date: Tue Jun 28 2016 - 19:19:36 EST


trace_printk() is a very helpful tool for debugging the kernel. It adds
lots of tricks to optimize itself to prevent any "heisenbugs". That is,
having the addition of tracing cause the bug to change its timing and
disappear. One of this tricks is to use trace_bprintk() when possible,
which just stores the format and the arguments into the ring buffer to
be processed later at the time of reading the trace output.

The issue with this is that there's some printf() fields that can do
redirection. There's a list of "%p*" values that will dereference the
pointer saved in the buffer. This is an issue with trace_printk()
because the pointer could have been freed between the time the
trace_printk() was called and the time the buffer is read. This will
cause a bad pointer dereference.

The preferable fix is most likely to change bprintk() to recognize
these pointers and instead of saving the pointer in the buffer to be
processed later, it could do the conversion and save the value in the
buffer. But this added processing kills the whole point of bprintk()
from being fast and not doing any processing during the recording.
Perhaps it should simply warn and/or refuse to print.

The simpler solution is to add an alternate trace_printk() that always
uses the non optimized version that does the string processing at the
time of the record, and saves just the string to the ring buffer.

There's been many times that I myself wanted this version. So here it
is.

Signed-off-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
---
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 94aa10ffe156..3223307718a8 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -623,6 +623,31 @@ do { \
trace_puts(fmt); \
} while (0)

+/**
+ * trace_printk_ptr - unoptimized version of trace_printk
+ * @fmt: the printf format for printing
+ *
+ * This is the same as trace_printk() above, but it does no optimization.
+ * It passes the @fmt and arguments as is to the __trace_printk() that
+ * will process the print at the time it is called, whereas __trace_printk(),
+ * will call the bprintk() version that will simply copy the format as
+ * is, along with arguments into the trace buffer, and it will created the
+ * printf() string and do the conversions on print.
+ *
+ * Use this function if any of the parameters are being derefenced.
+ * For example, ("mask=%pb\m", cpumask). It is possible that the cpumask
+ * may be freed between the time trace_printk() is called, and the time
+ * the ring buffer is read, and this can cause the function to fault.
+ * trace_printk_ptr() will do the processing to copy the cpumask into
+ * a string format that can be read any time later, regardless of the
+ * status of the cpumask variable.
+ */
+
+#define trace_printk_ptr(fmt, ...) \
+do { \
+ __trace_printk(_THIS_IP_, fmt, ##__VA_ARGS__); \
+} while (0)
+
#define do_trace_printk(fmt, args...) \
do { \
static const char *trace_printk_fmt __used \