BUG: perf script python SIGSEGV

From: Huttunen, Janne (Nokia - FI/Espoo)
Date: Wed Feb 05 2020 - 06:18:40 EST



It appears that the perf command crashes when processing
recording with a Python script and the first recorded event
in happens to be "context_tracking:user_enter" or
"context_tracking:user_exit".


$ perf script -g Python
generated Python script: perf-script.py
$ perf script -s perf-script.py -i perf.data
in trace_begin
No kallsyms or vmlinux with build-id 0cd1e61e3703521ee4de14a1f13aac17b762fe57 was found
Segmentation fault (core dumped)


#0 0x00007ffff58f46fa in strlen () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007ffff5535360 in PyString_FromString ()
from /lib64/libpython2.7.so.1.0
No symbol table info available.
#2 0x000000000059d52f in define_value (
field_type=field_type@entry=TEP_PRINT_FLAGS,
ev_name=ev_name@entry=0xd73320 <handler_name> "context_tracking__user_enter", field_name=0x0, field_value=field_value@entry=0x6ed556 "0",
field_str=0xd88ac0 "") at util/scripting-engines/trace-event-python.c:221
handler_name = <optimized out>
t = 0x7ffff7e849f0
value = 0
n = 1
#3 0x000000000059e5bd in define_event_symbols (event=event@entry=0xd7f890,
ev_name=ev_name@entry=0xd73320 <handler_name> "context_tracking__user_enter", args=<optimized out>) at util/scripting-engines/trace-event-python.c:282
No locals.
#4 0x000000000059f741 in python_process_tracepoint (al=0x7fffffffba20,
evsel=0xd88230, sample=0x7fffffffbce0)
at util/scripting-engines/trace-event-python.c:821
comm = 0x1040440 ":3682671"
handler_name = "context_tracking__user_enter", '\000' <repeats 227 times>
event = 0xd7f890
handler = <optimized out>
field = <optimized out>
ns = <optimized out>
n = 0
context = <optimized out>
obj = 0x0
callchain = <optimized out>
dict = 0x0
s = <optimized out>
pid = 3682671
data = 0x7fffed1442dc
default_handler_name = 0x6f5332 "trace_unhandled"
t = 0x4c6b8b <machine__resolve+395>
all_entries_dict = 0x0
cpu = 16
nsecs = 589187356866468
#5 python_process_event (event=<optimized out>, sample=0x7fffffffbce0,
evsel=0xd88230, al=0x7fffffffba20)
at util/scripting-engines/trace-event-python.c:1321
tables = 0xd73520 <tables_global>


The crash seems to happen because those events define the print
format like this:

print fmt: "%s", ""

As far as I can see, this generates only a single TEP_PRINT_ATOM
arg. When processing the event, define_value() tries to convert
cur_field_name to a python string. Since it hasn't seen any
TEP_PRINT_FIELD args, the cur_field_name is still NULL which
triggers the SIGSEGV.

A trivial workaround for the crash would be something like this:

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 80ca5d0..8fd03d5 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -208,6 +208,9 @@ static void define_value(enum tep_print_arg_type field_type,
unsigned long long value;
unsigned n = 0;

+ if (field_name == NULL)
+ return;
+
if (field_type == TEP_PRINT_SYMBOL)
handler_name = "define_symbolic_value";


On the other hand, while perf doesn't crash anymore with that
patch, in a more complex case it would likely still do
something that would not make much sense. As far as I can see,
some old value could easily be left in the file scoped
cur_field_name variable. A more correct fix would require
a bit better understanding of the intended semantics, like
e.g. what exactly is the apparently quite unused
'zero_flag_atom' variable in there supposed to achieve?