Help with trace-cmd/ftrace recording process ID information
From: Will Hawkins
Date: Mon Jul 17 2017 - 15:18:25 EST
Hello everyone, especially Mr. Rostedt,
I have had great success using ftrace to debug performance issues on
Linux systems. The combination of ftrace and trace-cmd is absolutely
amazing for digging into exactly what is going on in a system and
where performance problems exist.
I recently switched to a different host and attempted to run trace-cmd
record to get a record of page faults:
/path/to//trace-cmd/trace-cmd record -e page_fault_user /bin/ls
When I "report" on that trace, I get entries like the following:
<...>-41850 [010] 27484983.185200: page_fault_user:
address=__per_cpu_end ip=__per_cpu_end error_code=0x14
It's exactly what I want. However, it does not list the process that
generated that fault. Instead, it uses <...>. I dug into the trace-cmd
code and can see where and why that placeholder is generated.
What I don't understand is why on a different system, when I run the
same record command, I get the following output:
ls-19887 [005] 2438162.263793: page_fault_user:
address=__per_cpu_end ip=__per_cpu_end error_code=0x14
Again, it's exactly what I want and it lists the process name that
generated the fault.
From the code, I see that the <...> is printed instead of the name of
the process when the pid is not in the pevent's command lines. What I
can't seem to figure out is why the process would be in that list on one
host and not on the other.
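
To make sure I am describing the behavior correctly, here is a minimal
Python sketch of the lookup as I understand it. It is not trace-cmd's
actual code: the names comm_for_pid and format_task and the dict-based
table are my own assumptions for illustration only.

```python
# Hypothetical sketch of the fallback described above; NOT trace-cmd's code.
# "cmdlines" stands in for the pid -> command-name table kept by the pevent.

def comm_for_pid(cmdlines, pid):
    """Return the recorded command name for pid, or "<...>" if none exists."""
    return cmdlines.get(pid, "<...>")


def format_task(cmdlines, pid):
    """Build the "comm-pid" prefix that starts each line of the report."""
    return f"{comm_for_pid(cmdlines, pid)}-{pid}"


if __name__ == "__main__":
    good_host = {19887: "ls"}  # pid is present in the table
    bad_host = {}              # table is empty, as it seems to be on my new host
    print(format_task(good_host, 19887))
    print(format_task(bad_host, 41850))
```

With an empty table, every pid falls through to the placeholder, which
matches what I see on the "bad" host.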
When I looked at the trace.dat file directly, I noticed that on the
"good" host, there is a list of pids and names. On the "bad" host,
there is no such list in the trace.dat file. I am sure that is the
reason for the <...>s being printed, but I can't figure out why that
list is not making it into the trace.dat file.
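
For illustration, this is roughly how I picture that list turning into
the lookup table. It is only a sketch: I am assuming one "pid name"
pair per line of text, which is how the list looked to me, and
parse_cmdlines is a made-up name rather than anything in trace-cmd.

```python
# Hypothetical sketch; the one-"pid name"-pair-per-line layout is an
# assumption based on what I saw, not a statement of the trace.dat format.

def parse_cmdlines(text):
    """Parse lines of the form "<pid> <name>" into a {pid: name} dict."""
    table = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) != 2 or not parts[0].isdigit():
            continue                 # ignore blank or malformed lines
        table[int(parts[0])] = parts[1].strip()
    return table


if __name__ == "__main__":
    print(parse_cmdlines("19887 ls\n1 systemd\n"))
    print(parse_cmdlines(""))  # no list at all gives an empty table
```

An empty or missing list yields an empty table, so no pid could ever be
resolved to a name.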
I took a quick look for where that pid/comm list is generated and
written to the trace.dat file, but couldn't find anything.
I figured that I would send an email before I dug any further in case
someone has seen this already. I am happy to pass along other pertinent
information if it is helpful to debug the problem. I just don't want to
spam the list with information that is irrelevant.
Again, the combination of ftrace/trace-cmd is borderline magic. I
appreciate all the work that has gone into it!
Thanks in advance for helping me sort through this issue!
Will