Re: [RFC] adding of TGID to ftrace output

From: Joel Fernandes
Date: Wed May 10 2017 - 23:10:20 EST


Hi Steven,
Thanks for your quick reply.

On Wed, May 10, 2017 at 6:28 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> On Wed, 10 May 2017 16:04:55 -0700
> Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
>
>> Hi Steven,
>>
>> Can we add TGID information along with PID to ftrace output?
>>
>> Something like:
>> # _-----=> irqs-off
>> # / _----=> need-resched
>> # | / _---=> hardirq/softirq
>> # || / _--=> preempt-depth
>> # ||| / delay
>> # TASK-PID:TGID CPU# |||| TIMESTAMP FUNCTION
>> # | | | | |||| | |
>> bash-1977:1977 [000] .... 17284.993652: sys_close <-
>>
>> Instead of:
>> # _-----=> irqs-off
>> # / _----=> need-resched
>> # | / _---=> hardirq/softirq
>> # || / _--=> preempt-depth
>> # ||| / delay
>> # TASK-PID CPU# |||| TIMESTAMP FUNCTION
>> # | | | |||| | |
>> bash-1977 [000] .... 17284.993652: sys_close <-
>
> Well the thing about this is that it adds more overhead to each event
> that is mostly not needed by users.

There will not be any overhead if we can retrieve the tgid at print
time and not trace time. I was thinking something like
trace_find_cmdline approach would work great. The only draw back of
something like this is if a thread is added and removed to a thread
group during a trace, then we might not have all tgid information
available. I think this is the draw back of the trace_find_cmdline
code as well where if a thread doesn't exist at trace output time,
then we end up printing "<...>" as the task comm? At the end of this
email, I make a suggestion to fix this.

>> Currently in android, we inject tgid into each trace event at the end
>> of the trace just so we can group threads under a process in our trace
>> viewer.
>>
>> The other option we're thinking off is we can retrieve tgid during the
>> trace output (reading trace file from debugfs/tracefs) from the
>> task_struct and then have ftrace output it that way.
>
> Hmm, is there any events that show the TGID currently? If we have that,
> you can use another tracing instance to read from (one that wont get
> overridden by other events), and keep track of what TGID processes
> have. The timestamp could be used to keep everything in sync.

There is a sched_process_fork event that can show ppid but not tgid. I
was thinking we modify that to show tgid as well, that could work.
However the draw back doing this is we can only see when something is
added or removed to the thread group during tracing. If something was
already a part of the thread group _before_ tracing, then we don't
have the information.

>> Anyway, having this will really simplify things and make it more
>> reliable (we run ps -T at the end of the trace right now, but its
>> unreliable because threads might have exited by then). We also have to
>> call getpid() and inject pid into each userspace trace_marker write
>> just so we can do this grouping.
>
> Perhaps we need to have a way to record it via another event. Fork or
> sched switch?

Same reasoning above, that separate events are not enough to have this
information as such events may not fire for the duration of the
tracing for the threads we're interested in.

> Perhaps we can add a trigger to record extra data like this, similar to
> the stack trace trigger does now.

Are you talking about 1 event trigger another event to have more
information? That would have the side-effect of having a separate
event for each event that we need the tgid information for, thus
causing lots of space wastage in the right buffer, right?

To fix the suggestion I made in the very beginning of this email, I
suggest a combination of trace_find_cmdline type of approach to save
the tgid of all current threads available, and then using
trace_sched_fork to find out what we missed (such as threads that were
added and removed.) I think this approach will be robust and not
affect any of the existing ftrace users (since we can make it a print
option). Thoughts?

Update: actually I find that someone already wrote an out-of-tree
patch for Android long time back during 3.10 kernel days that added a
'print-tgid' option that saved tgid in the trace_save_cmdline function
and then find it later during output time:
https://android.googlesource.com/kernel/msm.git/+/0c5363f6a0a89952b0d0072ff4e4c3bd042add9a%5E%21/
It will be nice if we can have a more upstream ready version of this
patch or some other approach that you think works best. Thanks!

Regards,
Joel