Re: [PATCH 1/2] tracing: Add a trace for task_exit

From: Peter.Enderborg
Date: Tue May 04 2021 - 04:02:01 EST


On 5/3/21 10:55 PM, Steven Rostedt wrote:
> On Mon, 03 May 2021 14:02:48 -0500
> ebiederm@xxxxxxxxxxxx (Eric W. Biederman) wrote:
>
>>> However current traces is template based, and I assume it wont be
>>> popular to add new fields to the template, and exit reasons is not
>>> right for the other template use cases.
> trace events can always add fields, it's removing them that can cause
> problems (but even then, it's not that bad). The new libtracefs and
> libtraceevent make it trivial for tools to get the fields from trace events
> when needed.
>
>>> I still see a "new" task moving it to do_exit make trace name more
>>> correct?  Or is trace_task_do_exit better?
> It is also trivial with the libraries to write a tool that can put together
> everything you want. We even are working on python bindings to connect to
> these libraries where you could write a python script to do this.

The bpftrace package has a 163 MB install size, and that is on
a system that already has Python. Linux is very widely used
on embedded systems, where even having a shell is a luxury.


Trivial?

Concept

An eBPF program hooks into tracepoints A and B and collects data.

A happens before B, and the collected data is sent when B happens.

1. A is called.

2. C is created, C is destroyed.

3. B is called. How do I fetch C?
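
To make the sequence concrete, here is a toy userspace model in Python. It is
not an eBPF program, and it is only one reading of the scenario: it assumes C
is an object whose whole lifetime falls between A and B. The handler names and
the two dicts (standing in for kernel state and a BPF hash map) are
hypothetical.

```python
# Toy userspace model of the A/C/B ordering; not an eBPF program.
# All names here are hypothetical and exist only for illustration.

live_objects = {}  # stands in for kernel state a probe could read
pending = {}       # stands in for a BPF hash map keyed by pid


def on_a(pid):
    """Tracepoint A: start collecting data for this pid."""
    pending[pid] = {"seen_a": True}


def on_c_created(pid, value):
    """C comes into existence; no probe is attached here."""
    live_objects[pid] = value


def on_c_destroyed(pid):
    """C goes away again; no probe is attached here either."""
    live_objects.pop(pid, None)


def on_b(pid):
    """Tracepoint B: send what was collected, plus C if it still exists."""
    record = pending.pop(pid, {})
    record["c"] = live_objects.get(pid)  # None when C is already gone
    return record


on_a(42)
on_c_created(42, "exit details")
on_c_destroyed(42)
result = on_b(42)
print(result)
```

In this model the record sent at B has nothing for C, because neither hook
ran while C existed.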

However, I can write an eBPF program that hooks into sched_process_free and
sched_process_exit and uses the uapi version of bpf_get_current_task to pick
up oom_score_adj and exit_code. But the task definition depends on 71 ifdefs,
not counting members that are pointers to objects that might have build
dependencies of their own, and some are there more than once.
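
The layout problem can be shown with a small Python ctypes sketch. The two
structures below are hypothetical, heavily simplified stand-ins for a
task_struct built with and without one config option. They are not the real
kernel layout, and the member names `state` and `extra` are made up for the
example.

```python
import ctypes


# Hypothetical layout when the config option is NOT set.
class TaskWithoutOption(ctypes.Structure):
    _fields_ = [
        ("state", ctypes.c_uint64),
        ("exit_code", ctypes.c_int),
    ]


# Hypothetical layout when the config option IS set: one extra member
# sits in front of exit_code, as an #ifdef'd member would.
class TaskWithOption(ctypes.Structure):
    _fields_ = [
        ("state", ctypes.c_uint64),
        ("extra", ctypes.c_uint64),
        ("exit_code", ctypes.c_int),
    ]


# Byte offset of exit_code in each build.
off_without = TaskWithoutOption.exit_code.offset
off_with = TaskWithOption.exit_code.offset
print(off_without, off_with)
```

A reader that hardcodes one of these offsets reads the wrong bytes on the
other build, and that is with a single conditional member.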

I think a kprobe will cause the same problem. It would not be that big a deal
if it were for kernel debugging. But this is for userspace and should not
depend on kernel internals.


> There is no need for a new tracepoint, especially if it makes it harder to
> improve the implementation of what is being traced.
It does not introduce any complex functionality, and with another
mechanism I still believe you would need to reap the task somewhere.
I guess a new exit status flag will need to be added,
but that is the case with or without a new tracepoint.

The Python libs that use this fetch the first item in the task_struct
and assume that it is thread_info. What could possibly go wrong?

Is there a runtime linker in eBPF that resolves this by magic?

>
>> I really can't say, as I don't know much of anything about the tracing
>> infrastructure. I would assume in most cases with a tracepoint in place
>> other kinds of tracing (like bpf programs) could come into play and read
>> out pieces of information that are not commonly wanted.
>>
>> All I really know something about is the exit code path, as I keep
>> slowly trying to clean it up. I plan on ignoring any tracepoint that
>> makes that gets in the way.
> As you should.
>
> -- Steve