Re: [RFC] perf/sdt: Directly record SDT event with 'perf record'

From: Ravi Bangoria
Date: Mon Feb 20 2017 - 03:22:00 EST




On Monday 20 February 2017 12:38 PM, Ingo Molnar wrote:
> * Ravi Bangoria <ravi.bangoria@xxxxxxxxxxxxxxxxxx> wrote:
>
>> All events from 'perf list', except SDT events, can be directly recorded
>> with 'perf record'. But, the flow is little different for SDT events.
>> Probe point for SDT event needs to be created using 'perf probe' before
>> recording it using 'perf record'.
>>
>> As suggested by Ingo[1], it's better to make this process simple by
>> creating probe points automatically with 'perf record' for SDT events.
>>
>> This patch disables 'perf probe' on SDT events to simplify usage. It
>> enables recording SDT event only with 'perf record'.
>>
>> This removes all those 'multiple events with same name' issues by not
>> allowing manual probe creation to user. When there are multiple events
>> with same name, 'perf record' will record all of them (in line with
>> other tools supporting SDT (systemtap)).
>>
>> I know 'perf probe' for SDT events has already became interface and
>> people are using it. But, doing this change will make user interface very
>> easy and also it will make tool behaviour consistent. Also, it won't
>> require any changes in uprobe_events structure (suggested by Masami[2]).
> So I like the automatism you implemented for 'perf record', but why not keep the
> 'perf probe' flow as well, if people got used to it?
>
> It's not like computer software is bad at sorting apart and handling the two cases
> properly, right?

Thanks Ingo for the reply,

Yes, initially I thought about allowing both 'perf probe' and
'perf record' for SDT events. But there are a few complications with
that, especially when multiple SDT events with the same name exist.
For example:

$ readelf -n /usr/lib64/libpthread-2.24.so | grep -A2 Provider
Provider: libpthread
Name: mutex_entry
Location: 0x0000000000009ddb, ...
--
Provider: libpthread
Name: mutex_entry
Location: 0x000000000000bcbb, ...
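
To make the duplicate-name situation concrete, here is a rough sketch
(not perf code) that groups the note entries shown above by provider
and name and reports names with more than one location. The function
name group_sdt_locations and the SAMPLE text are illustrative only; the
parser assumes the simplified "Key: value" layout printed above, not the
full readelf format.

```python
from collections import defaultdict

# The readelf excerpt from this mail, kept verbatim (including the elided "...").
SAMPLE = """\
Provider: libpthread
Name: mutex_entry
Location: 0x0000000000009ddb, ...
--
Provider: libpthread
Name: mutex_entry
Location: 0x000000000000bcbb, ...
"""


def group_sdt_locations(text):
    """Map (provider, name) to the list of marker addresses found in text."""
    groups = defaultdict(list)
    provider = name = None
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # skip grep's "--" separators and blank lines
        value = value.strip()
        if key == "Provider":
            provider = value
        elif key == "Name":
            name = value
        elif key == "Location":
            # Only the address before the first comma is used here.
            groups[(provider, name)].append(int(value.split(",")[0], 16))
    return dict(groups)


groups = group_sdt_locations(SAMPLE)
# Names that map to more than one location are the problematic ones.
duplicates = {k: v for k, v in groups.items() if len(v) > 1}
```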

At record time, perf first has to check whether a matching entry with
that name exists in uprobe_events. If one is found, it records that
entry; if not, it looks in the probe cache. If multiple events with the
same name exist in the probe cache, it records all of them. For example:

If the probe point _is not_ created:
$ perf record -a -e sdt_libpthread:mutex_entry
/** Record both sdt_libpthread:mutex_entry **/

If the probe point _is_ created manually, only that particular event is
recorded:
$ perf probe -x /usr/lib64/libpthread-2.24.so sdt_libpthread:mutex_entry
Added new events:
sdt_libpthread:mutex_entry (on %mutex_entry in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_entry_1 (on %mutex_entry in /usr/lib64/libpthread-2.24.so)

$ perf record -a -e sdt_libpthread:mutex_entry
/** Record only first sdt_libpthread:mutex_entry **/

Here, the same command behaves differently in the two scenarios.
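
The two outcomes can be modelled with a small sketch. This is a
hypothetical model of the lookup order, not perf's actual
implementation: resolve_sdt_event and both dictionaries are made up for
illustration, and which address ends up under mutex_entry versus
mutex_entry_1 is an assumption.

```python
# Hypothetical model of the lookup order; not perf's real code.

def resolve_sdt_event(name, uprobe_events, probe_cache):
    """Return the probe locations 'perf record -e <name>' would record.

    uprobe_events: dict mapping an event name to its one existing probe.
    probe_cache:   dict mapping an event name to every cached location.
    """
    # Step 1: an existing entry in uprobe_events wins.
    if name in uprobe_events:
        return [uprobe_events[name]]
    # Step 2: otherwise fall back to the probe cache and take every match.
    return list(probe_cache.get(name, []))


cache = {"sdt_libpthread:mutex_entry": ["0x9ddb", "0xbcbb"]}

# No probe point created: both locations are recorded.
no_probe = resolve_sdt_event("sdt_libpthread:mutex_entry", {}, cache)

# Probe points created manually (address assignment is assumed):
# only the entry with the exact name is recorded.
created = {
    "sdt_libpthread:mutex_entry": "0x9ddb",
    "sdt_libpthread:mutex_entry_1": "0xbcbb",
}
with_probe = resolve_sdt_event("sdt_libpthread:mutex_entry", created, cache)
```

The same name resolves to two locations in the first call and to one in
the second, which is the inconsistency described above.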

Now consider a scenario where a probe point exists for only one of the
events:

$ perf probe -d sdt_libpthread:mutex_entry_1
$ perf probe --list
sdt_libpthread:mutex_entry (on pthread_mutex_lock+11 in /usr/lib64/libpthread-2.24.so)

And the user tries to record with:
$ perf record -a -e sdt_libpthread:*

What should the tool do here? Should it record only the one
'sdt_libpthread:mutex_entry' that exists in uprobe_events? Or should it
record all the SDT events from libpthread? We can choose either of the
two, but the behaviour is ambiguous in both cases.
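
As a rough illustration of the two readings of the wildcard, the sketch
below matches the pattern either against existing probes only or
against everything in the probe cache. Both policy functions and the
data are hypothetical; neither is claimed to be what perf does, and the
address kept in uprobe_events is an assumption.

```python
from fnmatch import fnmatchcase

# State after 'perf probe -d sdt_libpthread:mutex_entry_1' (address assumed).
uprobe_events = {"sdt_libpthread:mutex_entry": "0x9ddb"}
probe_cache = {"sdt_libpthread:mutex_entry": ["0x9ddb", "0xbcbb"]}


def match_existing_only(pattern, uprobe_events):
    """Policy A: the wildcard only sees probes already in uprobe_events."""
    return sorted(
        (name, loc)
        for name, loc in uprobe_events.items()
        if fnmatchcase(name, pattern)
    )


def match_all_cached(pattern, probe_cache):
    """Policy B: the wildcard expands to every cached SDT location."""
    return sorted(
        (name, loc)
        for name, locs in probe_cache.items()
        if fnmatchcase(name, pattern)
        for loc in locs
    )


pattern = "sdt_libpthread:*"
policy_a = match_existing_only(pattern, uprobe_events)
policy_b = match_all_cached(pattern, probe_cache)
```

With this data, policy A yields one location and policy B yields two,
so the result of the same command depends on which reading the tool
picks.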

Not allowing 'perf probe' for SDT events will solve all such issues.
It will also make the user interface simple and consistent. Other
current tooling (systemtap, for instance) also does not allow probing
individual markers when there are multiple markers with the same name.

-Ravi