Re: [RFC] perf/sdt: Directly record SDT event with 'perf record'

From: Ingo Molnar
Date: Mon Feb 20 2017 - 03:51:51 EST



* Ravi Bangoria <ravi.bangoria@xxxxxxxxxxxxxxxxxx> wrote:

> Yes, initially I thought about allowing both 'perf probe' and
> 'perf record' for SDT events. But there are a few complications with
> it, esp. when multiple SDT events with the same name exist. For example,
>
> $ readelf -n /usr/lib64/libpthread-2.24.so | grep -A2 Provider
> Provider: libpthread
> Name: mutex_entry
> Location: 0x0000000000009ddb, ...
> --
> Provider: libpthread
> Name: mutex_entry
> Location: 0x000000000000bcbb, ...
>
> At record time, perf first has to check whether a matching entry
> with that name exists in uprobe_events. If found, record it; if not,
> look in the probe cache. If multiple events with the same name exist
> in the probe cache, record all of them. Like,
>
> If probe point _is not_ created,
> $ perf record -a -e sdt_libpthread:mutex_entry
> /** Record both sdt_libpthread:mutex_entry **/
>
> If probe point _is_ created manually, record that particular event,
> $ perf probe -x /usr/lib64/libpthread-2.24.so sdt_libpthread:mutex_entry
> Added new events:
> sdt_libpthread:mutex_entry (on %mutex_entry in /usr/lib64/libpthread-2.24.so)
> sdt_libpthread:mutex_entry_1 (on %mutex_entry in /usr/lib64/libpthread-2.24.so)
>
> $ perf record -a -e sdt_libpthread:mutex_entry
> /** Record only first sdt_libpthread:mutex_entry **/
>
> Here, the same command behaves differently in different scenarios.
>
> Now consider a scenario where a probe point exists for only one of the events:
>
> $ perf probe -d sdt_libpthread:mutex_entry_1
> $ perf probe --list
> sdt_libpthread:mutex_entry (on pthread_mutex_lock+11 in /usr/lib64/libpthread-2.24.so)
>
> And the user tries to record it with,
> $ perf record -a -e sdt_libpthread:*
>
> What should be the behavior of the tool? Should it record only the
> one 'sdt_libpthread:mutex_entry' that exists in uprobe_events? Or
> should it record all the SDT events from libpthread? We can choose
> either of the two, but both cases are ambiguous.

They are not really ambiguous if coded right: just pick one of the outcomes, and
maybe print a warning to inform the user that something odd is going on because
not all markers are enabled.

As a user I'd expect 'perf record' to enable all markers and print a warning that
the markers were in a partial state. This would result in consistent behaviour.
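
A rough sketch of the policy I have in mind, in Python purely for illustration.
This is not perf code: plan_record(), probe_cache and uprobe_events are made-up
names, and the data layout (event name mapped to a list of marker addresses) is
an assumption, not how perf actually stores things.

```python
from fnmatch import fnmatchcase


def plan_record(pattern, probe_cache, uprobe_events):
    """Decide what to record for an SDT event pattern.

    probe_cache   -- hypothetical dict: event name -> list of marker addresses
    uprobe_events -- hypothetical set of (event name, address) pairs that
                     already exist as probe points

    Returns (to_record, to_create, warning). Every matching marker is
    recorded; only the ones missing from uprobe_events need creating.
    """
    to_record, to_create = [], []
    for name in sorted(probe_cache):
        if not fnmatchcase(name, pattern):
            continue
        for addr in probe_cache[name]:
            to_record.append((name, addr))
            if (name, addr) not in uprobe_events:
                to_create.append((name, addr))

    # Partial state: some, but not all, matching markers already existed.
    warning = None
    if to_create and len(to_create) < len(to_record):
        warning = ("warning: %d of %d matching markers were not enabled; "
                   "enabling all of them" % (len(to_create), len(to_record)))
    return to_record, to_create, warning


if __name__ == "__main__":
    # The two libpthread markers from the readelf output quoted above.
    cache = {"sdt_libpthread:mutex_entry": [0x9ddb, 0xbcbb]}
    # Only the first one exists as a probe point (the partial state).
    existing = {("sdt_libpthread:mutex_entry", 0x9ddb)}
    print(plan_record("sdt_libpthread:*", cache, existing))
```

With this policy the exact name and the 'sdt_libpthread:*' wildcard both select
the same two markers, regardless of what happens to be in uprobe_events; the
existing state only decides whether a warning is printed.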

Does it make sense to only enable some of the markers that alias on the same name?
If not then maybe disallow that in perf probe - or change perf probe to do the
same thing as perf record.

I.e. this is IMHO an artificial problem that users should not be exposed to and
which can be solved by tooling.

In particular, if it's possible to enable only a part of the markers, then perf
record not continuing would be a failure mode: if, for example, a previous perf
record session segfaulted (or ran out of RAM, or was killed at the wrong moment,
or whatever), then it would not be possible to (easily) clean up the mess.

> Not allowing 'perf probe' for SDT events will solve all such issues.
> It will also make the user interface simple and consistent. Other current
> tooling (systemtap, for instance) also does not allow probing individual
> markers when there are multiple markers with the same name.

In any case if others agree with your change in UI flow too then it's fine by me,
but please make it robust, i.e. if perf record sees partially enabled probes it
should still continue.

Thanks,

Ingo