[PATCH v10 00/15] perf: support enable and disable commands in stat and record modes

From: Alexey Budankov
Date: Wed Jul 08 2020 - 03:38:38 EST



Changes in v10:
- reused struct priv, renamed props into flags and used unsigned int type as a storage
- implemented and adopted usage of evlist__ctlfd_initialized() function
- updated while condition with checking !(done || stop)
- implemented compute_tts() to calculate the remainder of timeout to wait in poll()

v9: https://lore.kernel.org/lkml/a4d5db4a-f25c-38dc-1c41-321a886cb122@xxxxxxxxxxxxxxx/

Changes in v9:
- avoided screwing of fds by fdarray__filter() to fix staleness of pos returned by fdarray__add()
- implemented notion of properties for fds kept by fdarray object and introduced nonfilterable
property for control fd
- avoided counting of nonfilterable fds by fdarray__filter()
- replaced strlen() with sizeof(str)-1 for CMD_TAGS
- avoided mixing timeout and interval processing in one function
- renamed child to child_exited variable
- processed time_to_sleep < time_diff condition
- placed evlist__ctlfd_process() call after evlist__poll() and evlist__filter_pollfd()
calls in record mode

v8: https://lore.kernel.org/lkml/0781a077-aa82-5b4a-273e-c17372a72b93@xxxxxxxxxxxxxxx/

Changes in v8:
- avoided moving of fds at fdarray__filter() call
- skipped counting of fds with zeroed revents at fdarray__filter() call
- converted explicit --ctl-fd[-ack] into --control fd:ctl-fd[,ack-fd option
- updated docs to accommodate --control fd:ctl-fd[,ack-fd] option

v7: https://lore.kernel.org/lkml/5de4b954-24f0-1e8d-5a0d-7b12783b8218@xxxxxxxxxxxxxxx/

Changes in v7:
- added missing perf-record.txt changes
- adjusted docs wording for --ctl-fd,ctl-fd-ack options
to additionally mention --delay=-1 effect

v6: https://lore.kernel.org/lkml/f8e3a714-d9b1-4647-e1d2-9981cbaa83ec@xxxxxxxxxxxxxxx/

Changes in v6:
- split re-factoring of events handling loops for stat mode
into smaller incremental parts
- added parts missing at v5
- corrected v5 runtime issues

v5: https://lore.kernel.org/lkml/e5cac8dd-7aa4-ec7c-671c-07756907acba@xxxxxxxxxxxxxxx/

Changes in v5:
- split re-factoring of events handling loops for stat mode
into smaller incremental parts

v4: https://lore.kernel.org/lkml/653fe5f3-c986-a841-1ed8-0a7d2fa24c00@xxxxxxxxxxxxxxx/

Changes in v4:
- made checking of ctlfd state unconditional in record trace streaming loop
- introduced static poll fds to keep evlist__filter_pollfd() unaffected
- handled ret code of evlist__initialize_ctlfd() where need
- renamed and structured handle_events() function
- applied anonymous structs where needed

v3: https://lore.kernel.org/lkml/eb38e9e5-754f-d410-1d9b-e26b702d51b7@xxxxxxxxxxxxxxx/

Changes in v3:
- renamed functions and types from perf_evlist_ to evlist_ to avoid
clash with libperf code;
- extended commands to be strings of variable length consisting of
command name and also possibly including command specific data;
- merged docs update with the code changes;
- updated docs for -D,--delay=-1 option for stat and record modes;

v2: https://lore.kernel.org/lkml/d582cc3d-2302-c7e2-70d3-bc7ab6f628c3@xxxxxxxxxxxxxxx/

Changes in v2:
- renamed resume and pause commands to enable and disable ones, renamed
CTL_CMD_RESUME and CTL_CMD_PAUSE to CTL_CMD_ENABLE and CTL_CMD_DISABLE
to fit to the appropriate ioctls and avoid mixing up with PAUSE_OUTPUT
ioctl;
- factored out event handling loop into a handle_events() for stat mode;
- separated -D,--delay=-1 into separate patches for stat and record modes;

v1: https://lore.kernel.org/lkml/825a5132-b58d-c0b6-b050-5a6040386ec7@xxxxxxxxxxxxxxx/

repo: tip of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core

The patch set implements handling of 'start disabled', 'enable' and 'disable'
external control commands which can be provided for stat and record modes
of the tool from an external controlling process. 'start disabled' command
can be used to postpone enabling of events in the beginning of a monitoring
session. 'enable' and 'disable' commands can be used to enable and disable
events correspondingly any time after the start of the session.

The 'start disabled', 'enable' and 'disable' external control commands can be
used to focus measurement on specially selected time intervals of workload
execution. Focused measurement reduces tool intrusion and influence on
workload behavior, reduces distortion and amount of collected and stored
data, mitigates data accuracy loss because measurement and data capturing
happen only during intervals of interest.

A controlling process can be a bash shell script [1], native executable or
any other language program that can directly work with file descriptors,
e.g. pipes [2], and spawn a process, specially the tool one.

-D,--delay <val> option is extended with -1 value to skip events enabling
in the beginning of a monitoring session ('start disabled' command).
--control fd:ctl-fd[,ack-fd] command line option is introduced to provide the
tool with a pair of file descriptors to listen to control commands and reply
to the controlling process on the completion of received commands.

The tool reads control command message from ctl-fd descriptor, handles the
command and optionally replies acknowledgement message to ack-fd descriptor,
if it is specified on the command line. 'enable' command is recognized as
'enable' string message and 'disable' command is recognized as 'disable'
string message both received from ctl-fd descriptor. Completion message is
'ack\n' and sent to ack-fd descriptor.

Example bash script demonstrating simple use case follows:

#!/bin/bash

ctl_dir=/tmp/

ctl_fifo=${ctl_dir}perf_ctl.fifo
test -p ${ctl_fifo} && unlink ${ctl_fifo}
mkfifo ${ctl_fifo} && exec {ctl_fd}<>${ctl_fifo}

ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
mkfifo ${ctl_ack_fifo} && exec {ctl_fd_ack}<>${ctl_ack_fifo}

perf stat -D -1 -e cpu-cycles -a -I 1000 \
--control fd:${ctl_fd},${ctl_fd_ack} \
-- sleep 40 &
perf_pid=$!

sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e2 && echo "enabled(${e2})"
sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d2 && echo "disabled(${d2})"

exec {ctl_fd_ack}>&- && unlink ${ctl_ack_fifo}
exec {ctl_fd}>&- && unlink ${ctl_fifo}

wait -n ${perf_pid}
exit $?


Script output:

[root@host dir] example
Events disabled
# time counts unit events
1.001101062 <not counted> cpu-cycles
2.002994944 <not counted> cpu-cycles
3.004864340 <not counted> cpu-cycles
4.006727177 <not counted> cpu-cycles
Events enabled
enabled(ack)
4.993808464 3,124,246 cpu-cycles
5.008597004 3,325,624 cpu-cycles
6.010387483 83,472,992 cpu-cycles
7.012266598 55,877,621 cpu-cycles
8.014175695 97,892,729 cpu-cycles
9.016056093 68,461,242 cpu-cycles
10.017937507 55,449,643 cpu-cycles
11.019830154 68,938,167 cpu-cycles
12.021719952 55,164,101 cpu-cycles
13.023627550 70,535,720 cpu-cycles
14.025580995 53,240,125 cpu-cycles
disabled(ack)
14.997518260 53,558,068 cpu-cycles
Events disabled
15.027216416 <not counted> cpu-cycles
16.029052729 <not counted> cpu-cycles
17.030904762 <not counted> cpu-cycles
18.032073424 <not counted> cpu-cycles
19.033805074 <not counted> cpu-cycles
Events enabled
enabled(ack)
20.001279097 3,021,022 cpu-cycles
20.035044381 6,434,367 cpu-cycles
21.036923813 89,358,251 cpu-cycles
22.038825169 72,516,351 cpu-cycles
# time counts unit events
23.040715596 55,046,157 cpu-cycles
24.042643757 78,128,649 cpu-cycles
25.044558535 61,052,428 cpu-cycles
26.046452785 62,142,806 cpu-cycles
27.048353021 74,477,971 cpu-cycles
28.050241286 61,001,623 cpu-cycles
29.052149961 61,653,502 cpu-cycles
disabled(ack)
30.004980264 82,729,640 cpu-cycles
Events disabled
30.053516176 <not counted> cpu-cycles
31.055348366 <not counted> cpu-cycles
32.057202097 <not counted> cpu-cycles
33.059040702 <not counted> cpu-cycles
34.060843288 <not counted> cpu-cycles
35.000888624 <not counted> cpu-cycles
[root@host dir]#

[1] http://man7.org/linux/man-pages/man1/bash.1.html
[2] http://man7.org/linux/man-pages/man2/pipe.2.html

---
Alexey Budankov (15):
tools/libperf: avoid internal moving of fdarray fds
tools/libperf: add flags to fdarray fds objects
tools/libperf: avoid counting of nonfilterable fdarray fds
perf evlist: introduce control file descriptors
perf evlist: implement control command handling functions
perf stat: factor out body of event handling loop for system wide
perf stat: move target check to loop control statement
perf stat: factor out body of event handling loop for fork case
perf stat: factor out event handling loop into dispatch_events()
perf stat: extend -D,--delay option with -1 value
perf stat: implement control commands handling
perf stat: introduce --control fd:ctl-fd[,ack-fd] options
perf record: extend -D,--delay option with -1 value
perf record: implement control commands handling
perf record: introduce --control fd:ctl-fd[,ack-fd] options

tools/lib/api/fd/array.c | 17 ++-
tools/lib/api/fd/array.h | 16 +-
tools/lib/perf/evlist.c | 6 +-
tools/lib/perf/include/internal/evlist.h | 2 +-
tools/perf/Documentation/perf-record.txt | 44 +++++-
tools/perf/Documentation/perf-stat.txt | 44 +++++-
tools/perf/builtin-record.c | 65 +++++++-
tools/perf/builtin-stat.c | 183 ++++++++++++++++++-----
tools/perf/builtin-trace.c | 2 +-
tools/perf/tests/fdarray.c | 22 +--
tools/perf/util/evlist.c | 144 +++++++++++++++++-
tools/perf/util/evlist.h | 26 ++++
tools/perf/util/record.h | 4 +-
tools/perf/util/stat.h | 4 +-
14 files changed, 494 insertions(+), 85 deletions(-)

--
2.24.1