[PATCH v12 0/2] tools/perf: Support PERF_SAMPLE_READ with inherit

From: Ben Gainey
Date: Tue Oct 01 2024 - 08:15:52 EST


This revision splits out the tools/perf changes requested by Namhyung Kim
from my previous work to enable PERF_SAMPLE_READ with inherit (see
https://lore.kernel.org/linux-perf-users/20240730084417.7693-1-ben.gainey@xxxxxxx/ ),
as the kernel-side changes have been picked up by Peter Zijlstra.

Changes since v11:
- Rebased onto perf-tools-next (b38c49d8296b9eee)

Changes since v10:
- Fixed a formatting nit

Changes since v9:
- Split out tools/perf patches only
- Fixed system-wide mode in `perf record` to not set the inherit bit.

Changes since v8:
- Rebase on v6.11-rc1

Changes since v7:
- Rebase on v6.10-rc3
- Respond to Peter Zijlstra's feedback:
  - Renamed nr_pending to nr_no_switch_fast and merged in nr_inherit_read
    which otherwise had overlapping use
  - Updated some of the commit messages to provide better justifications
    of use case, behavioural changes and so on
  - Clean up perf_event_count/_cumulative
  - Make it explicit that the sampling event decides whether or not the
    per-thread value is given in read_format for PERF_RECORD_SAMPLE and
    PERF_RECORD_READ; updated tools to account for this.
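
As a rough illustration of the tools-side accounting this implies, the
sketch below keeps the previously seen counter value per (tid, id) pair
and reports the delta as the sample period. It is a minimal sketch only:
period_tracker, prev_value, sample_period and MAX_SLOTS are hypothetical
names invented for the example, not the structures or helpers used in
tools/perf, and the fixed-size table stands in for whatever lookup the
real code uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; not the tools/perf data structures. */
#define MAX_SLOTS 64

struct prev_value {
	int      used;
	uint32_t tid;   /* thread the value belongs to (0 when not per-thread) */
	uint64_t id;    /* event id taken from read_format */
	uint64_t value; /* last counter value seen for this (tid, id) */
};

struct period_tracker {
	struct prev_value slots[MAX_SLOTS];
};

/*
 * Return the period for a counter value read from a sample: the delta
 * against the previous value recorded for the same key. When per_thread
 * is non-zero the key is (tid, id); otherwise tid is ignored and all
 * threads share one running value per id.
 */
static uint64_t sample_period(struct period_tracker *t, int per_thread,
			      uint32_t tid, uint64_t id, uint64_t value)
{
	uint32_t key_tid = per_thread ? tid : 0;
	struct prev_value *free_slot = NULL;

	for (size_t i = 0; i < MAX_SLOTS; i++) {
		struct prev_value *p = &t->slots[i];

		if (!p->used) {
			if (!free_slot)
				free_slot = p;
			continue;
		}
		if (p->tid == key_tid && p->id == id) {
			uint64_t delta = value - p->value;

			p->value = value;
			return delta;
		}
	}

	/* First observation for this key: remember it if there is room. */
	if (free_slot) {
		free_slot->used = 1;
		free_slot->tid = key_tid;
		free_slot->id = id;
		free_slot->value = value;
	}

	/* With no previous value, the whole value counts as the period. */
	return value;
}
```

With per_thread set, interleaved samples from two threads each get a
delta against their own previous value, rather than against whichever
thread happened to be sampled last.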

Changes since v6:
- Rebase on v6.10-rc2
- Make additional "perf test" tests succeed / skip based on kernel
version as per feedback from Namhyung.

Changes since v5:
- Rebase on v6.9
- Cleanup feedback from Namhyung Kim

Changes since v4:
- Rebase on v6.9-rc1
- Removed the dependency on inherit_stat that was previously assumed
necessary as per feedback from Namhyung Kim.
- Fixed an incorrect use of zfree instead of free in the tools leading
to an abort on tool shutdown.
- Additional test coverage improvements added to perf test.
- Cleaned up the remaining bit of irrelevant change missed between v3
and v4.

Changes since v3:
- Cleaned up perf test data changes incorrectly included into this
series from elsewhere.

Changes since v2:
- Rebase on v6.8
- Respond to James Clarke's feedback; fixup some typos and move some
repeated checks into a helper macro.
- Cleaned up checkpatch lints.
- Updated perf test; fixed evsel handling so that existing tests pass
and added new tests to cover the new behaviour.

Changes since v1:
- Rebase on v6.8-rc1
- Fixed value written into sample after child exits.
- Modified handling of switch-out so that contexts with these events
take the slow path, so that the per-event/per-thread PMU state is
correctly switched.
- Modified perf tools to support this mode of operation.

Ben Gainey (2):
tools/perf: Correctly calculate sample period for inherited
SAMPLE_READ values
tools/perf: Allow inherit + PERF_SAMPLE_READ when opening events

tools/lib/perf/evsel.c | 48 ++++++++++++++
tools/lib/perf/include/internal/evsel.h | 63 ++++++++++++++++++-
tools/perf/tests/attr/README | 2 +
tools/perf/tests/attr/test-record-C0 | 2 +
tools/perf/tests/attr/test-record-dummy-C0 | 2 +-
.../tests/attr/test-record-group-sampling | 3 +-
.../tests/attr/test-record-group-sampling1 | 51 +++++++++++++++
.../tests/attr/test-record-group-sampling2 | 61 ++++++++++++++++++
tools/perf/tests/attr/test-record-group2 | 1 +
...{test-record-group2 => test-record-group3} | 10 +--
tools/perf/util/evsel.c | 21 ++++++-
tools/perf/util/evsel.h | 1 +
tools/perf/util/session.c | 28 ++++++---
13 files changed, 273 insertions(+), 20 deletions(-)
create mode 100644 tools/perf/tests/attr/test-record-group-sampling1
create mode 100644 tools/perf/tests/attr/test-record-group-sampling2
copy tools/perf/tests/attr/{test-record-group2 => test-record-group3} (81%)

--
2.46.1