[PATCH v2 0/6] Corrections to cpu map event encoding

From: Ian Rogers
Date: Tue Jun 14 2022 - 10:35:35 EST


A mask encoding of a cpu map is laid out as:
u16 nr
u16 long_size
unsigned long mask[];
However, the mask may be 8-byte aligned, meaning there is a 4-byte pad
after long_size. As a result, 32-bit and 64-bit builds see the mask at
different offsets. On top of this, the structure is carried in the byte
data[] of an encoding laid out as:
u16 type
char data[]
This means the mask's struct does not have the required 4- or 8-byte
alignment, but is offset by 2. Consequently the long reads and writes
cause undefined behavior, as the alignment is broken.
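
The layout problem can be reproduced with a small sketch. The struct
names below are hypothetical stand-ins, not the perf definitions, and
"unsigned long" is modelled with fixed-width types so both views can be
compared in one program. The 8-byte result assumes an ABI where
uint64_t is 8-byte aligned (x86-64, for example).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the mask layout as each build sees it.
 * "unsigned long" is modelled with fixed-width types so that both
 * views can be compared in one program.
 */
struct mask_view32 {
	uint16_t nr;
	uint16_t long_size;
	uint32_t mask[];	/* unsigned long on a 32-bit build */
};

struct mask_view64 {
	uint16_t nr;
	uint16_t long_size;
	uint64_t mask[];	/* unsigned long on a 64-bit build */
};

/* With 4-byte longs the mask directly follows long_size. */
static_assert(offsetof(struct mask_view32, mask) == 4, "no pad expected");

/* Offset of the mask as a 32-bit build would compute it. */
static size_t mask_offset32(void)
{
	return offsetof(struct mask_view32, mask);
}

/* Offset of the mask where uint64_t is 8-byte aligned (assumption). */
static size_t mask_offset64(void)
{
	return offsetof(struct mask_view64, mask);
}

/*
 * The struct lives in data[], which follows a u16 type, so it starts
 * 2 bytes past the start of the enclosing encoding.
 */
static size_t struct_offset_in_data(void)
{
	return sizeof(uint16_t);
}
```

On such an ABI the two views disagree (4 versus 8), and in both cases
the struct itself begins at an offset of 2, which is neither 4- nor
8-byte aligned.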

These changes first do minor cleanup: adding const, reducing the
visibility of functions, and using the constant-time max function. They
then add 32- and 64-bit mask encoding variants, packed to match the
current alignment. Taking the address of data inside a packed struct
yields an unaligned pointer, so function arguments are altered to be
passed the packed struct instead. To compact the mask encoding further
and drop the padding, the 4-byte variant is preferred. Finally, a new
range encoding is added that reduces the size of the common case of a
range of CPUs to a single u64.
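
As a rough check of the space savings, the event size for each encoding
can be estimated as below. This is only a sketch: the function names are
made up for illustration, and the 8-byte event header and the rounding
of the event size up to 8 bytes are assumptions that happen to agree
with the raw dumps in this cover letter.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed event header: u32 type, u16 misc, u16 size. */
#define EVENT_HEADER_SIZE 8u

/* Round up to a multiple of 8 bytes (assumed event size granularity). */
static size_t align8(size_t n)
{
	return (n + 7) & ~(size_t)7;
}

/* Original layout: u16 type, u16 nr, u16 long_size, 4-byte pad, u64 mask[]. */
static size_t event_size_mask64(unsigned int nr_cpus)
{
	size_t words = ((size_t)nr_cpus + 63) / 64;

	return align8(EVENT_HEADER_SIZE + 2 + 2 + 2 + 4 + words * 8);
}

/* 4-byte variant: u16 type, u16 nr, u16 long_size, u32 mask[] (no pad). */
static size_t event_size_mask32(unsigned int nr_cpus)
{
	size_t words = ((size_t)nr_cpus + 31) / 32;

	return align8(EVENT_HEADER_SIZE + 2 + 2 + 2 + words * 4);
}

/* Range variant: u16 type plus 6 bytes for the range, one u64 in total. */
static size_t event_size_range(void)
{
	return align8(EVENT_HEADER_SIZE + 8);
}
```

For 72 CPUs this gives 40, 32 and 16 bytes respectively, matching the
raw event sizes quoted in this message.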

On a 72 CPU (hyperthread) machine the original encoding of all CPUs is:
0x9a98 [0x28]: event: 74
.
. ... raw event: size 40 bytes
. 0000: 4a 00 00 00 00 00 28 00 01 00 02 00 08 00 00 00 J.....(.........
. 0010: 00 00 ff ff ff ff ff ff ff ff ff 00 00 00 00 00 ................
. 0020: 00 00 00 00 00 00 00 00 ........

0 0 0x9a98 [0x28]: PERF_RECORD_CPU_MAP

Using the 4-byte encoding it is:
0x9a98@pipe [0x20]: event: 74
.
. ... raw event: size 32 bytes
. 0000: 4a 00 00 00 00 00 20 00 01 00 03 00 04 00 ff ff J..... .........
. 0010: ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 00 ................

0 0 0x9a98 [0x20]: PERF_RECORD_CPU_MAP
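
The 4-byte dump above can be decoded by hand with a sketch like the
following. The helper names are hypothetical, the field order (u16 type,
u16 nr, u16 long_size, u32 mask[nr]) and the type value 1 are read off
the dump rather than taken from the patches, and the bytes are assumed
to be little-endian. Byte-wise loads are used so that nothing depends on
the alignment of the payload.

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian loads that are safe at any alignment. */
static uint16_t load_le16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Count the CPUs set in a 4-byte mask payload. `payload` points just
 * past the 8-byte event header. Assumed layout, read off the dump:
 * u16 type (1 = mask), u16 nr, u16 long_size, u32 mask[nr].
 * Returns -1 if the payload does not look like a 4-byte mask.
 */
static int count_cpus_mask32(const unsigned char *payload, size_t len)
{
	uint16_t nr;
	int count = 0;

	if (len < 6 || load_le16(payload) != 1 || load_le16(payload + 4) != 4)
		return -1;
	nr = load_le16(payload + 2);
	if (len < 6 + (size_t)nr * 4)
		return -1;

	for (uint16_t i = 0; i < nr; i++) {
		uint32_t word = load_le32(payload + 6 + (size_t)i * 4);

		for (; word; word &= word - 1)	/* clear lowest set bit */
			count++;
	}
	return count;
}
```

Fed the 24 payload bytes of the dump (everything after the first 8
bytes), this counts 72 CPUs: two full 32-bit words plus 8 bits of the
third.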

Finally, with the range encoding it is:
0x9ab8@pipe [0x10]: event: 74
.
. ... raw event: size 16 bytes
. 0000: 4a 00 00 00 00 00 10 00 02 00 00 00 00 00 47 00 J.............G.

0 0 0x9ab8 [0x10]: PERF_RECORD_CPU_MAP
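
The range dump can be read in the same way. The sketch below uses
hypothetical names and an assumed field layout after the 8-byte event
header: u16 type (2 = range), one flag byte, one pad byte, u16 start
and u16 end. Only the position of the end CPU (0x47) is visible in the
dump, since the other bytes are zero, so the rest of the layout is a
guess for illustration. Little-endian byte order is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Inclusive CPU range, hypothetical type used only by this sketch. */
struct cpu_range {
	uint16_t start;
	uint16_t end;
};

/* Little-endian load that is safe at any alignment. */
static uint16_t load_le16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Decode a range payload. `payload` points just past the 8-byte event
 * header. Assumed layout: u16 type (2 = range), u8 flag, u8 pad,
 * u16 start, u16 end. Returns 0 on success, -1 on malformed input.
 */
static int decode_cpu_range(const unsigned char *payload, size_t len,
			    struct cpu_range *out)
{
	if (len < 8 || load_le16(payload) != 2)
		return -1;
	out->start = load_le16(payload + 4);
	out->end = load_le16(payload + 6);
	return out->start <= out->end ? 0 : -1;
}

/* Number of CPUs covered by an inclusive range. */
static int cpu_range_count(const struct cpu_range *r)
{
	return r->end - r->start + 1;
}
```

Applied to the 8 payload bytes of the dump, this yields CPUs 0 to 71,
that is 72 CPUs, in a single u64 of payload.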

v2: Fixes a bug in the size computation of the update header that was
introduced by the last patch (Add range data encoding) and caught by
address sanitizer.

Ian Rogers (6):
perf cpumap: Const map for max
perf cpumap: Synthetic events and const/static
perf cpumap: Compute mask size in constant time
perf cpumap: Fix alignment for masks in event encoding
perf events: Prefer union over variable length array
perf cpumap: Add range data encoding

tools/lib/perf/cpumap.c | 2 +-
tools/lib/perf/include/perf/cpumap.h | 2 +-
tools/lib/perf/include/perf/event.h | 61 ++++++++-
tools/perf/tests/cpumap.c | 71 ++++++++---
tools/perf/tests/event_update.c | 14 +--
tools/perf/util/cpumap.c | 111 +++++++++++++---
tools/perf/util/cpumap.h | 4 +-
tools/perf/util/event.h | 4 -
tools/perf/util/header.c | 24 ++--
tools/perf/util/session.c | 35 +++---
tools/perf/util/synthetic-events.c | 182 +++++++++++++--------------
tools/perf/util/synthetic-events.h | 2 +-
12 files changed, 327 insertions(+), 185 deletions(-)

--
2.36.1.476.g0c4daa206d-goog