Re: [PATCH 11/11] libperf: Document code simplification case for widening struct perf_cpu
From: Ian Rogers
Date: Wed Jun 10 2026 - 13:25:17 EST
On Wed, Jun 10, 2026 at 9:53 AM Arnaldo Carvalho de Melo
<acme@xxxxxxxxxx> wrote:
>
> From: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
>
> Add a bullet point to the libperf ABI TODO explaining the code
> simplification benefit of widening struct perf_cpu.cpu from int16_t
> to int: the narrow type forces defensive truncation checks at every
> boundary where wider CPU indices are narrowed, and values > 32767
> silently wrap to negative numbers (two's complement), bypassing
> bounds validation without them.
>
> Cc: Ian Rogers <irogers@xxxxxxxxxx>
> Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
> Assisted-by: Claude Opus 4.6 <noreply@xxxxxxxxxxxxx>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Acked-by: Ian Rogers <irogers@xxxxxxxxxx>
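
For reference, the wrap described in the commit message can be sketched
in a few lines. This is only an illustration: struct cpu16 and the three
helpers are hypothetical stand-ins, not libperf code. Converting an
out-of-range int to int16_t is implementation-defined in C; GCC and
Clang wrap modulo 2^16, which is the behaviour assumed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a struct with a 16-bit CPU field. */
struct cpu16 {
	int16_t cpu;
};

/* Narrow without a check: values above INT16_MAX wrap on GCC/Clang. */
static struct cpu16 narrow_unchecked(int cpu)
{
	return (struct cpu16){ .cpu = (int16_t)cpu };
}

/* The defensive check needed at every narrowing boundary. */
static bool narrow_checked(int cpu, struct cpu16 *out)
{
	if (cpu < INT16_MIN || cpu > INT16_MAX)
		return false;
	out->cpu = (int16_t)cpu;
	return true;
}

/* An upper-bound-only test, which a wrapped (negative) value passes. */
static bool below_limit(struct cpu16 c, int nr_cpus)
{
	return c.cpu < nr_cpus;
}
```

With an int field the first helper has nothing to truncate and the
second is no longer needed.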

The motivation for the narrow type was size: with, say, 1000 CPUs, a
struct perf_cpu_map holds one entry per CPU. In the data file we
compress this by recording such cases as a range of CPUs. I wonder if
the struct perf_cpu_map abstraction should do something similar. It
would save memory, potentially be faster (iterating a range rather than
loading incrementally increasing values from an array), and then we
wouldn't care so much whether struct perf_cpu.cpu was an int16_t or an
int32_t.
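
A rough sketch of what a range-based map could look like, to make the
idea concrete. Everything here (struct cpu_range, struct cpu_range_map,
the cpu_range_map__* helpers and the iteration macro) is hypothetical
and not the current libperf API; it assumes the ranges are kept sorted
and non-overlapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: one entry covers the CPUs first..last, inclusive. */
struct cpu_range {
	int first;
	int last;
};

/* Hypothetical map: ranges are assumed sorted and non-overlapping. */
struct cpu_range_map {
	size_t nr;
	const struct cpu_range *ranges;
};

/* Total number of CPUs described by the map. */
static int cpu_range_map__nr_cpus(const struct cpu_range_map *map)
{
	int total = 0;

	for (size_t i = 0; i < map->nr; i++) {
		assert(map->ranges[i].first <= map->ranges[i].last);
		total += map->ranges[i].last - map->ranges[i].first + 1;
	}
	return total;
}

/* Membership test: binary search over the ranges. */
static bool cpu_range_map__has(const struct cpu_range_map *map, int cpu)
{
	size_t lo = 0, hi = map->nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (cpu < map->ranges[mid].first)
			hi = mid;
		else if (cpu > map->ranges[mid].last)
			lo = mid + 1;
		else
			return true;
	}
	return false;
}

/* Visit every CPU by counting within each range, no per-CPU array load. */
#define cpu_range_map__for_each_cpu(cpu, idx, map)			\
	for ((idx) = 0; (idx) < (map)->nr; (idx)++)			\
		for ((cpu) = (map)->ranges[(idx)].first;		\
		     (cpu) <= (map)->ranges[(idx)].last; (cpu)++)
```

In this sketch a machine with CPUs 0-999 plus 1024-1027 online needs two
entries rather than 1004, and the width of the CPU type stops mattering
much for memory use.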

Looking at lscpu, which does a similar job:
https://github.com/util-linux/util-linux/blob/master/sys-utils/lscpu.c
it uses cpu_set_t, but that seems to be a bitmap with fixed support for
around 1024 CPUs. So I prefer perf_cpu_map.

Thanks,
Ian
> ---
> tools/lib/perf/TODO | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/tools/lib/perf/TODO b/tools/lib/perf/TODO
> index 486dd95dc57208a8..e179728697d8c7c0 100644
> --- a/tools/lib/perf/TODO
> +++ b/tools/lib/perf/TODO
> @@ -11,6 +11,14 @@ together.
> (x86_64 max is 8192, arm64 is 4096), but NR_CPUS limits keep
> growing. perf clamps to INT16_MAX in set_max_cpu_num() as a
> safety net.
> + - Code simplification: the int16_t forces defensive truncation
> + checks at every boundary where a wider CPU index (int from
> + sample->cpu, al->cpu, etc.) is narrowed into struct perf_cpu.
> + Without these checks, values > 32767 silently wrap to negative
> + numbers (two's complement), bypassing bounds validation.
> + Widening to int eliminates this entire class of silent
> + truncation bugs and removes the need for the INT16_MAX clamp
> + in set_max_cpu_num().
> - Scope: struct perf_cpu is embedded everywhere — perf_cpu_map__cpu(),
> perf_cpu_map__min(), perf_cpu_map__max(), perf_cpu_map__has(), the
> for_each_cpu macros, and all internal callers. The perf_cpu_map
> --
> 2.54.0
>