Re: [Patch v8 10/23] perf/x86: Enable XMM Register Sampling for Non-PEBS Events

From: Peter Zijlstra

Date: Fri May 29 2026 - 07:40:34 EST


On Fri, May 29, 2026 at 03:56:32PM +0800, Dapeng Mi wrote:
> Previously, XMM register sampling was only available for PEBS events
> starting from Icelake. Support is now extended to non-PEBS events by
> utilizing the xsaves instruction, thereby completing the feature set.
>
> To implement this, a 64-byte aligned buffer is required. A per-CPU
> ext_regs_buf is introduced to store SIMD and other registers, with an
> approximate size of 2K. The buffer is allocated using kzalloc_node(),
> which guarantees natural (and therefore 64-byte) alignment, since
> kmalloc() allocations with power-of-2 sizes are naturally aligned.
>
> XMM sampling for non-PEBS events is supported in the REGS_INTR case.
> Support for REGS_USER will be added in a subsequent patch. For PEBS
> events, XMM register sampling data is directly retrieved from PEBS
> records.
>
> Future support for additional vector registers (YMM/ZMM/OPMASK) is
> planned. An `ext_regs_mask` is added to track the supported vector
> register groups.
>
> Co-developed-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> Signed-off-by: Dapeng Mi <dapeng1.mi@xxxxxxxxxxxxxxx>

I suspect Sashiko's last point is valid and using XMM sampling on older
PEBS will not do the right thing.

Creating PEBS events with XMM reg sampling should fail if the hardware
doesn't support it. That said, I could easily have missed a check for
this; this code is a bit of a maze :/
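
To make the expectation concrete, here is a rough user-space sketch of
the kind of check meant above. Everything in it is a hypothetical
stand-in (struct fake_pmu, struct fake_event, FAKE_XMM_REGS_MASK,
check_xmm_sampling()); none of these names come from the patch or from
the kernel, and the real check may well live somewhere else or look
quite different:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: bits 32 and up of the register mask select XMM registers. */
#define FAKE_XMM_REGS_MASK	(~0ULL << 32)

/* Hypothetical stand-in for the capabilities the PMU driver knows about. */
struct fake_pmu {
	bool pebs_has_xmm;	/* PEBS records carry XMM data */
	bool xsaves_ext_regs;	/* xsaves-based sampling for non-PEBS events */
};

/* Hypothetical stand-in for the relevant bits of an event. */
struct fake_event {
	bool precise;		/* event is a PEBS event */
	uint64_t sample_regs;	/* requested register mask */
};

/* Return 0 if the event can be created, -EINVAL otherwise. */
static int check_xmm_sampling(const struct fake_pmu *pmu,
			      const struct fake_event *ev)
{
	/* No XMM registers requested: nothing to validate here. */
	if (!(ev->sample_regs & FAKE_XMM_REGS_MASK))
		return 0;

	/*
	 * PEBS events take XMM data from the PEBS record, so older PEBS
	 * without XMM in the record must be rejected rather than quietly
	 * falling through to some other path.
	 */
	if (ev->precise)
		return pmu->pebs_has_xmm ? 0 : -EINVAL;

	/* Non-PEBS events depend on the xsaves-based path. */
	return pmu->xsaves_ext_regs ? 0 : -EINVAL;
}
```

The point is only that the PEBS case is decided by what the PEBS record
can provide, independent of whether the xsaves path exists.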