Re: [Patch v8 10/23] perf/x86: Enable XMM Register Sampling for Non-PEBS Events

From: Mi, Dapeng

Date: Sun May 31 2026 - 23:04:30 EST



On 5/29/2026 7:38 PM, Peter Zijlstra wrote:
> On Fri, May 29, 2026 at 03:56:32PM +0800, Dapeng Mi wrote:
>> Previously, XMM register sampling was only available for PEBS events
>> starting from Icelake. Currently the support is now extended to non-PEBS
>> events by utilizing the xsaves instruction, thereby completing the
>> feature set.
>>
>> To implement this, a 64-byte aligned buffer is required. A per-CPU
>> ext_regs_buf is introduced to store SIMD and other registers, with an
>> approximate size of 2K. The buffer is allocated using kzalloc_node(),
>> ensuring natural and 64-byte alignment for all kmalloc() allocations
>> with powers of 2.
>>
>> XMM sampling for non-PEBS events is supported in the REGS_INTR case.
>> Support for REGS_USER will be added in a subsequent patch. For PEBS
>> events, XMM register sampling data is directly retrieved from PEBS
>> records.
>>
>> Future support for additional vector registers (YMM/ZMM/OPMASK) is
>> planned. An `ext_regs_mask` is added to track the supported vector
>> register groups.
>>
>> Co-developed-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@xxxxxxxxxxxxxxx>
> I suspect Sashiko's last point is valid and using XMM sampling on older
> PEBS will not do the right thing.
>
> Creating PEBS events with XMM reg sampling should fail if the hardware
> doesn't support it. That said, I could easily have missed a check for
> this, this code is a bit of a maze :/

Hmm, yeah. Currently the SIMD/eGPRs/SSP sampling is designed to support
both non-PEBS and PEBS events, and the two share the "intr-regs" and
"user-regs" options in perf tools, so the capability
PERF_PMU_CAP_EXTENDED_REGS is set as long as either non-PEBS or PEBS
supports it.

This indeed causes some inconsistency and mess. I will strengthen the check
and only set the flag when both PEBS and non-PEBS support it.
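
Roughly, the stricter rule would look like the sketch below. This is only a
userspace illustration of the intended logic, not the actual patch: the
struct, the field names, the helper and the SKETCH_CAP_EXTENDED_REGS value
are all hypothetical stand-ins for the real x86 PMU state and
PERF_PMU_CAP_EXTENDED_REGS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PERF_PMU_CAP_EXTENDED_REGS; value is illustrative. */
#define SKETCH_CAP_EXTENDED_REGS 0x1u

/* Hypothetical, simplified view of what the PMU can do. */
struct sketch_pmu {
	bool pebs_ext_regs;      /* PEBS records can carry the extended regs */
	bool non_pebs_ext_regs;  /* xsaves-based path for non-PEBS events works */
	unsigned int capabilities;
};

/*
 * Advertise the capability only when both paths support extended register
 * sampling. The behaviour described above as current would correspond to
 * "||" instead of "&&" in the condition.
 */
static void sketch_update_ext_regs_cap(struct sketch_pmu *pmu)
{
	assert(pmu != NULL);

	if (pmu->pebs_ext_regs && pmu->non_pebs_ext_regs)
		pmu->capabilities |= SKETCH_CAP_EXTENDED_REGS;
	else
		pmu->capabilities &= ~SKETCH_CAP_EXTENDED_REGS;
}
```

With this rule a platform whose PEBS cannot provide the registers never
advertises the capability, so event creation fails early instead of
producing wrong sample data.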

Thanks.

