Re: [PATCH v1 1/2] perf/x86: Avoid inadvertent casts to x86_hybrid_pmu

From: Ian Rogers

Date: Thu Mar 12 2026 - 11:21:31 EST


On Thu, Mar 12, 2026 at 2:44 AM Mi, Dapeng <dapeng1.mi@xxxxxxxxxxxxxxx> wrote:
>
>
> On 3/12/2026 4:31 PM, Peter Zijlstra wrote:
> > On Wed, Mar 11, 2026 at 10:48:09PM -0700, Ian Rogers wrote:
> >> The patch:
> >> https://lore.kernel.org/lkml/20260311075201.2951073-2-dapeng1.mi@xxxxxxxxxxxxxxx/
> >> showed it was pretty easy to accidentally cast non-x86 PMUs to
> >> x86_hybrid_pmus. Add a BUG_ON for that case. Restructure is_x86_event
> >> and add an is_x86_pmu to facilitate this.
> >>
> >> @@ -779,6 +795,7 @@ struct x86_hybrid_pmu {
> >>
> >> static __always_inline struct x86_hybrid_pmu *hybrid_pmu(struct pmu *pmu)
> >> {
> >> + BUG_ON(!is_x86_pmu(pmu));
> >> return container_of(pmu, struct x86_hybrid_pmu, pmu);
> >> }
> > Given that hybrid_pmu will have PERF_PMU_CAP_EXTENDED_HW_TYPE, and we
> > should really only use hyrid_pmu() on one of those, would not the
> > simpler patch be so?
> >
> >
> > diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
> > index fad87d3c8b2c..13ec623617a9 100644
> > --- a/arch/x86/events/perf_event.h
> > +++ b/arch/x86/events/perf_event.h
> > @@ -779,6 +779,7 @@ struct x86_hybrid_pmu {
> >
> > static __always_inline struct x86_hybrid_pmu *hybrid_pmu(struct pmu *pmu)
> > {
> > + BUG_ON(!(pmu->capabilities & PERF_PMU_CAP_EXTENDED_HW_TYPE));
>
> It looks we can't add either !is_x86_pmu(pmu) or !(pmu->capabilities &
> PERF_PMU_CAP_EXTENDED_HW_TYPE) here. hybrid_pmu() is called by the hybrid()
> marco or other variants, and hybrid() macro is called in many places of the
> intel_pmu_init(), like the update_pmu_cap() , but the flag
> PERF_PMU_CAP_EXTENDED_HW_TYPE is still not set for the hybrid
> pmu->capabilities until intel_pmu_init() ends and the hybrid pmus are
> registered. Then it would cause the unexpected kernel crash.
>
> [ 1.945128] kernel BUG at arch/x86/events/intel/../perf_event.h:798!
> [ 1.946131] Oops: invalid opcode: 0000 [#1] SMP NOPTI
> [ 1.947127] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted
> 7.0.0-rc3-perf-urgent-gc8b4b538960c #460 PREEMPT(full)
> [ 1.947127] Hardware name: Intel Corporation Panther Lake Client
> Platform/PTL-UH LP5 T3 RVP1, BIOS PTLPFWI1.R00.3171.D00.2504220409 04/22/2025
> [ 1.947127] RIP: 0010:intel_pmu_init+0x25c9/0x5fd0
> [ 1.947127] Code: db 44 ff 4c 89 35 c7 da 44 ff 48 89 2d 80 da 44 ff e9
> 49 df ff ff 83 7a 68 04 0f 84 1b f9 ff ff f6 42 6d 01 0f 85 11 f9 ff ff
> <0f> 0b 31 d2 48 89 df
> [ 1.947127] RSP: 0000:ffffd5dc800f7db8 EFLAGS: 00010246
> [ 1.947127] RAX: 0000000000000001 RBX: 00000000000abfff RCX:
> 0000000000000000
> [ 1.947127] RDX: ffff8f40856bc000 RSI: 0000000000000001 RDI:
> 00000000000000ff
> [ 1.947127] RBP: 0000000000000001 R08: ffffffffffffffff R09:
> 0000000000000004
> [ 1.947127] R10: ffffffffbd4e2500 R11: 0000000000000006 R12:
> ffffffffbc26438b
> [ 1.947127] R13: 0000000000000000 R14: 0000000000000000 R15:
> 0000000000000000
> [ 1.947127] FS: 0000000000000000(0000) GS:ffff8f482214f000(0000)
> knlGS:0000000000000000
> [ 1.947127] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1.947127] CR2: ffff8f47ff7ff000 CR3: 00000004c1434001 CR4:
> 0000000000f70ef0
> [ 1.947127] PKRU: 55555554
> [ 1.947127] Call Trace:
> [ 1.947127] <TASK>
> [ 1.947127] ? __pfx_init_hw_perf_events+0x10/0x10
> [ 1.947127] init_hw_perf_events+0x2af/0x4b0
> [ 1.947127] ? __pfx_init_hw_perf_events+0x10/0x10
> [ 1.947127] do_one_initcall+0x52/0x250
> [ 1.947127] ? _raw_spin_unlock+0x18/0x40
> [ 1.947127] ? __register_sysctl_table+0x143/0x1a0
> [ 1.947127] kernel_init_freeable+0x21d/0x340
> [ 1.947127] ? __pfx_kernel_init+0x10/0x10
> [ 1.947127] kernel_init+0x1a/0x1c0
> [ 1.947127] ret_from_fork+0xcb/0x1c0
> [ 1.947127] ? __pfx_kernel_init+0x10/0x10
> [ 1.947127] ret_from_fork_asm+0x1a/0x30
> [ 1.947127] </TASK>
> [ 1.947127] Modules linked in:
> [ 1.947127] ---[ end trace 0000000000000000 ]---
> [ 1.948128] RIP: 0010:intel_pmu_init+0x25c9/0x5fd0
> [ 1.949128] Code: db 44 ff 4c 89 35 c7 da 44 ff 48 89 2d 80 da 44 ff e9
> 49 df ff ff 83 7a 68 04 0f 84 1b f9 ff ff f6 42 6d 01 0f 85 11 f9 ff ff
> <0f> 0b 31 d2 48 89 df
> [ 1.950129] RSP: 0000:ffffd5dc800f7db8 EFLAGS: 00010246
> [ 1.951128] RAX: 0000000000000001 RBX: 00000000000abfff RCX:
> 0000000000000000
> [ 1.952128] RDX: ffff8f40856bc000 RSI: 0000000000000001 RDI:
> 00000000000000ff
> [ 1.953128] RBP: 0000000000000001 R08: ffffffffffffffff R09:
> 0000000000000004
> [ 1.954129] R10: ffffffffbd4e2500 R11: 0000000000000006 R12:
> ffffffffbc26438b
> [ 1.955128] R13: 0000000000000000 R14: 0000000000000000 R15:
> 0000000000000000
> [ 1.956128] FS: 0000000000000000(0000) GS:ffff8f482214f000(0000)
> knlGS:0000000000000000
> [ 1.957128] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1.958128] CR2: ffff8f47ff7ff000 CR3: 00000004c1434001 CR4:
> 0000000000f70ef0
> [ 1.959128] PKRU: 55555554
> [ 1.960128] Kernel panic - not syncing: Attempted to kill init!
> exitcode=0x0000000b
>
> I'm not sure if we can move the flag PERF_PMU_CAP_EXTENDED_HW_TYPE setting
> earlier and eventually find a good place to set the flag. Even it's
> possible, but could be risky ...
>
> Ian, if you don't object, I would suggest to drop the bug_on(). I would
> adopt other changes and add the is_x86_pmu() check in the
> x86_pmu_has_rdpmc_user_disable() to fix the issue.

No objections from me, these patches aim to improve the code's typing
and were intended more as a suggestion easily expressed in code than
by email :-)

I feel that the x86_pmu and x86_hybrid_pmu are something of a mess,
making features like counter partitioning harder than necessary if the
code were structured better - ie 1 partition per PMU, which means
multiple PMUs even without hybrid. It'd be nice if we could implement
something like the BUG_ON to at least ensure correctness in the
current code. I'll try to find other broken casts/container_ofs in the
code by visual inspection.

Thanks,
Ian

> Thanks.
>
> > return container_of(pmu, struct x86_hybrid_pmu, pmu);
> > }
> >