[PATCH v1] perf/x86: Fix potential bad container_of in intel_pmu_hw_config

From: Ian Rogers

Date: Thu Mar 12 2026 - 15:44:30 EST


An auto counter reload (ACR) event group may contain software events.
The software event's PMU isn't the x86_hybrid_pmu, so the container_of
operation in intel_pmu_set_acr_caused_constr (reached via the hybrid
helper) could cause out-of-bounds memory reads. Avoid this by guarding
the call to intel_pmu_set_acr_caused_constr with an is_x86_event
check.

Fixes: ec980e4facef ("perf/x86/intel: Support auto counter reload")
Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
---
This fix was prompted by failure to get a BUG_ON in this series:
https://lore.kernel.org/lkml/a61eae6d-7a6d-40bd-83ec-bd4ea7657b9d@xxxxxxxxxxxxxxx/
and so I ran an AI analysis to see if there were similar bad casts as
spotted in:
https://lore.kernel.org/lkml/20260311075201.2951073-2-dapeng1.mi@xxxxxxxxxxxxxxx/
The AI analysis found this issue; its much more verbose description is
below:

I have evaluated all callers of the hybrid_pmu function within the
arch/x86/events directory. The vast majority of usages are safe
because they operate on an event that is currently being initialized
by the x86 PMU subsystem, which guarantees that event->pmu is
inherently an x86 PMU.

However, there is a distinct bug where a non-x86 PMU (e.g., a software
PMU) can be inadvertently passed into the hybrid_pmu function.

This issue occurs during group event validation and configuration in
intel_pmu_hw_config.

The Vulnerability Flow

1. The Context:

Inside intel_pmu_hw_config (located in arch/x86/events/intel/core.c),
there is logic to handle Automatic Counter Reload (ACR) capabilities
for event groups. The code needs to identify siblings that cause other
events to reload and apply constraints to them.

2. The Missing Check:

In the first pass through the sibling events, the code correctly
checks if each sibling is an x86 event via !is_x86_event(sibling)
before building a cause_mask. However, in the second pass to apply
the constraints, the code iterates over all siblings again but omits
the is_x86_event(sibling) check: arch/x86/events/intel/core.c#n4847
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/events/intel/core.c#n4847)

	if (leader->nr_siblings) {
		for_each_sibling_event(sibling, leader)
			intel_pmu_set_acr_caused_constr(sibling, idx++, cause_mask); /* <-- missing is_x86_event() check! */
	}

3. The Invalid Cast:

The intel_pmu_set_acr_caused_constr function takes this sibling event
(which could be a software event) and executes the hybrid macro over
its pmu pointer: arch/x86/events/intel/core.c#n4624
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/events/intel/core.c#n4624)

static inline void intel_pmu_set_acr_caused_constr(struct perf_event *event,
						   int idx, u64 cause_mask)
{
	if (test_bit(idx, (unsigned long *)&cause_mask))
		event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64);
}

4. The Root Cause:

The hybrid macro expands and passes the event->pmu to hybrid_pmu:
arch/x86/events/perf_event.h#n788
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/events/perf_event.h#n788)

#define hybrid(_pmu, _field)					\
	...
	if (is_hybrid() && (_pmu))				\
		__Fp = &hybrid_pmu(_pmu)->_field;		\

Which subsequently results in a blind container_of on a non-x86 PMU
pointer: arch/x86/events/perf_event.h#n780
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/events/perf_event.h#n780)

static __always_inline struct x86_hybrid_pmu *hybrid_pmu(struct pmu *pmu)
{
	return container_of(pmu, struct x86_hybrid_pmu, pmu);
}

Conclusion

If a user creates an event group led by an x86 ACR event but includes
a non-x86 sibling event (like a software event), the second traversal
in intel_pmu_hw_config will blindly pass the software PMU to
hybrid_pmu. Because container_of assumes the PMU is embedded inside an
x86_hybrid_pmu struct, the resulting pointer is invalid, leading to
memory corruption or an out-of-bounds read when the acr_cause_mask64
field is accessed.
---
arch/x86/events/intel/core.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index cf3a4fe06ff2..26e829b8a882 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4844,8 +4844,10 @@ static int intel_pmu_hw_config(struct perf_event *event)
intel_pmu_set_acr_caused_constr(leader, idx++, cause_mask);

if (leader->nr_siblings) {
- for_each_sibling_event(sibling, leader)
- intel_pmu_set_acr_caused_constr(sibling, idx++, cause_mask);
+ for_each_sibling_event(sibling, leader) {
+ if (is_x86_event(sibling))
+ intel_pmu_set_acr_caused_constr(sibling, idx++, cause_mask);
+ }
}

if (leader != event)
--
2.53.0.851.ga537e3e6e9-goog