Re: [PATCH v1 1/2] powercap: intel_rapl: Prepare read_raw interface for atomic-context callers

From: Raag Jadav

Date: Sat Feb 14 2026 - 11:12:42 EST


On Sat, Feb 14, 2026 at 07:31:04AM -0800, Sathyanarayanan Kuppuswamy wrote:
> On 2/13/26 10:37 PM, Raag Jadav wrote:
> > On Thu, Nov 20, 2025 at 04:05:38PM -0800, Kuppuswamy Sathyanarayanan wrote:
> > > The current read_raw() implementation of the TPMI, MMIO and MSR
> > > interfaces does not distinguish between atomic and non-atomic callers.
> > >
> > > rapl_msr_read_raw() uses rdmsrq_safe_on_cpu(), which can sleep and
> > > issue cross-CPU calls. When MSR-based RAPL PMU support is enabled, PMU
> > > event handlers can invoke this function from atomic context where
> > > sleeping or rescheduling is not allowed. In atomic context, the caller
> > > is already executing on the target CPU, so a direct rdmsrq() is
> > > sufficient.
> > >
> > > To support such usage, introduce an atomic flag to the read_raw()
> > > interface to allow callers to pass the context information. Modify the
> > > common RAPL code to propagate this flag, and set the flag to reflect
> > > the calling contexts.
> > >
> > > Utilize the atomic flag in rapl_msr_read_raw() to perform a direct MSR
> > > read with rdmsrq() when running in atomic context, and add a sanity check
> > > to ensure the target CPU matches the current CPU for such use cases.
> > >
> > > The TPMI and MMIO implementations do not require special atomic
> > > handling, so the flag is ignored in those paths.
> > >
> > > This is a preparatory patch for adding MSR-based RAPL PMU support.
> > ...
> >
> > > -static int rapl_msr_read_raw(int cpu, struct reg_action *ra)
> > > +static int rapl_msr_read_raw(int cpu, struct reg_action *ra, bool atomic)
> > >  {
> > > +	/*
> > > +	 * When called from atomic-context (eg PMU event handler)
> > > +	 * perform MSR read directly using rdmsrq().
> > > +	 */
> > > +	if (atomic) {
> > > +		if (unlikely(smp_processor_id() != cpu))
> > > +			return -EIO;
> > This series breaks[1] our application[2] in cases where the reads are
> > issued from whichever CPU it happens to be scheduled on. This issue is not
> > seen on older platforms which use the original arch/x86 RAPL implementation.
> >
> > Can someone please shed some light on the change of userspace expectations?
> > Or did I miss any points in the documentation?
> >
> > [1] https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6935
> > [2] https://gitlab.freedesktop.org/drm/igt-gpu-tools/-/blob/master/lib/igt_power.c
>
> The access with non-lead CPUs is fixed by the following series:
>
> https://lore.kernel.org/linux-pm/CAJZ5v0gh_3y4+2qepC5Mqos+y+kBfGgeEKdmL5s6J4MBGcrQzw@xxxxxxxxxxxxxx/T/#mabe68b0d5c3e5571c9333ff915d38562ec7fed71
>
> Can you please re-test with the above series?

Working now, thanks for the fix :)

Raag

> > > +		rdmsrq(ra->reg.msr, ra->value);
> > > +		goto out;
> > > +	}
> > > +
> > >  	if (rdmsrq_safe_on_cpu(cpu, ra->reg.msr, &ra->value)) {
> > >  		pr_debug("failed to read msr 0x%x on cpu %d\n", ra->reg.msr, cpu);
> > >  		return -EIO;
> > >  	}
> > > +
> > > +out:
> > >  	ra->value &= ra->mask;
> > >  	return 0;
> > >  }
>
> --
> Sathyanarayanan Kuppuswamy
> Linux Kernel Developer
>