Re: perf: is it possible to userspace rdpmc but only on a certain core type

From: Peter Zijlstra
Date: Tue Jan 21 2025 - 07:52:44 EST


On Mon, Jan 20, 2025 at 11:44:37AM -0500, Liang, Kan wrote:
>
>
> On 2025-01-17 5:04 p.m., Vince Weaver wrote:
> > Hello
> >
> > so we've been working on PAPI support for Intel Top-Down events, which
> > let's say does "exciting" things involving the rdpmc instruction.
> >
> > One issue we are having is that on a hybrid machine (Raptor Lake in this
> > case with performance/efficiency cores) there is no top-down support
> > for the E-cores, and it will gpf/segfault if you try to rdpmc the top-down
> > events.
> >
> > Obviously PAPI would like to avoid this, and somehow only run the rdpmc
> > from userspace if scheduled on a P-core.
> >
> > Is there any way to atomically do this? Somehow detect what core we are
> > on and atomically execute a userspace instruction before a core-reschedule
> > can happen?
> >
> > Or barring that, is there any other way to handle this that won't crash
> > and doesn't require users to bind to a core any time they want
> > to run PAPI?
>
> Can PAPI rely on event_idx(), similar to what Andi's pmu-tools
> does? For a stopped event, the index is always 0.

That's not race-free: the task can get migrated to an E-core the moment
after you've done the load and before the rdpmc instruction executes.
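
To make the window concrete, here is a minimal sketch of the
check-then-read pattern. The names read_if_active, read_pmc_fn and
fake_read_pmc are made up for illustration; the callback stands in for
the rdpmc instruction so the sketch runs without touching hardware, and
`index` plays the role of the index field in the perf mmap page (0 means
stopped, otherwise the counter to read is index - 1).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback standing in for the rdpmc instruction. */
typedef uint64_t (*read_pmc_fn)(uint32_t counter);

/* Stand-in reader for trying the sketch out; not a real counter read. */
uint64_t fake_read_pmc(uint32_t counter)
{
    return 1000u + counter;
}

/*
 * Check-then-read: load the index, and only read the counter if the
 * event is currently active (index != 0).
 */
bool read_if_active(const volatile uint32_t *index,
                    read_pmc_fn read_pmc, uint64_t *value)
{
    uint32_t idx = *index;          /* the load */

    if (idx == 0)
        return false;               /* event stopped: skip the read */

    /*
     * Race window: nothing prevents the scheduler from migrating the
     * task right here. If it lands on an E-core, idx is stale and a
     * real rdpmc of a top-down counter would fault.
     */
    *value = read_pmc(idx - 1);     /* the counter read */
    return true;
}
```

The check itself is fine; the problem is only that the load and the
read are two separate steps with nothing tying them to the same CPU.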

I suppose you can wrap the whole thing in an RSEQ (restartable
sequences) critical section though. It's a bit of a pain, but RSEQ can
be configured to abort on migration.

Recent glibc (2.35+) should have rseq registered by default; with older
versions the application will have to register it itself -- there is
example code in tools/testing/selftests/rseq but also
https://git.kernel.org/pub/scm/libs/librseq/librseq.git
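
As a rough first check for which case applies, something like the
sketch below compares the running glibc version against 2.35. The
helper names version_at_least and libc_probably_registers_rseq are made
up for illustration. This is only a heuristic: registration can be
turned off with the glibc.pthread.rseq tunable, so the __rseq_size
symbol that glibc 2.35+ exposes in <sys/rseq.h> is likely the more
reliable thing to test.

```c
#include <stdbool.h>
#include <stdio.h>
#ifdef __GLIBC__
#include <gnu/libc-version.h>   /* gnu_get_libc_version() */
#endif

/* Parse "major.minor" and compare it against a minimum version. */
bool version_at_least(const char *ver, int want_major, int want_minor)
{
    int major = 0, minor = 0;

    if (sscanf(ver, "%d.%d", &major, &minor) != 2)
        return false;           /* unparsable: assume too old */

    return major > want_major ||
           (major == want_major && minor >= want_minor);
}

/*
 * Heuristic only: true if the running libc is glibc 2.35 or newer,
 * which should register rseq at thread start unless disabled.
 */
bool libc_probably_registers_rseq(void)
{
#ifdef __GLIBC__
    return version_at_least(gnu_get_libc_version(), 2, 35);
#else
    return false;               /* non-glibc: assume manual registration */
#endif
}
```

If this returns false, the application would fall back to registering
rseq itself, along the lines of the selftests or librseq.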