Re: perf: is it possible to userspace rdpmc but only on a certain core type

From: Mathieu Desnoyers
Date: Tue Jan 21 2025 - 09:30:53 EST


On 2025-01-21 07:52, Peter Zijlstra wrote:
On Mon, Jan 20, 2025 at 11:44:37AM -0500, Liang, Kan wrote:


On 2025-01-17 5:04 p.m., Vince Weaver wrote:
Hello

so we've been working on PAPI support for Intel Top-Down events, which
let's say does "exciting" things involving the rdpmc instruction.

One issue we are having is that on a hybrid machine (Raptor Lake in this
case with performance/efficiency cores) there is no top-down support
for the E-cores, and it will gpf/segfault if you try to rdpmc the top-down
events.

Obviously PAPI would like to avoid this, and somehow only run the rdpmc
from userspace if scheduled on a P-core.

Is there any way to atomically do this? Somehow detect what core we are
on and atomically execute a userspace instruction before a core-reschedule
can happen?

Or barring that, is there any other way to handle this that won't crash
and doesn't require users to bind to a core every time they want to run
PAPI?

Can PAPI rely on event_idx(), similar to what Andi's pmu-tools does?
For a stopped event, the index is always 0.

That's not race-free: the task can get migrated to an E-core the moment
after you've done the load and before the rdpmc instruction.
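
To make the race concrete, here is a minimal sketch of the index check
being discussed. It assumes the event's perf_event_mmap_page has been
mapped as described in the perf_event_open documentation; the helper
names read_event_index() and event_counter_index() are made up for
illustration and are not part of PAPI or pmu-tools.

```c
#include <linux/perf_event.h>
#include <stdbool.h>
#include <stdint.h>

/* Compiler barrier, so the loads below are not reordered by the compiler. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Read pc->index under the mmap page's sequence lock. */
static uint32_t read_event_index(const volatile struct perf_event_mmap_page *pc)
{
	uint32_t seq, idx;

	do {
		seq = pc->lock;
		barrier();
		idx = pc->index;
		barrier();
	} while (pc->lock != seq);
	return idx;
}

/*
 * Hypothetical helper: returns true and stores the value to hand to
 * rdpmc (index - 1) if the event currently has a readable counter.
 * An index of 0 means the event cannot be read with rdpmc right now.
 */
static bool event_counter_index(const volatile struct perf_event_mmap_page *pc,
				uint32_t *counter)
{
	uint32_t idx = read_event_index(pc);

	if (idx == 0)
		return false;
	*counter = idx - 1;
	/*
	 * RACE: the task can still be migrated between this return and
	 * the caller's rdpmc, so the check alone does not prevent a fault.
	 */
	return true;
}
```

The check only narrows the window; it does not close it.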

I suppose you can wrap the whole thing in RSEQ though, it's a bit of a
pain, but RSEQ can be configured to abort on migration.

Recent glibc (2.35+) should have rseq registered by default; with older
versions the application has to register it itself. There is example
code in tools/testing/selftests/rseq, but also in
https://git.kernel.org/pub/scm/libs/librseq/librseq.git

Indeed, you could start from a copy of this function:

https://git.kernel.org/pub/scm/libs/librseq/librseq.git/tree/include/rseq/arch/x86/bits.h#n161

and tweak it to issue "rdpmc" rather than "addq", thus creating a helper
such as:

int rseq_try_rdpmc(params..., int cpu);
(e.g. return 0 on success, -1 on abort)

and use it as such from C (untested code snippet):

static inline bool rseq_rdpmc(params...)
{
	bool rdpmc_issued = false;

	for (;;) {
		int cpu = rseq_current_cpu();

		if (!cpu_is_p_core(cpu))
			break;
		if (!rseq_try_rdpmc(params..., cpu)) {
			rdpmc_issued = true;
			break;
		}
	}
	return rdpmc_issued;
}

The rseq critical section in rseq_try_rdpmc will either abort if the
thread is migrated elsewhere, or issue the rdpmc instruction if it is
still on the right cpu when the instruction is executed.
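
The snippet above leaves cpu_is_p_core() undefined. One possible way to
fill it in, as an untested sketch: build a lookup table once at start-up
from a kernel cpulist string such as "0-15". I am assuming here that the
P-core list can be read from
/sys/bus/event_source/devices/cpu_core/cpus on a hybrid machine; the
MAX_CPUS limit and the parse_cpulist() and load_p_cores() names are made
up for illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_CPUS 1024	/* arbitrary limit chosen for this sketch */

static bool p_core[MAX_CPUS];

/*
 * Parse a cpulist string such as "0-7,16" into a per-cpu flag array.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int parse_cpulist(const char *s, bool *is_set, int max_cpus)
{
	for (int i = 0; i < max_cpus; i++)
		is_set[i] = false;

	while (*s && *s != '\n') {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;

		if (end == s)
			return -1;
		if (*end == '-') {	/* range "lo-hi" */
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s)
				return -1;
		}
		if (lo < 0 || hi < lo || hi >= max_cpus)
			return -1;
		for (long c = lo; c <= hi; c++)
			is_set[c] = true;
		s = end;
		if (*s == ',')
			s++;
		else if (*s && *s != '\n')
			return -1;
	}
	return 0;
}

/* Load the P-core table from a cpulist file (path is an assumption). */
static int load_p_cores(const char *path)
{
	char buf[4096];
	FILE *f = fopen(path, "r");
	char *line;

	if (!f)
		return -1;
	line = fgets(buf, sizeof(buf), f);
	fclose(f);
	return line ? parse_cpulist(buf, p_core, MAX_CPUS) : -1;
}

static bool cpu_is_p_core(int cpu)
{
	return cpu >= 0 && cpu < MAX_CPUS && p_core[cpu];
}
```

The table is filled once, so the lookup in the retry loop is a plain
array read and does not add a system call between reading the cpu
number and entering the rseq critical section.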

Thanks,

Mathieu




--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com