On Mon, Jan 20, 2025 at 11:44:37AM -0500, Liang, Kan wrote:
On 2025-01-17 5:04 p.m., Vince Weaver wrote:
Hello
so we've been working on PAPI support for Intel Top-Down events, which
let's say does "exciting" things involving the rdpmc instruction.
One issue we are having is that on a hybrid machine (Raptor Lake in this
case with performance/efficiency cores) there is no top-down support
for the E-cores, and it will GPF/segfault if you try to rdpmc the top-down
events.
Obviously PAPI would like to avoid this, and somehow only run the rdpmc
from userspace if scheduled on a P-core.
Is there any way to atomically do this? Somehow detect what core we are
on and atomically execute a userspace instruction before a core-reschedule
can happen?
Or barring that, is there any other way to handle this that won't crash,
without requiring users to bind to a core any time they want
to run PAPI?
Can PAPI rely on event_idx(), similar to what Andi's pmu-tools
does? For a stopped event, the index is always 0.
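
For illustration, a minimal sketch of what such an index check might look like from userspace, assuming the event's perf_event_mmap_page has already been mmap()ed. It follows the seqlock read protocol described in the perf_event_open(2) man page. The helper names perf_event_index() and rdpmc_allowed() are hypothetical; they are not PAPI or pmu-tools functions.

```c
#include <stdint.h>
#include <linux/perf_event.h>

/* Compiler barrier between the lock reads and the data read. */
#define barrier() __asm__ volatile("" ::: "memory")

/*
 * Hypothetical helper: return the counter index the kernel currently
 * publishes for this event. 0 means the event cannot be read with rdpmc
 * right now (for example because it is stopped).
 */
uint32_t perf_event_index(const volatile struct perf_event_mmap_page *pc)
{
    uint32_t seq, idx;

    do {
        seq = pc->lock;      /* sequence count before the read */
        barrier();
        idx = pc->index;
        barrier();
    } while (pc->lock != seq); /* retry if the kernel updated the page */

    return idx;
}

/*
 * Hypothetical helper: nonzero if userspace rdpmc is enabled for this
 * event and the index is valid. This check and any later rdpmc are two
 * separate steps.
 */
int rdpmc_allowed(const volatile struct perf_event_mmap_page *pc)
{
    return pc->cap_user_rdpmc && perf_event_index(pc) != 0;
}
```

A caller would skip the rdpmc path and fall back to read() on the event fd whenever rdpmc_allowed() returns 0.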
That's not race-free; the task can get migrated to an E-core the moment
after you've done the load and before the rdpmc instruction.
I suppose you can wrap the whole thing in RSEQ though, it's a bit of a
pain, but RSEQ can be configured to abort on migration.
Recent glibc (2.35+) should have rseq registered by default; with
older versions the program will have to register it itself -- there is example code in
tools/testing/selftests/rseq but also
https://git.kernel.org/pub/scm/libs/librseq/librseq.git
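
As a rough sketch of the first step only: checking whether glibc has already registered an rseq area for the current thread, and reading the CPU number the kernel publishes there. This assumes glibc 2.35 or later (for <sys/rseq.h>, __rseq_offset and __rseq_size) and a compiler that provides __builtin_thread_pointer(). The function names are hypothetical. This is not an rseq critical section; the abort-on-migration part needs the assembly sequences shown in the selftests and in librseq.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/rseq.h>   /* glibc 2.35+: struct rseq, __rseq_offset, __rseq_size */

/* Hypothetical helper: nonzero if glibc registered rseq for this thread. */
int rseq_registered(void)
{
    return __rseq_size != 0;
}

/* The thread's rseq area lives at a fixed offset from the thread pointer. */
struct rseq *thread_rseq_area(void)
{
    return (struct rseq *)((char *)__builtin_thread_pointer() + __rseq_offset);
}

/*
 * Hypothetical helper: CPU number the kernel last published for this
 * thread, or -1 if rseq is not registered. The value can be stale as
 * soon as it is returned; only code inside an rseq critical section
 * is aborted on migration.
 */
int rseq_current_cpu(void)
{
    if (!rseq_registered())
        return -1;
    return (int)*(volatile uint32_t *)&thread_rseq_area()->cpu_id;
}
```

If rseq_registered() returns 0 (older glibc, or registration disabled), the program would have to issue the rseq system call itself, as the selftests do.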