Re: perf: rdpmc bug when viewing all procs on remote cpu

From: Vince Weaver
Date: Fri Jan 18 2019 - 12:24:29 EST

On Fri, 18 Jan 2019, Peter Zijlstra wrote:
> You can actually use rdpmc when you attach to a CPU, but you have to
> ensure that the userspace component is guaranteed to run on that very
> CPU (sched_setaffinity(2) comes to mind).

unfortunately the HPC people using PAPI would probably be annoyed if we
started binding their threads to cores out from under them.
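
For reference, the pinning being suggested would look roughly like the
sketch below. This is not PAPI code; pin_to_cpu() is a hypothetical
helper name, and it assumes Linux with glibc.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>

/* Hypothetical helper (not part of PAPI or perf): pin the calling
 * thread to a single CPU so a later rdpmc runs on the CPU the event
 * was opened on. Returns 0 on success, -1 with errno set. */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    /* Reject CPU numbers that cannot be represented in cpu_set_t. */
    if (cpu < 0 || cpu >= CPU_SETSIZE) {
        errno = EINVAL;
        return -1;
    }

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);

    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Doing this silently inside a library is exactly the part users would
object to, since it overrides whatever placement they chose.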

> The best we could possibly do is put the (target, not current) cpu
> number in the mmap page; but userspace should already know this, for it
> created the event and therefore knows this already.

one other thing the kernel could do is just disable rdpmc (setting index
to 0) in the case where the original perf_event_open() cpu parameter != -1

though that would stop the case where we were on the same CPU from
working.

The issue is currently if you're not careful the rdpmc() interface will
sometimes return plausible (but wrong) results for a cross-CPU rdpmc()
call, even if you are properly falling back to read() on ->index being 0.
It's a bit surprising and it looks like it will take a decent amount of
userspace code to work around the issue, which cuts into the low-overhead
nature of rdpmc.
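
A minimal sketch of the kind of userspace check this implies is below.
The helper names are hypothetical (this is not the PAPI implementation),
and it assumes that an event opened with cpu == -1 can trust ->index,
while an event opened on a specific CPU must also confirm that the
caller is running on that CPU.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: decide whether the rdpmc fast path may be used.
 *   index       - ->index from the perf mmap page (0: rdpmc unavailable)
 *   event_cpu   - cpu argument given to perf_event_open() (-1: any CPU)
 *   current_cpu - CPU the caller is running on right now
 */
static bool rdpmc_allowed(uint32_t index, int event_cpu, int current_cpu)
{
    if (index == 0)
        return false;           /* kernel says: fall back to read() */
    if (event_cpu == -1)
        return true;            /* event is not bound to a CPU (assumed ok) */
    return event_cpu == current_cpu;
}

/* Same check using the CPU reported by sched_getcpu(). */
static bool rdpmc_allowed_now(uint32_t index, int event_cpu)
{
    int cpu = sched_getcpu();

    if (cpu < 0)
        return false;           /* cannot tell where we are: use read() */
    return rdpmc_allowed(index, event_cpu, cpu);
}
```

Note that the thread can still migrate between this check and the rdpmc
instruction itself unless it is pinned, so the check narrows the window
rather than closing it, and the extra sched_getcpu() call is added to
every read.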

If the answer is simply this is the way the kernel is going to do it,
that's fine, I just have to add workarounds to PAPI and then get the
perf_event_open() manpage updated to make sure people are aware of the