Re: perf: rdpmc bug when viewing all procs on remote cpu

From: Peter Zijlstra
Date: Fri Jan 18 2019 - 11:09:54 EST


On Fri, Jan 18, 2019 at 09:09:04AM -0500, Vince Weaver wrote:
> On Fri, 18 Jan 2019, Peter Zijlstra wrote:
>
> > On Fri, Jan 11, 2019 at 04:52:22PM -0500, Vince Weaver wrote:
> > > On Thu, 10 Jan 2019, Vince Weaver wrote:
> > >
> > > > On Thu, 10 Jan 2019, Vince Weaver wrote:
> > > >
> > > > > On Thu, 10 Jan 2019, Vince Weaver wrote:
> > > > >
> > > > > > However if you create an all-process attached to CPU event:
> > > > > > perf_event_open(attr, -1, X, -1, 0);
> > > > > > the mmap event index is set as if this were a valid event and so the rdpmc
> > > > > > succeeds even though it shouldn't (we're trying to read an event value
> > > > > > on a remote cpu with a local rdpmc).
> > >
> > > so on further looking at the code, it doesn't appear that rdpmc events are
> > > explicitly marked as unavailable in the attach-cpu or attach-pid case,
> > > it's just by luck the check for PERF_EVENT_STATE_ACTIVE catches most of
> > > the cases?
> > >
> > > should an explicit check be added to zero out userpg->index in cases where
> > > the event being measured is running on a different core?
> >
> > And how would we know? We don't know what CPU will be observing the
> > mmap().
> >
> > RDPMC fundamentally only makes sense on 'self' (either task or CPU).
>
> so is this a "don't do that then" thing and I should have PAPI
> userspace avoid using rdpmc() whenever the event is attached to a proc/cpu?

You can actually use rdpmc when you attach to a CPU, but you have to
ensure that the userspace component is guaranteed to run on that very
CPU (sched_setaffinity(2) comes to mind).
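
A minimal sketch of that pinning step, assuming a Linux/glibc userspace.
pin_to_cpu() is a hypothetical helper name, not anything from perf or
PAPI; the cpu argument is assumed to be the same number that was passed
to perf_event_open(). It only restricts the calling thread, so every
thread that issues rdpmc would need to call it.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>

/*
 * Hypothetical helper: restrict the calling thread to a single CPU so
 * that a later rdpmc reads the counters of that CPU.
 * Returns 0 on success, -1 on failure (errno is set for bad input or
 * when sched_setaffinity() fails).
 */
int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	if (cpu < 0 || cpu >= CPU_SETSIZE) {
		errno = EINVAL;
		return -1;
	}

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);

	/* pid 0 means the calling thread. */
	if (sched_setaffinity(0, sizeof(set), &set) != 0)
		return -1;

	/* The thread has been migrated by now; double-check anyway. */
	return sched_getcpu() == cpu ? 0 : -1;
}
```

Calling pin_to_cpu(X) before the first rdpmc on an event opened with
perf_event_open(attr, -1, X, -1, 0) is the intended use. Nothing stops
another process from changing the affinity afterwards, so this is a
precondition the tool sets up, not something the kernel enforces.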

> Or is there a way to have the kernel indicate this? Does the kernel track
> current CPU and original CPU of the mmap and could zero out the index
> field in this case? Or would that add too much overhead?

Impossible, I'm afraid. Memory is not associated with a CPU; its
contents are the same whichever CPU reads them.

The best we could possibly do is put the (target, not current) cpu
number in the mmap page; but userspace should already know this, since
it created the event.
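
Since userspace knows the target cpu, it can do the check itself. A
rough sketch, assuming Linux/glibc: struct cpu_event and rdpmc_is_safe()
are hypothetical names for bookkeeping a tool might keep, not part of
any existing API. The idea is to fall back to read(2) unless the calling
thread is confined to exactly the cpu the event was opened on.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdbool.h>

/* Hypothetical bookkeeping kept by whoever opened the event. */
struct cpu_event {
	int fd;         /* returned by perf_event_open() */
	int target_cpu; /* the cpu argument given to perf_event_open() */
};

/*
 * True only if the calling thread may run on target_cpu and on no
 * other CPU. Any failure is treated as "not safe".
 */
bool rdpmc_is_safe(const struct cpu_event *ev)
{
	cpu_set_t set;

	if (ev->target_cpu < 0 || ev->target_cpu >= CPU_SETSIZE)
		return false;

	/* pid 0 means the calling thread. */
	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return false;

	return CPU_COUNT(&set) == 1 && CPU_ISSET(ev->target_cpu, &set);
}
```

Checking the affinity mask rather than sched_getcpu() is deliberate: the
current cpu can change between the check and the rdpmc, while a
single-cpu mask stays valid until someone changes the affinity.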