Re: [RFC 2/3] perf/x86: Control RDPMC access from .enable() hook

From: Rob Herring
Date: Mon Aug 30 2021 - 16:58:45 EST


On Sun, Aug 29, 2021 at 10:06 PM Vince Weaver <vincent.weaver@xxxxxxxxx> wrote:
>
> On Fri, 27 Aug 2021, Andy Lutomirski wrote:
>
> > On Thu, Aug 26, 2021, at 12:09 PM, Rob Herring wrote:
>
> > > After testing some scenarios and finding perf_event_tests[1], this
> > > series isn't going to work for x86 unless rdpmc is restricted to task
> > > events only or allowed to segfault on CPU events when read on the
> > > wrong CPU rather than just returning garbage. It's been discussed
> > > before here[2].
> > >
> > > Ultimately, I'm just trying to define the behavior for arm64 where we
> > > don't have an existing ABI to maintain and don't have to recreate the
> > > mistakes of x86 rdpmc ABI. Tying the access to mmap is messy. As we
> > > explicitly request user access on perf_event_open(), I think it may be
> > > better to just enable access when the event's context is active and
> > > ignore mmap(). Maybe you have an opinion there since you added the
> > > mmap() part?
> >
> > That makes sense to me. The mmap() part was always a giant kludge.
> >
> > There is fundamentally a race, at least if rseq isn’t used: if you check
> > that you’re on the right CPU, do RDPMC, and throw out the result if you
> > were on the wrong CPU (determined by looking at the mmap), you still
> > would very much prefer not to fault.
> >
> > Maybe rseq or a vDSO helper is the right solution for ARM.

There was an earlier version of this work using rseq[1]. AIUI, that
would solve the problem of reading from the wrong CPU. I don't think
using rseq would change the kernel implementation other than whether
we enable events on specific CPUs.
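
For reference, the non-rseq check-and-discard pattern Andy describes
is basically the seqlock read documented for struct
perf_event_mmap_page. A rough sketch (x86 flavour, leaving out the
pmc_width sign-extension and the time_enabled/time_running handling):

#include <stdint.h>
#include <linux/perf_event.h>

static inline uint64_t rdpmc(uint32_t counter)
{
	uint32_t lo, hi;

	asm volatile("rdpmc" : "=a" (lo), "=d" (hi) : "c" (counter));
	return lo | (uint64_t)hi << 32;
}

/*
 * Read a counter via the mmap'd page. Returns -1 if user access isn't
 * available or the event is not currently scheduled here
 * (pc->index == 0), in which case the caller falls back to read() --
 * that's the "throw out the result" case above.
 */
static int mmap_read_self(struct perf_event_mmap_page *pc, uint64_t *val)
{
	uint32_t seq, idx;
	uint64_t count;

	do {
		seq = pc->lock;
		asm volatile("" ::: "memory");	/* rmb */

		idx = pc->index;
		count = pc->offset;
		if (!pc->cap_user_rdpmc || !idx)
			return -1;
		count += rdpmc(idx - 1);

		asm volatile("" ::: "memory");	/* rmb */
	} while (pc->lock != seq);

	*val = count;
	return 0;
}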

> as the author of those perf_event tests for rdpmc, I have to say if ARM
> comes up with a cleaner implementation I'd be glad to have x86 transition
> to something better.

Thanks for chiming in.

My plan is to be more restrictive about what works, and to fail or
disable user access for anything that's not supported. Unless I hear
that user access for events on specific CPUs is really important,
that means only monitoring of a thread across all (for big.LITTLE,
all homogeneous) CPUs is supported. That doesn't require a
better/cleaner interface; it just means cpu must be -1 in
perf_event_open() if you want rdpmc. The difference on Arm is just
that we can enforce/indicate that.
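
Concretely, the supported case from userspace would look something
like the sketch below: pid == 0, cpu == -1, mmap one page for the
counter metadata. How user access is explicitly requested in
perf_event_attr is whatever this series ends up defining, so that is
only a placeholder comment here:

#define _GNU_SOURCE
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

static struct perf_event_mmap_page *open_self_counter(void)
{
	struct perf_event_attr attr;
	void *page;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	/* + whatever attr bit this series uses to request user access */

	/* pid == 0: this thread; cpu == -1: on whatever CPU it runs */
	fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
	if (fd < 0)
		return NULL;

	/* One read-only page is enough for the counter metadata. */
	page = mmap(NULL, sysconf(_SC_PAGESIZE), PROT_READ, MAP_SHARED, fd, 0);
	return page == MAP_FAILED ? NULL : page;
}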

We could also enable user access for CPU events, but abort if they
are read on the wrong CPU. In that case the user either has to
control the thread affinity or possibly use rseq.
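
Controlling the affinity is the easy part, e.g. pinning the thread
before touching a CPU-bound event (sketch only, no error handling):

#define _GNU_SOURCE
#include <sched.h>

/* Pin the calling thread so a CPU-bound event can't be read after a
 * migration; rseq would let you detect and retry instead of pinning. */
static int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(0, sizeof(set), &set);
}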

> The rdpmc code is a huge mess and has all kinds of corner cases. I'm not
> sure anyone besides the PAPI library tries to use it, and while it's a
> nice performance improvement to use rdpmc it is really hard to get things
> working right.

Yes, I've been reading through the bugs you reported and the related
tests. I just wish I'd found them sooner...

> As a PAPI developer we actually have run into the issue where the CPU
> switches and we were reporting the wrong results. Also if I recall (it's
> been a while) we were having issues where the setup lets you attach to a
> process on another CPU for monitoring using the rdpmc interface and it
> returns results even though I think that will rarely ever work in
> practice.

Returning the wrong results is obviously bad for the user, but making
that "work" also complicates the kernel implementation.

Rob

[1] https://x-lore.kernel.org/all/20190611125315.18736-4-raphael.gault@xxxxxxx/