Re: [RFC] perf: Allow fine-grained PMU access control

From: Peter Zijlstra
Date: Tue May 22 2018 - 07:38:13 EST


On Tue, May 22, 2018 at 10:29:29AM +0100, Tvrtko Ursulin wrote:
>
> On 22/05/18 10:05, Peter Zijlstra wrote:
> > On Mon, May 21, 2018 at 10:25:49AM +0100, Tvrtko Ursulin wrote:
> > > From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > >
> > > For situations where sysadmins might want to allow different levels
> > > of access control for different PMUs, we start creating per-PMU
> > > perf_event_paranoid controls in sysfs.
> >
> > Could you explain how exactly this makes sense?
> >
> > For example, how does it make sense for one PMU to reveal kernel data
> > while another PMU is not allowed to?
> >
> > Once you allow one PMU to do so, the secret is out.
> >
> > So please explain, in excruciating detail, how you want to use this and
> > how exactly that makes sense from a security pov.
>
> Not sure it will be excruciating but will try to explain once again.
>
> There are two things:
>
> 1. i915 PMU which exports data such as different engine busyness levels.
> (Perhaps you remember, you helped us implement this from the perf API
> angle.)

Right, but I had completely forgotten all of that again, so thanks for
the reminder.

> 2. Customers who want to look at those stats in production.
>
> They want to use it to answer questions such as:
>
> a) How loaded is my server and can it take one more of X type of job?
> b) What is the least utilised video engine to submit the next packet of work
> to?
> c) What is the least utilised server to schedule the next transcoding job
> on?

On the other hand, do those counters provide enough information for a
side-channel (timing) attack on GPGPU workloads? Because, as you say, it
is a shared resource. So if user A is doing GPGPU crypto, and user B is
observing, might he infer things from the counters?

> Current option for them is to turn off the global paranoid setting which
> then enables unprivileged access to _all_ PMU providers.

Right.

> To me it sounded quite logical that it would be better for the paranoid knob
> to be more fine-grained, so that they can configure their servers so only
> access to needed data is possible.

The proposed semantics are a tad awkward though; the moment you prod at
the sysctl you lose all individual PMU settings. Ideally the per-PMU
knob would have a special setting that says follow-global, in addition
to the existing ones.
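
To make the follow-global idea concrete, here is a minimal userspace
sketch. It is not the patch under discussion; the sentinel
PMU_PARANOID_FOLLOW_GLOBAL, struct pmu_model and effective_paranoid()
are all made-up names for illustration only.

```c
#include <limits.h>

/* Hypothetical sentinel: "no per-PMU override, use the global sysctl". */
#define PMU_PARANOID_FOLLOW_GLOBAL INT_MIN

/* Hypothetical stand-in for a PMU with its own paranoid knob. */
struct pmu_model {
	const char *name;
	int paranoid;	/* -1..2, or PMU_PARANOID_FOLLOW_GLOBAL */
};

/*
 * Resolve the level that applies to one PMU. Writing the global
 * sysctl never touches pmu->paranoid, so an explicit per-PMU value
 * survives later changes to the global setting.
 */
int effective_paranoid(const struct pmu_model *pmu, int global_level)
{
	if (pmu->paranoid == PMU_PARANOID_FOLLOW_GLOBAL)
		return global_level;
	return pmu->paranoid;
}
```

With this shape, a PMU left at the sentinel tracks whatever the global
knob says, and only PMUs the admin explicitly configured diverge.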

> I am not sure what you mean by "Once you allow one PMU to do so, the
> secret is out."? What secret? Are you implying that enabling unprivileged
> access to i915 engine busyness data opens up access to CPU PMUs as well via
> some side channel?

It was not i915 specific; but if you look at the descriptions:

* perf event paranoia level:
* -1 - not paranoid at all
* 0 - disallow raw tracepoint access for unpriv
* 1 - disallow cpu events for unpriv
* 2 - disallow kernel profiling for unpriv

Then the moment you allow some data to escape, it cannot be put back.
i915 is fairly special in that (afaict) it doesn't leak kernel-specific
data.

In general I think allowing access to uncore PMUs will leak kernel data.
Thus in general I'm fairly wary of all this.
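
For reference, the paranoia levels quoted above can be read as simple
thresholds on a single integer. The following is a userspace sketch
under that reading only; the function and enum names are hypothetical
and the level is passed in as a parameter rather than read from the
sysctl.

```c
#include <stdbool.h>

/* Each predicate is true when the level forbids that class of access
 * for unprivileged users, following the quoted descriptions. */
bool paranoid_tracepoint_raw(int level) { return level > -1; }
bool paranoid_cpu(int level)            { return level > 0; }
bool paranoid_kernel(int level)         { return level > 1; }

/* Hypothetical classification of what an unprivileged user asks for. */
enum request { REQ_RAW_TRACEPOINT, REQ_CPU_EVENT, REQ_KERNEL_PROFILE };

/* Would an unprivileged request be allowed at this paranoia level? */
bool unpriv_allowed(int level, enum request req)
{
	switch (req) {
	case REQ_RAW_TRACEPOINT:
		return !paranoid_tracepoint_raw(level);
	case REQ_CPU_EVENT:
		return !paranoid_cpu(level);
	case REQ_KERNEL_PROFILE:
		return !paranoid_kernel(level);
	}
	return false;	/* unknown request: deny */
}
```

The levels are cumulative: each higher value forbids everything the
lower ones forbid plus one more class, which is why they describe
classes of data rather than individual PMUs.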

Is there no other way to expose this information? Can't we do a
traditional load-avg like thing for the GPU?
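
As a rough illustration of what a load-avg like thing could look like,
here is a fixed-point exponentially decaying average in the style of
the CPU load average. This is a sketch only: gpu_calc_load() is a
made-up name, the busyness sample is assumed to be scaled so that
FIXED_1 means 100% busy, and the rounding that the scheduler's version
applies is left out.

```c
/* Fixed-point constants as used for the 1-minute CPU load average. */
#define FSHIFT	11			/* bits of fractional precision */
#define FIXED_1	(1UL << FSHIFT)		/* 1.0 in fixed point */
#define EXP_1	1884UL			/* 1/exp(5sec/1min) in fixed point */

/*
 * Fold one busyness sample into the running average:
 *   load = load * exp + busy * (1 - exp)
 * 'busy' is assumed to lie in 0..FIXED_1 (0% to 100% busy).
 */
unsigned long gpu_calc_load(unsigned long load, unsigned long exp,
			    unsigned long busy)
{
	unsigned long newload = load * exp + busy * (FIXED_1 - exp);

	return newload >> FSHIFT;
}
```

Sampled every 5 seconds with EXP_1, a fully busy engine would pull the
average towards FIXED_1 and an idle one would let it decay towards 0,
without exposing the fine-grained counters themselves.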