Re: [PATCH] perf/x86: Reset the counter to prevent the leak for a RDPMC task

From: Liang, Kan
Date: Fri Jul 31 2020 - 14:08:41 EST




On 7/30/2020 12:44 PM, peterz@xxxxxxxxxxxxx wrote:
On Thu, Jul 30, 2020 at 11:54:35AM -0400, Liang, Kan wrote:
On 7/30/2020 8:58 AM, peterz@xxxxxxxxxxxxx wrote:
On Thu, Jul 30, 2020 at 05:38:15AM -0700, kan.liang@xxxxxxxxxxxxxxx wrote:
From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>

The counter value of a perf task may leak to another RDPMC task.

Sure, but nowhere did you explain why that is a problem.

The RDPMC instruction is only available on the X86 platform, so the fix is
only applied to X86.

ARM64 can also do it, although I'm not sure what the current state of
things is here.

After applying the patch,

$ taskset -c 0 ./rdpmc_read_all_counters
index 0x0 value 0x0
index 0x1 value 0x0
index 0x2 value 0x0
index 0x3 value 0x0

index 0x0 value 0x0
index 0x1 value 0x0
index 0x2 value 0x0
index 0x3 value 0x0
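(For reference, a minimal user-space sketch of what a test like
rdpmc_read_all_counters might do; the actual test program is not part of
this mail, and the counter count, output format, and the assumption that
CR4.PCE is already set for the task -- e.g. via an RDPMC-enabled perf mmap
or the rdpmc sysfs knob -- are mine.)

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

static inline uint64_t rdpmc(uint32_t index)
{
	uint32_t lo, hi;

	/* RDPMC reads the counter selected by ECX into EDX:EAX */
	asm volatile("rdpmc" : "=a" (lo), "=d" (hi) : "c" (index));
	return ((uint64_t)hi << 32) | lo;
}

int main(void)
{
	/* Assume 4 general-purpose counters, matching the output above */
	for (uint32_t i = 0; i < 4; i++)
		printf("index 0x%x value 0x%" PRIx64 "\n", i, rdpmc(i));
	return 0;
}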

You forgot about:

- telling us why it's a problem,

A non-privileged RDPMC user can read counter values left behind by other
perf users. It is a security issue. I will add the explanation in the next version.

You don't know what it counted and you don't know the offset, what can
you do with it?

We cannot guarantee that an attacker doesn't know what the other thread is doing. Once they know the event name, they may take advantage of the perfmon counters to attack a cryptosystem.

Here is one paper I googled. https://dl.acm.org/doi/pdf/10.1145/3156015
It mentions that some events, e.g., cache misses and branch misses, can be used as a side channel to attack cryptosystems.

There are potential security issues.

- telling us how badly it affects performance.

I once did a performance test on an HSX machine. There was no notable slowdown
with the patch. I will add the performance data in the next version.

It's still up to [4..8]+[3,4] extra WRMSRs per context switch, that's pretty naf.

I will do more performance tests on an ICL machine with all GP counters and fixed counters enabled.


I would feel much better if we only did this on context switches to
tasks that have RDPMC enabled.

AFAIK, at least for X86, we can only enable/disable RDPMC globally.
How can we know whether a specific task has RDPMC enabled or disabled?

It has mm->context.perf_rdpmc_allowed non-zero, go read x86_pmu_event_{,un}mapped().
Without that CR4.PCE is 0 and RDPMC won't work, which is most of the
actual tasks.
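
(For illustration, a minimal sketch -- not the actual patch -- of how the
sched-in path could test whether the incoming task can use RDPMC at all,
based on the state maintained by x86_pmu_event_mapped()/x86_pmu_event_unmapped();
the helper name rdpmc_enabled_for() is made up.)

static bool rdpmc_enabled_for(struct task_struct *task)
{
	struct mm_struct *mm = task->mm;

	/*
	 * x86_pmu_event_mapped() increments perf_rdpmc_allowed and sets
	 * CR4.PCE for this mm; without it RDPMC faults in user space,
	 * so stale counter values cannot be observed that way.
	 */
	return mm && atomic_read(&mm->context.perf_rdpmc_allowed);
}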


Thanks for pointing it out.

I think I can use event->mmap_count and PERF_X86_EVENT_RDPMC_ALLOWED to check whether the task's events have RDPMC enabled.
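(Roughly something like the following; the helper name is made up. The flag
says RDPMC is permitted for the event, and mmap_count says user space has
actually mapped its user page.)

static bool event_rdpmc_enabled(struct perf_event *event)
{
	return atomic_read(&event->mmap_count) &&
	       (event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED);
}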

Arguably we should have perf_mmap_open() check if 'event->hw.target ==
current', because without that RDPMC is still pointless.
So on del() mark the counter dirty (if we don't already have state that
implies this), but don't WRMSR. And then on
__perf_event_task_sched_in(), _after_ programming the new task's
counters, check for inactive dirty counters and wipe those -- IFF RDPMC
is on for that task.
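
(A rough sketch of that scheme, not the posted patch: the dirty_counters
bitmap in struct cpu_hw_events and the helper names are assumptions made
for illustration.)

/*
 * Hypothetical new field in struct cpu_hw_events:
 *	unsigned long	dirty_counters[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
 */

static void x86_pmu_mark_dirty(struct cpu_hw_events *cpuc, int idx)
{
	/* x86_pmu_del() would call this instead of writing the MSR */
	__set_bit(idx, cpuc->dirty_counters);
}

static void x86_pmu_wipe_dirty_counters(struct cpu_hw_events *cpuc)
{
	int idx;

	/* Only counters that are dirty *and* currently unused need a wipe */
	for_each_set_bit(idx, cpuc->dirty_counters, X86_PMC_IDX_MAX) {
		if (test_bit(idx, cpuc->active_mask))
			continue;

		if (idx >= INTEL_PMC_IDX_FIXED)
			wrmsrl(MSR_ARCH_PERFMON_FIXED_CTR0 +
			       (idx - INTEL_PMC_IDX_FIXED), 0);
		else
			wrmsrl(x86_pmu_event_addr(idx), 0);
	}

	bitmap_zero(cpuc->dirty_counters, X86_PMC_IDX_MAX);
}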


The generic code doesn't have the counters' information. It looks like we
need to add a new callback to clean up the dirty counters, sketched below.

In the PMU-specific implementation of pmu_cleanup(), we can check and wipe
all inactive dirty counters.
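
(For illustration only; the diff from the original mail is not reproduced
here, and the callback name and signature are assumptions. It reuses the
wipe helper sketched above.)

/* New optional callback in struct pmu (include/linux/perf_event.h): */
	/* Wipe leftover counter state the previous task left behind */
	void (*pmu_cleanup)		(struct pmu *pmu);

/* Possible x86 implementation: */
static void x86_pmu_cleanup(struct pmu *pmu)
{
	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);

	/* Nothing to hide unless the incoming task can execute RDPMC */
	if (!current->mm ||
	    !atomic_read(&current->mm->context.perf_rdpmc_allowed))
		return;

	x86_pmu_wipe_dirty_counters(cpuc);
}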

What about pmu::sched_task(), can't we rejig that a little?

The way I'm reading it now, it's like we iterate the task context for
calling perf_event_context_sched_*(), and then iterate a cpuctx list to
find cpuctx->task_ctx, which would be the exact same contexts we've just
iterated.

So can't we pull the pmu::sched_task() call into
perf_event_context_sched_*() ? That would save a round of
pmu_disable/enable() too afaict.
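
(Heavily simplified sketch of that idea; the real perf_event_context_sched_in()
takes more arguments and locking, and this just shows where the sched_task()
call would move.)

static void perf_event_context_sched_in(struct perf_event_context *ctx,
					struct task_struct *task)
{
	struct perf_cpu_context *cpuctx = __get_cpu_context(ctx);
	struct pmu *pmu = ctx->pmu;

	perf_ctx_lock(cpuctx, ctx);
	perf_pmu_disable(pmu);

	/* ... schedule the task's events in ... */

	/*
	 * Call sched_task() right here, inside the disable/enable section
	 * we already pay for, instead of walking sched_cb_list separately
	 * with a second pmu_disable()/pmu_enable() pair.
	 */
	if (pmu->sched_task)
		pmu->sched_task(ctx, true);

	perf_pmu_enable(pmu);
	perf_ctx_unlock(cpuctx, ctx);
}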


I think it's doable. I will do more tests.

Thanks,
Kan