Re: [PATCH 3/7] perf: arm64: Use rseq to test userspace access to pmu counters

From: Raphael Gault
Date: Thu Jun 13 2019 - 12:42:48 EST


Hi Mathieu,

On 6/11/19 8:33 PM, Mathieu Desnoyers wrote:
----- On Jun 11, 2019, at 6:57 PM, Mark Rutland mark.rutland@xxxxxxx wrote:

Hi Arnaldo,

On Tue, Jun 11, 2019 at 11:33:46AM -0300, Arnaldo Carvalho de Melo wrote:
Em Tue, Jun 11, 2019 at 01:53:11PM +0100, Raphael Gault escreveu:
Add an extra test to check userspace access to pmu hardware counters.
This test doesn't rely on the seqlock as a synchronisation mechanism but
instead uses the restartable sequences to make sure that the thread is
not interrupted when reading the index of the counter and the associated
pmu register.

In addition to reading the pmu counters, this test is run several time
in order to measure the ratio of failures:
I ran this test on the Juno development platform, which is big.LITTLE
with 4 Cortex A53 and 2 Cortex A57. The results vary quite a lot
(running it with 100 tests is not so long and I did it several times).
I ran it once with 10000 iterations:
`runs: 10000, abort: 62.53%, zero: 34.93%, success: 2.54%`

Signed-off-by: Raphael Gault <raphael.gault@xxxxxxx>
---
tools/perf/arch/arm64/include/arch-tests.h | 5 +-
tools/perf/arch/arm64/include/rseq-arm64.h | 220 ++++++++++++++++++

So, I applied the first patch in this series, but could you please break
this patch into at least two, one introducing the facility
(include/rseq*) and the second adding the test?

We try to enforce this kind of granularity as down the line we may want
to revert one part while the other already has other uses and thus
wouldn't allow a straight revert.

Also, can this go to tools/arch/ instead? Is this really perf specific?
Isn't there any arch/arm64/include files for the kernel that we could
mirror and have it checked for drift in tools/perf/check-headers.sh?

The rseq bits aren't strictly perf specific, and I think the existing
bits under tools/testing/selftests/rseq/ could be factored out to common
locations under tools/include/ and tools/arch/*/include/.

Hi Mark,

Thanks for CCing me!

Or into a stand-alone librseq project:

https://github.com/compudj/librseq (currently a development branch in
my own github)

I don't see why this user-space code should sit in the kernel tree.
It is not tooling-specific.


From a scan, those already duplicate barriers and other helpers which
already have definitions under tools/, which seems unfortunate. :/

Comments below are for Raphael and Matthieu.

[...]

+static u64 noinline mmap_read_self(void *addr, int cpu)
+{
+ struct perf_event_mmap_page *pc = addr;
+ u32 idx = 0;
+ u64 count = 0;
+
+ asm volatile goto(
+ RSEQ_ASM_DEFINE_TABLE(0, 1f, 2f, 3f)
+ "nop\n"
+ RSEQ_ASM_STORE_RSEQ_CS(1, 0b, rseq_cs)
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f)
+ RSEQ_ASM_OP_R_LOAD(pc_idx)
+ RSEQ_ASM_OP_R_AND(0xFF)
+ RSEQ_ASM_OP_R_STORE(idx)
+ RSEQ_ASM_OP_R_SUB(0x1)
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f)
+ "msr pmselr_el0, " RSEQ_ASM_TMP_REG "\n"
+ "isb\n"
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f)

I really don't understand why the cpu_id needs to be compared 3 times
here (?!?)

Explicit comparison of the cpu_id within the rseq critical section
should be done _once_.


I understand and that's what I thought as well but I got confused with a comment in (src)/include/uapi/linux/rseq.h which states:
> This CPU number value should always be compared
> against the value of the cpu_id field before performing a rseq
> commit or returning a value read from a data structure indexed
> using the cpu_id_start value.

I'll remove the unnecessary testing.


If the kernel happens to preempt and migrate the thread while in the
critical section, it's the kernel's job to move user-space execution
to the abort handler.

[...]

Thanks,

--
Raphael Gault