Re: [PATCH 3/7] perf: arm64: Use rseq to test userspace access to pmu counters

From: Mathieu Desnoyers
Date: Thu Jun 13 2019 - 12:35:55 EST


----- On Jun 13, 2019, at 9:10 AM, Raphael Gault raphael.gault@xxxxxxx wrote:

> Hi Mathieu,
>
> On 6/11/19 8:33 PM, Mathieu Desnoyers wrote:
>> ----- On Jun 11, 2019, at 6:57 PM, Mark Rutland mark.rutland@xxxxxxx wrote:
>>
>>> Hi Arnaldo,
>>>
>>> On Tue, Jun 11, 2019 at 11:33:46AM -0300, Arnaldo Carvalho de Melo wrote:
>>>> Em Tue, Jun 11, 2019 at 01:53:11PM +0100, Raphael Gault escreveu:
>>>>> Add an extra test to check userspace access to pmu hardware counters.
>>>>> This test doesn't rely on the seqlock as a synchronisation mechanism but
>>>>> instead uses the restartable sequences to make sure that the thread is
>>>>> not interrupted when reading the index of the counter and the associated
>>>>> pmu register.
>>>>>
>>>>> In addition to reading the pmu counters, this test is run several time
>>>>> in order to measure the ratio of failures:
>>>>> I ran this test on the Juno development platform, which is big.LITTLE
>>>>> with 4 Cortex A53 and 2 Cortex A57. The results vary quite a lot
>>>>> (running it with 100 tests is not so long and I did it several times).
>>>>> I ran it once with 10000 iterations:
>>>>> `runs: 10000, abort: 62.53%, zero: 34.93%, success: 2.54%`
>>>>>
>>>>> Signed-off-by: Raphael Gault <raphael.gault@xxxxxxx>
>>>>> ---
>>>>> tools/perf/arch/arm64/include/arch-tests.h | 5 +-
>>>>> tools/perf/arch/arm64/include/rseq-arm64.h | 220 ++++++++++++++++++
>>>>
>>>> So, I applied the first patch in this series, but could you please break
>>>> this patch into at least two, one introducing the facility
>>>> (include/rseq*) and the second adding the test?
>>>>
>>>> We try to enforce this kind of granularity as down the line we may want
>>>> to revert one part while the other already has other uses and thus
>>>> wouldn't allow a straight revert.
>>>>
>>>> Also, can this go to tools/arch/ instead? Is this really perf specific?
>>>> Isn't there any arch/arm64/include files for the kernel that we could
>>>> mirror and have it checked for drift in tools/perf/check-headers.sh?
>>>
>>> The rseq bits aren't strictly perf specific, and I think the existing
>>> bits under tools/testing/selftests/rseq/ could be factored out to common
>>> locations under tools/include/ and tools/arch/*/include/.
>>
>> Hi Mark,
>>
>> Thanks for CCing me!
>>
>> Or into a stand-alone librseq project:
>>
>> https://github.com/compudj/librseq (currently a development branch in
>> my own github)
>>
>> I don't see why this user-space code should sit in the kernel tree.
>> It is not tooling-specific.
>>
>>>
>>> From a scan, those already duplicate barriers and other helpers which
>>> already have definitions under tools/, which seems unfortunate. :/
>>>
>>> Comments below are for Raphael and Matthieu.
>>>
>>> [...]
>>>
>>>>> +static u64 noinline mmap_read_self(void *addr, int cpu)
>>>>> +{
>>>>> + struct perf_event_mmap_page *pc = addr;
>>>>> + u32 idx = 0;
>>>>> + u64 count = 0;
>>>>> +
>>>>> + asm volatile goto(
>>>>> + RSEQ_ASM_DEFINE_TABLE(0, 1f, 2f, 3f)
>>>>> + "nop\n"
>>>>> + RSEQ_ASM_STORE_RSEQ_CS(1, 0b, rseq_cs)
>>>>> + RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f)
>>>>> + RSEQ_ASM_OP_R_LOAD(pc_idx)
>>>>> + RSEQ_ASM_OP_R_AND(0xFF)
>>>>> + RSEQ_ASM_OP_R_STORE(idx)
>>>>> + RSEQ_ASM_OP_R_SUB(0x1)
>>>>> + RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f)
>>>>> + "msr pmselr_el0, " RSEQ_ASM_TMP_REG "\n"
>>>>> + "isb\n"
>>>>> + RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f)
>>
>> I really don't understand why the cpu_id needs to be compared 3 times
>> here (?!?)
>>
>> Explicit comparison of the cpu_id within the rseq critical section
>> should be done _once_.
>>
>
> I understand and that's what I thought as well but I got confused with a
> comment in (src)/include/uapi/linux/rseq.h which states:
> > This CPU number value should always be compared
> > against the value of the cpu_id field before performing a rseq
> > commit or returning a value read from a data structure indexed
> > using the cpu_id_start value.
>
> I'll remove the unnecessary testing.

It needs to be compared at least once, yes. But once is enough.

Thanks,

Mathieu

>
>
>> If the kernel happens to preempt and migrate the thread while in the
>> critical section, it's the kernel's job to move user-space execution
>> to the abort handler.
>>
> [...]
>
> Thanks,
>
> --
> Raphael Gault

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com