Re: [RFC PATCH for 4.15 v12 00/22] Restartable sequences and CPU op vector

From: Mathieu Desnoyers
Date: Thu Nov 23 2017 - 19:03:43 EST


----- On Nov 23, 2017, at 6:38 PM, Thomas Gleixner tglx@xxxxxxxxxxxxx wrote:

> On Thu, 23 Nov 2017, Mathieu Desnoyers wrote:
>> ----- On Nov 23, 2017, at 5:51 PM, Thomas Gleixner tglx@xxxxxxxxxxxxx wrote:
>> > On Thu, 23 Nov 2017, Mathieu Desnoyers wrote:
>> >> ----- On Nov 22, 2017, at 2:37 PM, Will Deacon will.deacon@xxxxxxx wrote:
>> >> > On Wed, Nov 22, 2017 at 08:32:19PM +0100, Peter Zijlstra wrote:
>> >> >>
>> >> >> So what exactly is the problem of leaving out the whole cpu_opv thing
>> >> >> for now? Pure rseq is usable -- albeit a bit cumbersome without
>> >> >> additional debugger support.
>> >> >
>> >> > Drive-by "ack" to that. I'd really like a working rseq implementation in
>> >> > mainline, but I don't much care for another interpreter.
>> >>
>> >> Considering the arm64 use-case of reading PMU counters from user-space
>> >> using rseq to prevent migration, I understand that you're lucky enough to
>> >> already have a system call at your disposal that can perform the slow-path
>> >> in case of single-stepping.
>> >>
>> >> So yes, your particular case is already covered, but unfortunately that's
>> >> not the same situation for other use-cases that have been expressed.
>> >
>> > If we have users of rseq which can do without the other muck, then what's
>> > the reason not to support it?
>> >
>> > The sysops thing can be sorted out on top and the use cases which need both
>> > will have to test for both syscalls being available anyway.
>>
>> I'm currently making sure CONFIG_RSEQ selects both CONFIG_CPU_OPV and
>> CONFIG_MEMBARRIER, so the user-space fast-paths don't end up with
>> various ways of doing the fallback/single-stepping/memory barrier handling
>> depending on whether the kernel supports each of those individually.
>> So first of all, it reduces complexity from a user-space perspective.
>>
>> Moreover, with a single already needed cpu_id vs cpu_id_start field comparison
>> in the rseq fast-path, user-space knows that it can rely on having rseq,
>> cpu_opv, and membarrier. Without this guarantee, user-space would have to
>> detect individually whether each of those system calls is available, and
>> test flags on the fast-path, for additional overhead.
>
> You have to test for sys_rseq somewhere in the init code. So you can test
> for the other two being fully functional as well.
>
> If one of them is missing then you can avoid that rseq fastpath either
> completely or because you never registered that rseq muck your rseq will
> just contain stale init data which kicks you into some slowpath fallback
> code.

That would work if we could have more than one rseq TLS entry per thread.
If that were the case, then e.g. lttng-ust could own its own rseq
TLS and do just as you explain above.

It's not the case with the current proposal. This means multiple user
libraries will have to share the same cpu_id and cpu_id_start fields,
which breaks the new-app/old-kernel backward compatibility check you
propose.

For instance, if glibc librseq.so happily registers rseq (and does not
care about testing for cpu_opv or membarrier availability), then
lttng-ust cannot rely on stale rseq init data to kick it into its
slowpath fallback.
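
To make the conflict concrete, here is a minimal user-space sketch of
it. Everything in it is hypothetical and only models the situation:
struct rseq_sketch, struct kernel_sketch, the lib_a_*/lib_b_* helpers
and the (uint32_t)-1 initial value are assumptions for illustration,
not the proposed ABI or any real library's code. Library A stands in
for a registrant that only cares about rseq, library B for one that
wants to fall back when cpu_opv or membarrier is missing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed "never registered" value for cpu_id (illustration only). */
#define SKETCH_CPU_ID_UNINITIALIZED ((uint32_t)-1)

/* Hypothetical model of the single per-thread rseq area. */
struct rseq_sketch {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
};

/* Hypothetical description of what the running kernel provides. */
struct kernel_sketch {
	bool has_rseq;
	bool has_cpu_opv;
	bool has_membarrier;
};

/* One area per thread, shared by every library in the process. */
static _Thread_local struct rseq_sketch shared_rseq = {
	.cpu_id_start = 0,
	.cpu_id = SKETCH_CPU_ID_UNINITIALIZED,
};

/* Library A: registers whenever rseq exists, never probes cpu_opv. */
static void lib_a_init(const struct kernel_sketch *k, uint32_t cpu)
{
	if (k->has_rseq) {
		/* Stands in for the kernel filling the area once registered. */
		shared_rseq.cpu_id_start = cpu;
		shared_rseq.cpu_id = cpu;
	}
}

/* Library B: registers only if all three facilities are present, and
 * otherwise hopes the untouched init data keeps it on the slow path. */
static void lib_b_init(const struct kernel_sketch *k, uint32_t cpu)
{
	if (k->has_rseq && k->has_cpu_opv && k->has_membarrier) {
		shared_rseq.cpu_id_start = cpu;
		shared_rseq.cpu_id = cpu;
	}
}

/* Library B's fast-path gate: the single field comparison. */
static bool lib_b_fast_path_enabled(void)
{
	return shared_rseq.cpu_id_start == shared_rseq.cpu_id;
}
```

On a model kernel with rseq but without cpu_opv, calling lib_b_init()
alone leaves the gate closed, but as soon as lib_a_init() runs in the
same thread the gate opens for library B as well, even though its
fallback is not available.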

>
> You need something like this anyway unless you plan to ship code which
> cannot run on systems w/o rseq support at all.

My plan is to ensure that testing for

(TLS::rseq->cpu_id_start == TLS::rseq->cpu_id)

is enough for fast-paths to guarantee that:

- rseq is available and registered for the current thread,
- cpu_opv is available as fallback,
- membarrier private_expedited and shared_expedited are available.
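
As a rough sketch of what that single comparison could look like from
user-space, assuming the unregistered area holds cpu_id_start = 0 and
cpu_id = (uint32_t)-1 (an assumed initial value), and using a
hypothetical struct rseq_sketch in place of the real TLS area:

```c
#include <stdint.h>

/* Assumed "never registered" value for cpu_id (illustration only). */
#define SKETCH_CPU_ID_UNINITIALIZED ((uint32_t)-1)

/* Hypothetical stand-in for the per-thread rseq area; only the two
 * fields used by the comparison are modelled. */
struct rseq_sketch {
	uint32_t cpu_id_start;	/* always usable as an index, 0 by default */
	uint32_t cpu_id;	/* differs from cpu_id_start until registered */
};

/* Initial contents before any registration has happened. */
#define RSEQ_SKETCH_INIT \
	{ .cpu_id_start = 0, .cpu_id = SKETCH_CPU_ID_UNINITIALIZED }

/*
 * Returns the CPU number to use on the fast path, or -1 when the caller
 * must take the slow path. One comparison covers both "rseq is not
 * registered" and, under the plan described above, "cpu_opv or
 * membarrier is missing", because registration would imply all three.
 */
static inline int32_t rseq_sketch_fast_path_cpu(const struct rseq_sketch *rs)
{
	uint32_t cpu = rs->cpu_id_start;

	if (cpu != rs->cpu_id)
		return -1;
	return (int32_t)cpu;
}
```

A caller would index its per-cpu data with the returned value and
branch to its fallback on -1, with no extra feature flags to load on
the fast path.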

>
> Either you designed your thing wrong or you try to create an artificial
> dependency for political reasons.

Having the rseq TLS shared across multiple library/app users within a
single process does limit our options there. :-/

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com