Re: [RFC PATCH for 4.15 v12 00/22] Restartable sequences and CPU op vector
From: Thomas Gleixner
Date: Thu Nov 23 2017 - 18:39:23 EST

On Thu, 23 Nov 2017, Mathieu Desnoyers wrote:
> ----- On Nov 23, 2017, at 5:51 PM, Thomas Gleixner tglx@xxxxxxxxxxxxx wrote:
> > On Thu, 23 Nov 2017, Mathieu Desnoyers wrote:
> >> ----- On Nov 22, 2017, at 2:37 PM, Will Deacon will.deacon@xxxxxxx wrote:
> >> > On Wed, Nov 22, 2017 at 08:32:19PM +0100, Peter Zijlstra wrote:
> >> >>
> >> >> So what exactly is the problem of leaving out the whole cpu_opv thing
> >> >> for now? Pure rseq is usable -- albeit a bit cumbersome without
> >> >> additional debugger support.
> >> >
> >> > Drive-by "ack" to that. I'd really like a working rseq implementation in
> >> > mainline, but I don't much care for another interpreter.
> >>
> >> Considering the arm64 use-case of reading PMU counters from user-space
> >> using rseq to prevent migration, I understand that you're lucky enough to
> >> already have a system call at your disposal that can perform the slow-path
> >> in case of single-stepping.
> >>
> >> So yes, your particular case is already covered, but unfortunately that's
> >> not the same situation for other use-cases that have been expressed.
> >
> > If we have users of rseq which can do without the other muck, then what's
> > the reason not to support it?
> >
> > The sysops thing can be sorted out on top and the use cases which need both
> > will have to test for both syscalls being available anyway.
>
> I'm currently making sure CONFIG_RSEQ selects both CONFIG_CPU_OPV and
> CONFIG_MEMBARRIER, so the user-space fast-paths don't end up with
> various ways of doing the fallback/single-stepping/memory barrier handling
> depending on whether the kernel supports each of those individually.
> So first of all, it reduces complexity from a user-space perspective.
>
> Moreover, with a single already needed cpu_id vs cpu_id_start field comparison
> in the rseq fast-path, user-space knows that it can rely on having rseq,
> cpu_opv, and membarrier. Without this guarantee, user-space would have to
> detect individually whether each of those system calls is available, and
> test flags on the fast-path, for additional overhead.

You have to test for sys_rseq somewhere in the init code anyway, so you can
test there whether the other two are fully functional as well.

If one of them is missing, you can avoid the rseq fastpath in one of two
ways: either skip it completely, or simply never register that rseq muck.
In the latter case your rseq area just contains stale init data, which
kicks you into some slowpath fallback code.
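
To illustrate the "stale init data" variant, here is a minimal sketch. It
assumes a per-thread area carrying the two fields mentioned above (cpu_id
and cpu_id_start) and assumes they are initialised so that they can never
compare equal until the kernel has written them. The struct layout, the
initial values and all names (rseq_area, CPU_ID_UNINITIALIZED,
fastpath_usable) are illustrative assumptions, not taken from the patch
series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed sentinel: no real CPU number can have this value. */
#define CPU_ID_UNINITIALIZED ((uint32_t)-1)

/* Hypothetical per-thread area with the two fields named in the thread. */
struct rseq_area {
    uint32_t cpu_id_start;  /* assumed to stay 0 until the area is registered */
    uint32_t cpu_id;        /* assumed to stay at the sentinel until registered */
};

/* Init data only. If the area is never registered, nothing updates it. */
static _Thread_local struct rseq_area area = { 0, CPU_ID_UNINITIALIZED };

/*
 * The single comparison the fastpath does anyway. With stale init data
 * it is false (0 != sentinel), so the caller ends up in its slowpath
 * without testing any extra feature flag.
 */
static bool fastpath_usable(const struct rseq_area *a)
{
    return a->cpu_id_start == a->cpu_id;
}
```

A caller would branch on fastpath_usable(&area) and take its fallback when
it returns false, which is always the case for a thread that never
registered.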

You need something like this anyway, unless you plan to ship code which
cannot run at all on systems w/o rseq support.
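
A rough sketch of such an init-time check follows. Everything in it is an
assumption made for illustration: the names (percpu_features,
probe_features, choose_path) are made up, the membarrier probe relies on
command 0 being the query command, the rseq probe simply passes
deliberately invalid arguments and treats anything other than ENOSYS as
"present", and cpu_opv is reported as absent because no syscall number is
assumed for it here.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#ifdef __linux__
#include <sys/syscall.h>
#include <unistd.h>
#endif

/* Hypothetical record of what the init code found. */
struct percpu_features {
    bool rseq;
    bool cpu_opv;
    bool membarrier;
};

enum percpu_path { PATH_SLOW_FALLBACK, PATH_RSEQ_FAST };

/* Probe once at init time. Nothing gets registered here. */
static struct percpu_features probe_features(void)
{
    struct percpu_features f = { false, false, false };

#if defined(__linux__) && defined(__NR_membarrier)
    /* Command 0 (query) only reports the supported commands. */
    f.membarrier = syscall(__NR_membarrier, 0L, 0L) >= 0;
#endif
#if defined(__linux__) && defined(__NR_rseq)
    /*
     * Deliberately invalid arguments: a kernel with rseq rejects them,
     * a kernel without it fails with ENOSYS. Crude, but good enough for
     * a sketch.
     */
    errno = 0;
    f.rseq = syscall(__NR_rseq, NULL, 0L, 0L, 0L) == -1 && errno != ENOSYS;
#endif
    /* cpu_opv: no syscall number assumed, so it stays false. */
    return f;
}

/* Use the fastpath only if everything it depends on is there. */
static enum percpu_path choose_path(struct percpu_features f)
{
    if (f.rseq && f.cpu_opv && f.membarrier)
        return PATH_RSEQ_FAST;
    return PATH_SLOW_FALLBACK;  /* never register rseq at all */
}
```

The init code would call choose_path(probe_features()) once and register
rseq only for PATH_RSEQ_FAST, so the fastpath itself does not have to test
any flags.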

Either you designed your thing wrong or you are trying to create an
artificial dependency for political reasons.

Thanks,
tglx