Re: [RFC PATCH for 4.17 02/21] rseq: Introduce restartable sequences system call (v12)

From: Mathieu Desnoyers
Date: Wed Mar 28 2018 - 16:19:50 EST


----- On Mar 28, 2018, at 1:49 PM, Peter Zijlstra peterz@xxxxxxxxxxxxx wrote:

> On Wed, Mar 28, 2018 at 11:37:06AM -0400, Mathieu Desnoyers wrote:
>> ----- On Mar 28, 2018, at 11:28 AM, Peter Zijlstra peterz@xxxxxxxxxxxxx wrote:
>>
>> > On Wed, Mar 28, 2018 at 11:14:05AM -0400, Mathieu Desnoyers wrote:
>> >
>> >> > If at all possible I would make it SIGSEGV when issueing SYSCALL()s from
>> >> > within an RSEQ.
>> >>
>> >> What's the goal there ? rseq critical sections can technically do system calls
>> >> if they wish. Why prevent this ?
>> >
>> > This all started as a way to do 'small' _fast_ per-cpu ops, System calls
>> > do NOT fit in that pattern. If you're willing to do a system calls the
>> > cost of atomics is not a problem.
>>
>> I'm not arguing that a typical rseq would do a system call. I'm merely
>> pointing out that if we start putting arbitrary limitations like "SIGSEGV
>> when a fork or system call is encountered on top of rseq", this will cause
>> pain in user-space.
>
> I don't think disallowing system calls is arbitrary. And I think that is
> something we really want to enforce, because it's batshit insane to
> allow.
>
> And if we allow now, people _will_ use it and we can't ever take it
> away again.

Here are some examples of how I would like to use system calls within
rseq critical sections, for testing purposes:

- Issue poll(NULL, 0, ms_timeout) from a rseq critical section, to introduce
a delay in the critical section and test the effect,
- Issue sched_yield() from a rseq critical section, to introduce preemption at
that point,
- Issue kill() on self, thus testing interruption by signals over rseq c.s.,
- Invoke sched_setaffinity() to tweak the cpu affinity mask to force thread
migration within a rseq c.s.
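
As a rough sketch of the four perturbations listed above, assuming Linux and
glibc: the helper names (perturb_delay() and friends) are hypothetical and are
not taken from the rseq selftests. The calls are issued from plain C here, not
from inside an actual rseq critical section, so this only illustrates which
system calls would be involved.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for sched_getcpu(), sched_setaffinity(), CPU_* macros */
#endif
#include <poll.h>
#include <sched.h>
#include <signal.h>
#include <unistd.h>

/* Counts SIGUSR1 deliveries; hypothetical bookkeeping for the test. */
static volatile sig_atomic_t signals_seen;

static void on_sigusr1(int sig)
{
	(void)sig;
	signals_seen++;
}

/* Delay: poll() with no file descriptors just sleeps for ms_timeout ms. */
static int perturb_delay(int ms_timeout)
{
	return poll(NULL, 0, ms_timeout);
}

/* Preemption point: give up the CPU voluntarily. */
static int perturb_yield(void)
{
	return sched_yield();
}

/* Signal interruption: install a handler, then signal ourselves. */
static int perturb_signal_self(void)
{
	struct sigaction sa = { .sa_handler = on_sigusr1 };

	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL))
		return -1;
	return kill(getpid(), SIGUSR1);
}

/*
 * Forced migration: pin the calling thread to an allowed CPU other than
 * the current one, then restore the original mask.
 * Returns 1 if a migration was forced, 0 if only one CPU is allowed,
 * -1 on error.
 */
static int perturb_migrate(void)
{
	cpu_set_t allowed, target;
	int cur, cpu;

	if (sched_getaffinity(0, sizeof(allowed), &allowed))
		return -1;
	cur = sched_getcpu();
	if (cur < 0)
		return -1;
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (cpu == cur || !CPU_ISSET(cpu, &allowed))
			continue;
		CPU_ZERO(&target);
		CPU_SET(cpu, &target);
		if (sched_setaffinity(0, sizeof(target), &target))
			return -1;
		/* Restore the original mask so later tests are unaffected. */
		if (sched_setaffinity(0, sizeof(allowed), &allowed))
			return -1;
		return 1;
	}
	return 0;
}
```

Each helper is a single perturbation, so a test could invoke any one of them
at a chosen point and check afterwards whether the sequence was restarted.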

I have currently implemented the poll(), sched_yield() and kill()
test-cases only outside of rseq critical sections, relying instead on
assembly loops to introduce delays within rseq c.s. However, if we disallow
system calls in rseq critical sections, I'll never be able to use those
system calls to extend the test matrix.

I see other use-cases where having a system call in a rseq critical section
could make sense. For instance, vDSO data shared between kernel and user-space
could rely on rseq for synchronization, with a fallback path that sometimes
needs to issue a system call for part of the operation.

Therefore I'd really want to keep allowing system calls within rseq critical
sections, even though we don't expect this to be the typical use-case.

Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com