Re: [RFC PATCH 0/3] restartable sequences v2: fast user-space percpu critical sections

From: Peter Zijlstra
Date: Thu Apr 07 2016 - 16:12:07 EST


On Thu, Apr 07, 2016 at 09:43:33AM -0700, Andy Lutomirski wrote:
> More concretely, this looks like (using totally arbitrary register
> assignments -- probably far from ideal, especially given how GCC's
> constraints work):
>
> enter the critical section:
> 1:
> 	movq %[cpu], %%r12
> 	movq {address of counter for our cpu}, %%r13
> 	movq {some fresh value}, (%%r13)
> 	cmpq %[cpu], %%r12
> 	jne 1b
>
> ... do whatever setup or computation is needed...
>
> 	movq $%l[failed], %%rcx
> 	movq $1f, %[commit_instr]
> 	cmpq {whatever counter we chose}, (%%r13)
> 	jne %l[failed]
> 	cmpq %[cpu], %%r12
> 	jne %l[failed]
>
> <-- a signal in here that conflicts with us would clobber (%%r13), and
> the kernel would notice and send us to the failed label
>
> 	movq %[to_write], (%[target])
> 1:	movq $0, %[commit_instr]

And the kernel, for every thread that has had the syscall called and a
thingy registered, needs to (at preempt/signal-setup):

	if (get_user(post_commit_ip, current->post_commit_ip))
		return -EFAULT;

	if (likely(!post_commit_ip))
		return 0;

	if (regs->ip >= post_commit_ip)
		return 0;

	if (get_user(seq, (u32 __user *)regs->r13))
		return -EFAULT;

	if (regs->$(which one holds our chosen seq?) == seq) {
		/* nothing changed, do not cancel, proceed to commit. */
		return 0;
	}

	if (put_user(0UL, current->post_commit_ip))
		return -EFAULT;

	regs->ip = regs->rcx;
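
The decision logic above can be modelled in plain user-space C, purely as an
illustration. Everything here is hypothetical: struct model_regs, the
rseq_model_* names and the RSEQ_* return values are invented for the sketch,
the get_user()/put_user() fault paths are left out, and the register that
holds the chosen seq is modelled as a plain field because the pseudocode
leaves that choice open.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the interrupted thread's register state. */
struct model_regs {
	uintptr_t ip;	/* interrupted instruction pointer */
	uintptr_t rcx;	/* address of the failed label, loaded by user space */
	uint32_t seq;	/* seq value user space chose (register left open above) */
};

enum rseq_action { RSEQ_CONTINUE, RSEQ_ABORT };

/*
 * post_commit_ip models the user word the kernel would read with get_user();
 * seq_in_memory models the counter word that %r13 points at.
 */
static enum rseq_action rseq_model_fixup(struct model_regs *regs,
					 uintptr_t *post_commit_ip,
					 uint32_t seq_in_memory)
{
	if (*post_commit_ip == 0)
		return RSEQ_CONTINUE;	/* not inside a finish block */

	if (regs->ip >= *post_commit_ip)
		return RSEQ_CONTINUE;	/* commit store already executed */

	if (regs->seq == seq_in_memory)
		return RSEQ_CONTINUE;	/* counter untouched, let it commit */

	*post_commit_ip = 0;		/* mirrors put_user(0UL, ...) */
	regs->ip = regs->rcx;		/* resume at the failed label */
	return RSEQ_ABORT;
}

/* Small usage example: same interruption point, unclobbered vs clobbered seq. */
void rseq_model_selfcheck(void)
{
	struct model_regs regs = { .ip = 0x1000, .rcx = 0x2000, .seq = 7 };
	uintptr_t post_commit_ip = 0x1010;

	assert(rseq_model_fixup(&regs, &post_commit_ip, 7) == RSEQ_CONTINUE);
	assert(rseq_model_fixup(&regs, &post_commit_ip, 8) == RSEQ_ABORT);
	assert(regs.ip == 0x2000 && post_commit_ip == 0);
}
```

The model only redirects the thread when all three conditions hold: a finish
block is armed, the commit store has not run yet, and the counter in memory
no longer matches the value the thread picked.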


> In contrast to Paul's scheme, this has two additional (highly
> predictable) branches and requires generation of a seqcount in
> userspace. In its favor, though, it doesn't need preemption hooks,

Without preemption hooks, how would one thread preempting another at the
above <-- clobber anything and cause the commit to fail?

> it's inherently debuggable,

It is more debuggable, agreed.

> and it allows multiple independent
> rseq-protected things to coexist without forcing each other to abort.

And the kernel only needs to load the second cacheline if it lands in
the middle of a finish block, which should be manageable overhead I
suppose.

But the userspace chunk is a lot slower as it needs to always touch
multiple lines, since the @cpu, @seq and @post_commit_ip all live in
separate lines (although I suppose @cpu and @post_commit_ip could live
in the same line).
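
A rough sketch of what sharing a line could look like, assuming 64-byte
cache lines and C11. The struct names, field names and the CACHELINE
constant are hypothetical and not taken from any patch in this thread; the
per-cpu counter stays in a line of its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64	/* assumption: 64-byte lines, as on typical x86 */

/* Hypothetical per-thread area: @cpu and @post_commit_ip in one line. */
struct rseq_thread_area {
	_Alignas(CACHELINE) uint32_t cpu;
	uintptr_t post_commit_ip;
};

/* Hypothetical per-cpu counter (@seq), padded out to a full line. */
struct rseq_percpu_seq {
	_Alignas(CACHELINE) uint64_t seq;
};

/* Both per-thread fields must start within the same cache line. */
static_assert(offsetof(struct rseq_thread_area, cpu) / CACHELINE ==
	      offsetof(struct rseq_thread_area, post_commit_ip) / CACHELINE,
	      "cpu and post_commit_ip should share a cache line");

/* Each per-cpu counter occupies exactly one line. */
static_assert(sizeof(struct rseq_percpu_seq) == CACHELINE,
	      "one counter per cache line");
```

With a layout like this the fast path would touch two lines (the thread area
and the counter for the current cpu) instead of three.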

The finish thing needs 3 registers for:

- fail ip
- seq pointer
- seq value

Which I suppose is possible even on register-constrained architectures
like i386.