Re: [RFC PATCH v7 1/7] Restartable sequences system call

From: Boqun Feng
Date: Thu Aug 04 2016 - 00:27:07 EST


On Wed, Aug 03, 2016 at 09:37:57AM -0700, Andy Lutomirski wrote:
> On Wed, Aug 3, 2016 at 5:27 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > On Tue, Jul 26, 2016 at 03:02:19AM +0000, Mathieu Desnoyers wrote:
> >> We really care about preemption here. Every migration implies a
> >> preemption from a user-space perspective. If we would only care
> >> about keeping the CPU id up-to-date, hooking into migration would be
> >> enough. But since we want atomicity guarantees for restartable
> >> sequences, we need to hook into preemption.
> >
> >> It allows user-space to perform update operations on per-cpu data without
> >> requiring heavy-weight atomic operations.
> >
> > Well, a CMPXCHG without LOCK prefix isn't all that expensive on x86.
> >
> > It is however on PPC and possibly other architectures, so in name of
> > simplicity supporting only the one variant makes sense.
> >
>
> I wouldn't want to depend on CMPXCHG. But imagine we had primitives
> that were narrower than the full abort-on-preemption primitive.
> Specifically, suppose we had abort if (actual cpu != expected_cpu ||
> *aptr != aval). We could do things like:
>
> expected_cpu = cpu;
> aval = NULL; // disarm for now
> begin();
> aval = event_count[cpu] + 1;
> event_count[cpu] = aval;
> event_count[cpu]++;

This line is redundant, right? Because it will guarantee a failure even
in no-contention cases.

>
> ... compute something ...
>
> // arm the rest of it
> aptr = &event_count[cpu];
> if (*aptr != aval)
> goto fail;
>
> *thing_im_writing = value_i_computed;
> end();
>
> The idea here is that we don't rely on the scheduler to increment the
> event count at all, which means that we get to determine the scope of
> what kinds of access conflicts we care about ourselves.
>

If we increase the event count in userspace, how could we prevent two
userspace threads from racing on the event_count[cpu] field? For
example:

CPU 0
================
{event_count[0] is initially 0}

[Thread 1]
begin();
aval = event_count[cpu] + 1; // 1

(preempted)
[Thread 2]
begin();
aval = event_count[cpu] + 1; // 1, too
event_count[cpu] = aval; // event_count[0] is 1

(preempted)
[Thread 1]
event_count[cpu] = aval; // event_count[0] is 1

...

aptr = &event_count[cpu];
if (*aptr != aval) // false.
...

[Thread 2]
aptr = &event_count[cpu];
if (*aptr != aval) // false.
...

, in which case, both the critical sections are successful, and Thread 1
and Thread 2 will race on *thing_im_writing.

Am I missing your point here?

Regards,
Boqun

> This has an obvious downside: it's more complicated.
>
> It has several benefits, I think. It's debuggable without hassle
> (unless someone, accidentally or otherwise, sets aval incorrectly).
> It also allows much longer critical sections to work well, as merely
> being preempted in the middle won't cause an abort any more.
>
> So I'm hoping to understand whether we could make something like this
> work. This whole thing is roughly equivalent to abort-if-migrated
> plus an atomic "if (*aptr == aval) *b = c;" operation.
>
> (I think that, if this worked, we could improve it a bit by making the
> abort operation jump back to the "if (*aptr != aval) goto fail;" code,
> which should reduce the scope for error a bit and also reduces the
> need for extra code paths that only execute on an abort.)

Attachment: signature.asc
Description: PGP signature