Re: smp_mb__after_spinlock requirement too strong?

From: Boqun Feng
Date: Mon Mar 12 2018 - 01:40:45 EST


On Sun, Mar 11, 2018 at 03:55:41PM +0800, 焦晓冬 wrote:
> Peter pointed out in this patch https://patchwork.kernel.org/patch/9771921/
> that the spinning-lock used at __schedule() should be RCsc to ensure
> visibility of writes prior to __schedule when the task is to be migrated to
> another CPU.
>
> And this is emphasized at the comment of the newly introduced
> smp_mb__after_spinlock(),
>
> *  This barrier must provide two things:
> *
> *   - it must guarantee a STORE before the spin_lock() is ordered against a
> *     LOAD after it, see the comments at its two usage sites.
> *
> *   - it must ensure the critical section is RCsc.
> *
> *  The latter is important for cases where we observe values written by other
> *  CPUs in spin-loops, without barriers, while being subject to scheduling.
> *
> *  CPU0              CPU1                    CPU2
> *
> *                    for (;;) {
> *                      if (READ_ONCE(X))
> *                        break;
> *                    }
> *  X=1
> *                    <sched-out>
> *                                            <sched-in>
> *                                            r = X;
> *
> *  without transitivity it could be that CPU1 observes X!=0 breaks the loop,
> *  we get migrated and CPU2 sees X==0.
>
> which is used at,
>
> __schedule(bool preempt) {
>         ...
>         rq_lock(rq, &rf);
>         smp_mb__after_spinlock();
>         ...
> }
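
For reference, the first guarantee above is the classic store-buffering
(SB) pattern. A rough user-space analogue in C11 atomics -- my own sketch,
with a seq_cst fence standing in for spin_lock()+smp_mb__after_spinlock();
none of these names are kernel code:

  #include <pthread.h>
  #include <stdatomic.h>
  #include <stdio.h>

  static atomic_int x, y;
  static int r0, r1;

  static void *cpu0(void *arg)
  {
          atomic_store_explicit(&x, 1, memory_order_relaxed);  /* STORE before the lock */
          atomic_thread_fence(memory_order_seq_cst);           /* ~ lock + smp_mb__after_spinlock() */
          r0 = atomic_load_explicit(&y, memory_order_relaxed); /* LOAD after it */
          return NULL;
  }

  static void *cpu1(void *arg)
  {
          atomic_store_explicit(&y, 1, memory_order_relaxed);
          atomic_thread_fence(memory_order_seq_cst);
          r1 = atomic_load_explicit(&x, memory_order_relaxed);
          return NULL;
  }

  int main(void)
  {
          pthread_t t0, t1;

          pthread_create(&t0, NULL, cpu0, NULL);
          pthread_create(&t1, NULL, cpu1, NULL);
          pthread_join(t0, NULL);
          pthread_join(t1, NULL);
          /* With the fences, r0 == 0 && r1 == 0 is forbidden. */
          printf("r0=%d r1=%d\n", r0, r1);
          return 0;
  }

Without the two fences, both loads may return 0.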
>
> If I didn't miss something, I found that this kind of visibility does
> __not__ necessarily depend on the spinlock at __schedule() being RCsc.
>
> In fact, for a runnable task A, the migration would be:
>
> CPU0                     CPU1                     CPU2
>
> <ACCESS before schedule out A>
>
> lock(rq0)
> schedule out A
> unlock(rq0)
>
>                          lock(rq0)
>                          remove A from rq0
>                          unlock(rq0)
>
>                          lock(rq2)
>                          add A into rq2
>                          unlock(rq2)
>
>                                                   lock(rq2)
>                                                   schedule in A
>                                                   unlock(rq2)
>
>                                                   <ACCESS after schedule in A>
>
> <ACCESS before schedule out A>   happens-before
> unlock(rq0)                      happens-before
> lock(rq0)                        happens-before
> unlock(rq2)                      happens-before
> lock(rq2)                        happens-before
> <ACCESS after schedule in A>
>
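
In C11 terms the quoted chain does compose: modelling each runqueue lock
as a toy acquire/release spinlock, a write before the first unlock is
visible after the last lock. A self-contained sketch (all names below are
mine, not the scheduler's; the flags stand in for A's presence on a
runqueue):

  #include <pthread.h>
  #include <stdatomic.h>
  #include <stdio.h>

  /* Toy spinlock: lock is an acquire RMW, unlock a release store. */
  static void lock(atomic_int *l)
  {
          while (atomic_exchange_explicit(l, 1, memory_order_acquire))
                  ;       /* spin */
  }

  static void unlock(atomic_int *l)
  {
          atomic_store_explicit(l, 0, memory_order_release);
  }

  static atomic_int rq0, rq2;
  static int X;                   /* the data published across the migration */
  static int A_on_rq0, A_on_rq2;  /* protected by rq0 and rq2 respectively */

  static void *cpu0(void *arg)
  {
          X = 1;                  /* <ACCESS before schedule out A> */
          lock(&rq0);
          A_on_rq0 = 1;           /* schedule out A */
          unlock(&rq0);
          return NULL;
  }

  static void *cpu1(void *arg)
  {
          int moved = 0;

          while (!moved) {        /* remove A from rq0 once it is there */
                  lock(&rq0);
                  moved = A_on_rq0;
                  unlock(&rq0);
          }
          lock(&rq2);
          A_on_rq2 = 1;           /* add A into rq2 */
          unlock(&rq2);
          return NULL;
  }

  static void *cpu2(void *arg)
  {
          int in = 0;

          while (!in) {           /* schedule in A once it is on rq2 */
                  lock(&rq2);
                  in = A_on_rq2;
                  unlock(&rq2);
          }
          /* <ACCESS after schedule in A>: guaranteed to print X=1. */
          printf("X=%d\n", X);
          return NULL;
  }

  int main(void)
  {
          pthread_t t[3];

          pthread_create(&t[0], NULL, cpu0, NULL);
          pthread_create(&t[1], NULL, cpu1, NULL);
          pthread_create(&t[2], NULL, cpu2, NULL);
          for (int i = 0; i < 3; i++)
                  pthread_join(t[i], NULL);
          return 0;
  }

Each acquire that reads from the previous release synchronizes with it,
so X=1 happens-before the read on cpu2. The question is whether that
per-task chain is all the scheduler needs: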

But without an RCsc lock, you cannot guarantee that a write propagates to
CPU 0 and CPU 2 at the same time, so the same write may propagate to
CPU 0 before <ACCESS before schedule out A> but propagate to CPU 2 after
<ACCESS after schedule in A>. So..
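
The standard litmus test for this lack of multi-copy atomicity is IRIW
(independent reads of independent writes). A C11 sketch of mine, not
kernel code: with only acquire/release the two readers may disagree on
the order in which the two writes arrived; making every access
memory_order_seq_cst (RCsc-like strength) forbids that.

  #include <pthread.h>
  #include <stdatomic.h>
  #include <stdio.h>

  static atomic_int X, Y;
  static int r1, r2, r3, r4;

  static void *w0(void *arg)
  {
          atomic_store_explicit(&X, 1, memory_order_release);
          return NULL;
  }

  static void *w1(void *arg)
  {
          atomic_store_explicit(&Y, 1, memory_order_release);
          return NULL;
  }

  static void *rd0(void *arg)
  {
          r1 = atomic_load_explicit(&X, memory_order_acquire);
          r2 = atomic_load_explicit(&Y, memory_order_acquire);
          return NULL;
  }

  static void *rd1(void *arg)
  {
          r3 = atomic_load_explicit(&Y, memory_order_acquire);
          r4 = atomic_load_explicit(&X, memory_order_acquire);
          return NULL;
  }

  int main(void)
  {
          pthread_t t[4];

          pthread_create(&t[0], NULL, w0, NULL);
          pthread_create(&t[1], NULL, w1, NULL);
          pthread_create(&t[2], NULL, rd0, NULL);
          pthread_create(&t[3], NULL, rd1, NULL);
          for (int i = 0; i < 4; i++)
                  pthread_join(t[i], NULL);
          /*
           * r1==1 && r2==0 && r3==1 && r4==0 is allowed as written:
           * the writes to X and Y need not propagate to the two
           * readers in the same order.
           */
          printf("r1=%d r2=%d r3=%d r4=%d\n", r1, r2, r3, r4);
          return 0;
  }

(On strongly ordered hardware such as x86 you may never observe the
relaxed outcome in practice; the model still permits it.)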

Regards,
Boqun

> And for stopped tasks,
>
> CPU0                     CPU1                     CPU2
>
> <ACCESS before schedule out A>
>
> lock(rq0)
> schedule out A
> remove A from rq0
> store-release(A->on_cpu)
> unlock(rq0)
>
>                          load_acquire(A->on_cpu)
>                          set_task_cpu(A, 2)
>
>                          lock(rq2)
>                          add A into rq2
>                          unlock(rq2)
>
>                                                   lock(rq2)
>                                                   schedule in A
>                                                   unlock(rq2)
>
>                                                   <ACCESS after schedule in A>
>
> <ACCESS before schedule out A>   happens-before
> store-release(A->on_cpu)         happens-before
> load_acquire(A->on_cpu)          happens-before
> unlock(rq2)                      happens-before
> lock(rq2)                        happens-before
> <ACCESS after schedule in A>
>
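
The store-release/load-acquire pair above corresponds, as far as I can
tell, to the scheduler's smp_store_release(&prev->on_cpu, 0) after a
context switch and smp_cond_load_acquire(&p->on_cpu, !VAL) in
try_to_wake_up() (my reading; details vary by kernel version). A minimal
C11 analogue of that handoff, with made-up names:

  #include <pthread.h>
  #include <stdatomic.h>
  #include <stdio.h>

  static atomic_int on_cpu = 1;   /* A is still running on CPU0 */
  static int X;                   /* <ACCESS before schedule out A> */

  static void *cpu0(void *arg)    /* schedules A out */
  {
          X = 1;                              /* access before the handoff */
          atomic_store_explicit(&on_cpu, 0,   /* ~ smp_store_release()     */
                                memory_order_release);
          return NULL;
  }

  static void *cpu1(void *arg)    /* wakes A up elsewhere */
  {
          while (atomic_load_explicit(&on_cpu,    /* ~ smp_cond_load_acquire() */
                                      memory_order_acquire))
                  ;
          /* Everything before the release is visible: prints X=1. */
          printf("X=%d\n", X);
          return NULL;
  }

  int main(void)
  {
          pthread_t t0, t1;

          pthread_create(&t0, NULL, cpu0, NULL);
          pthread_create(&t1, NULL, cpu1, NULL);
          pthread_join(t0, NULL);
          pthread_join(t1, NULL);
          return 0;
  }
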
> So, I think the only requirement on smp_mb__after_spinlock() is
> to guarantee that a STORE before the spin_lock() is ordered
> against a LOAD after it, so we could drop the RCsc requirement
> to allow a more efficient implementation.
>
> Did I miss something, or does this RCsc requirement not really matter?
