smp_mb__after_spinlock requirement too strong?
From: çæå
Date: Sun Mar 11 2018 - 03:55:50 EST
Peter pointed out in this patch https://patchwork.kernel.org/patch/9771921/
that the spinning-lock used at __schedule() should be RCsc to ensure
visibility of writes prior to __schedule when the task is to be migrated to
another CPU.
And this is emphasized at the comment of the newly introduced
smp_mb__after_spinlock(),
* This barrier must provide two things:
*
* - it must guarantee a STORE before the spin_lock() is ordered against a
* LOAD after it, see the comments at its two usage sites.
*
* - it must ensure the critical section is RCsc.
*
* The latter is important for cases where we observe values written by other
* CPUs in spin-loops, without barriers, while being subject to scheduling.
*
* CPU0 CPU1 CPU2
*
* for (;;) {
* if (READ_ONCE(X))
* break;
* }
* X=1
* <sched-out>
* <sched-in>
* r = X;
*
* without transitivity it could be that CPU1 observes X!=0 breaks the loop,
* we get migrated and CPU2 sees X==0.
which is used at,
__schedule(bool preempt) {
...
rq_lock(rq, &rf);
smp_mb__after_spinlock();
...
}
.
If I didn't miss something, I found this kind of visibility is __not__
necessarily
depends on the spinning-lock at __schedule being RCsc.
In fact, as for runnable task A, the migration would be,
CPU0 CPU1 CPU2
<ACCESS before schedule out A>
lock(rq0)
schedule out A
unock(rq0)
lock(rq0)
remove A from rq0
unlock(rq0)
lock(rq2)
add A into rq2
unlock(rq2)
lock(rq2)
schedule in A
unlock(rq2)
<ACCESS after schedule in A>
<ACCESS before schedule out A> happens-before
unlock(rq0) happends-before
lock(rq0) happends-before
unlock(rq2) happens-before
lock(rq2) happens-before
<ACCESS after schedule in A>
And for stopped tasks,
CPU0 CPU1 CPU2
<ACCESS before schedule out A>
lock(rq0)
schedule out A
remove A from rq0
store-release(A->on_cpu)
unock(rq0)
load_acquire(A->on_cpu)
set_task_cpu(A, 2)
lock(rq2)
add A into rq2
unlock(rq2)
lock(rq2)
schedule in A
unlock(rq2)
<ACCESS after schedule in A>
<ACCESS before schedule out A> happens-before
store-release(A->on_cpu) happens-before
load_acquire(A->on_cpu) happens-before
unlock(rq2) happens-before
lock(rq2) happens-before
<ACCESS after schedule in A>
So, I think the only requirement to smp_mb__after_spinlock is
to guarantee a STORE before the spin_lock() is ordered
against a LOAD after it. So we could remove the RCsc requirement
to allow more efficient implementation.
Did I miss something or this RCsc requirement does not really matter?