Re: [PATCH -V3 -mm] mm, swap: Fix race between swapoff and some swap operations

From: Paul E. McKenney
Date: Tue Dec 19 2017 - 00:36:52 EST


On Tue, Dec 19, 2017 at 09:57:21AM +0800, Huang, Ying wrote:
> "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> writes:
>
> > On Mon, Dec 18, 2017 at 03:41:41PM +0800, Huang, Ying wrote:
> >> "Huang, Ying" <ying.huang@xxxxxxxxx> writes:
> >> And, it appears that if we replace smp_wmb() in _enable_swap_info() with
> >> stop_machine() in some way, we can avoid smp_rmb() in get_swap_device().
> >> This can further reduce overhead in the normal path. Can we get the same
> >> effect with RCU? For example, use synchronize_rcu() instead of stop_machine()?
> >>
> >> Hi, Paul, can you help me on this?
> >
> > If the key loads before and after the smp_rmb() are within the same
> > RCU read-side critical section, -and- if one of the critical writes is
> > before the synchronize_rcu() and the other critical write is after the
> > synchronize_rcu(), then you normally don't need the smp_rmb().
> >
> > Otherwise, you likely do still need the smp_rmb().
>
> My question may be too general, let me make it more specific. For the
> following program,
>
> "
> int a;
> int b;
>
> void initialize(void)
> {
> 	a = 1;
> 	synchronize_rcu();
> 	b = 2;
> }
>
> void test(void)
> {
> 	int c;
>
> 	rcu_read_lock();
> 	c = b;
> 	/* ignored smp_rmb() */
> 	if (c)
> 		pr_info("a=%d\n", a);
> 	rcu_read_unlock();
> }
> "
>
> Is it possible for it to show
>
> "
> a=0
> "
>
> in kernel log?
>
>
> If it isn't, this could be a useful RCU usage model for accelerating
> the hot path.

This is not possible, and it can be verified using the Linux kernel
memory model. An introduction to an older version of this model may
be found here (including an introduction to litmus tests and their
output):

https://lwn.net/Articles/718628/
https://lwn.net/Articles/720550/

The litmus test and its output are shown below.

The reason it is not possible is that the entirety of test()'s RCU
read-side critical section must do one of two things:

1. Come before the return from initialize()'s synchronize_rcu().
2. Come after the call to initialize()'s synchronize_rcu().

Suppose test()'s load from "b" sees initialize()'s assignment. Then
some part of test()'s RCU read-side critical section came after
initialize()'s return from synchronize_rcu(), which rules out the first
option. The entirety of test()'s RCU read-side critical section must
therefore come after initialize()'s call to synchronize_rcu(), and thus
after the assignment to "a" that precedes that call. So whenever "c" is
non-zero, the pr_info() must see "a" non-zero.
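
The argument above can also be checked by brute force in a toy model.
The sketch below is only an illustration and is not the Linux kernel
memory model: it assumes all events fall into a single total order,
lets the reader's two loads execute in either order (standing in for
the missing smp_rmb()), and encodes the two options above as the only
RCU constraint. The event names and count_bad_outcomes() are made up
for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event names for the toy model. */
enum {
	W_A,		/* initialize(): a = 1 */
	GP_START,	/* initialize(): call to synchronize_rcu() */
	GP_END,		/* initialize(): return from synchronize_rcu() */
	W_B,		/* initialize(): b = 2 */
	R_LOCK,		/* test(): rcu_read_lock() */
	R_B,		/* test(): load from b */
	R_A,		/* test(): load from a */
	R_UNLOCK,	/* test(): rcu_read_unlock() */
	NEV
};

static int pos[NEV];	/* pos[e] = slot of event e in the total order */
static bool used[NEV];	/* slots already taken during the search */

static bool before(int x, int y)
{
	return pos[x] < pos[y];
}

/* Ordering that holds with or without RCU.  The two loads are only
 * pinned inside the critical section, not ordered against each other. */
static bool well_formed(void)
{
	return before(W_A, GP_START) && before(GP_START, GP_END) &&
	       before(GP_END, W_B) &&
	       before(R_LOCK, R_B) && before(R_LOCK, R_A) &&
	       before(R_B, R_UNLOCK) && before(R_A, R_UNLOCK);
}

/* The critical section either ends before synchronize_rcu() returns
 * or begins after synchronize_rcu() is called. */
static bool rcu_ok(void)
{
	return before(R_UNLOCK, GP_END) || before(GP_START, R_LOCK);
}

/* Try every slot assignment; count orders where c != 0 but a == 0. */
static long search(int ev, bool enforce_rcu)
{
	long hits = 0;

	assert(ev >= 0 && ev <= NEV);
	if (ev == NEV) {
		if (!well_formed() || (enforce_rcu && !rcu_ok()))
			return 0;
		/* Load of b sees 2, load of a still sees 0. */
		return before(W_B, R_B) && before(R_A, W_A);
	}
	for (int slot = 0; slot < NEV; slot++) {
		if (used[slot])
			continue;
		used[slot] = true;
		pos[ev] = slot;
		hits += search(ev + 1, enforce_rcu);
		used[slot] = false;
	}
	return hits;
}

long count_bad_outcomes(bool enforce_rcu)
{
	return search(0, enforce_rcu);
}
```

In this model, count_bad_outcomes(true) returns zero, while
count_bad_outcomes(false) is non-zero: once the RCU constraint is
dropped, an order in which the load from "a" runs before the store to
"a" and the load from "b" runs after the store to "b" becomes legal.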

Thanx, Paul

------------------------------------------------------------------------

C MP-o-sync-o+rl-o-ctl-o-rul

{}

P0(int *a, int *b)
{
	WRITE_ONCE(*a, 1);
	synchronize_rcu();
	WRITE_ONCE(*b, 2);
}

P1(int *a, int *b)
{
	int r0;
	int r1;

	rcu_read_lock();
	r0 = READ_ONCE(*b);
	if (r0)
		r1 = READ_ONCE(*a);
	rcu_read_unlock();
}

exists (1:r0=1 /\ 1:r1=0)

------------------------------------------------------------------------

States 2
1:r0=0; 1:r1=0;
1:r0=2; 1:r1=1;
No
Witnesses
Positive: 0 Negative: 2
Condition exists (1:r0=1 /\ 1:r1=0)
Observation MP-o-sync-o+rl-o-ctl-o-rul Never 0 2
Time MP-o-sync-o+rl-o-ctl-o-rul 0.01
Hash=b20eca2da50fa84b15e489502420ff56

------------------------------------------------------------------------

The "Never 0 2" means that the condition cannot happen.