Re: Plain accesses and data races in the Linux Kernel Memory Model

From: Andrea Parri
Date: Wed Jan 16 2019 - 16:37:12 EST


[...]

> The difficulty with incorporating plain accesses in the memory model
> is that the compiler has very few constraints on how it treats plain
> accesses. It can eliminate them, duplicate them, rearrange them,
> merge them, split them up, and goodness knows what else. To make some
> sense of this, I have taken the view that a plain access can exist
> (perhaps multiple times) within a certain bounded region of code.
> Ordering of two accesses X and Y means that we guarantee at least one
> instance of the X access must be executed before any instances of the
> Y access. (This is assuming that neither of the accesses is
> completely eliminated by the compiler; otherwise there is nothing to
> order!)
>
> After adding some simple definitions for the sets of plain and marked
> accesses and for compiler barriers, the patch updates the ppo
> relation. The basic idea here is that ppo can be broken down into
> categories: memory barriers, overwrites, and dependencies (including
> dep-rfi).
>
> Memory barriers always provide ordering (compiler barriers do
> not but they have indirect effects).
>
> Overwriting always provides ordering. This may seem
> surprising in the case where both X and Y are plain writes,
> but in that case the memory model will say that X can be
> eliminated unless there is at least a compiler barrier between
> X and Y, and this barrier will enforce the ordering.
>
> Some dependencies provide ordering and some don't. Going by
> cases:
>
> An address dependency to a read provides ordering when
> the source is a marked read, even when the target is a
> plain read. This is necessary if rcu_dereference() is
> to work correctly; it is tantamount to assuming that
> the compiler never speculates address dependencies.
> However, if the source is a plain read then there is
> no ordering. This is because of Alpha, which does not
> respect address dependencies to reads (on Alpha,
> marked reads include a memory barrier to enforce the
> ordering but plain reads do not).

Can the compiler (maybe it already does?) transform, at the C or at the
"asm" level, LB1's P0 into LB2's P0 (LB1 and LB2 are reported below)?

C LB1

{
	int *x = &a;
}

P0(int **x, int *y)
{
	int *r0;

	r0 = rcu_dereference(*x);
	*r0 = 0;
	smp_wmb();
	WRITE_ONCE(*y, 1);
}

P1(int **x, int *y, int *b)
{
	int r0;

	r0 = READ_ONCE(*y);
	rcu_assign_pointer(*x, b);
}

exists (0:r0=b /\ 1:r0=1)


C LB2

{
	int *x = &a;
}

P0(int **x, int *y)
{
	int *r0;

	r0 = rcu_dereference(*x);
	if (*r0)
		*r0 = 0;
	smp_wmb();
	WRITE_ONCE(*y, 1);
}

P1(int **x, int *y, int *b)
{
	int r0;

	r0 = READ_ONCE(*y);
	rcu_assign_pointer(*x, b);
}

exists (0:r0=b /\ 1:r0=1)

LB1 and LB2 are data-race free, according to the patch; LB1's "exists"
clause is not satisfiable, while LB2's "exists" clause is satisfiable.
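
For readers who want to poke at the two P0 bodies outside of herd, here is
a minimal userspace sketch. The macro definitions are stand-ins chosen for
illustration (assumptions, not the kernel's definitions), and the function
names lb1_p0/lb2_p0 are hypothetical. It only shows that the two bodies
leave the same single-threaded state behind, which is why the rewrite would
look legal for plain accesses; it does not show that any compiler actually
performs it.

```c
#include <stdatomic.h>

/* Userspace stand-ins, assumed for illustration only. */
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)   (*(volatile __typeof__(x) *)&(x) = (v))
#define smp_wmb()          atomic_thread_fence(memory_order_release)
#define rcu_dereference(p) READ_ONCE(p)

/* P0 as written in LB1: unconditional plain store through r0. */
int *lb1_p0(int **x, int *y)
{
	int *r0 = rcu_dereference(*x);

	*r0 = 0;
	smp_wmb();
	WRITE_ONCE(*y, 1);
	return r0;
}

/* P0 as written in LB2: plain load, then a conditional plain store. */
int *lb2_p0(int **x, int *y)
{
	int *r0 = rcu_dereference(*x);

	if (*r0)
		*r0 = 0;
	smp_wmb();
	WRITE_ONCE(*y, 1);
	return r0;
}
```

In both versions *r0 ends up as 0 and *y as 1 when run alone; the
difference is that LB2 replaces the address dependency to a plain write
with an address dependency to a plain read followed by a control
dependency to a plain write.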

I'm adding Nick to Cc (I have never spoken with him, but from what I see
on LKML, he must understand compilers better than I currently do... ;-/ )

Andrea


>
> An address dependency to a write always provides
> ordering. Neither the compiler nor the CPU can
> speculate the address of a write, because a wrong
> guess could generate a data race. (Question: do we
> need to include the case where the source is a plain
> read?)
>
> A data or control dependency to a write provides
> ordering if the target is a marked write. This is
> because the compiler is obliged to translate a marked
> write as a single machine instruction; if it
> speculates such a write there will be no opportunity
> to correct a mistake.
>
> Dep-rfi (i.e., a data or address dependency from a
> read to a write which is then read from on the same
> CPU) provides ordering between the two reads if the
> target is a marked read. This is again because the
> marked read will be translated as a machine-level load
> instruction, and then the CPU will guarantee the
> ordering.
>
> There is a special case (data;rfi) that doesn't
> provide ordering in itself but can contribute to other
> orderings. A data;rfi link corresponds to situations
> where a value is stored in a temporary shared variable
> and then loaded back again. Since the compiler might
> choose to eliminate the temporary, its accesses can't
> be said to be ordered -- but the accesses around it
> might be. As a simple example, consider:
>
> r1 = READ_ONCE(ptr);
> tmp = r1;
> r2 = tmp;
> WRITE_ONCE(*r2, 5);
>
> The plain accesses involving tmp don't have any
> particular ordering requirements, but we do know that
> the READ_ONCE must be ordered before the WRITE_ONCE.
> The chain of relations is:
>
> [marked] ; data ; rfi ; addr ; [marked]
>
> showing that a data;rfi has been inserted into an
> address dependency from a marked read to a marked
> write. In general, any number of data;rfi links can
> be inserted in each of the other kinds of dependencies.
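
The data;rfi example quoted above can be written out as compilable
userspace C. This is only a sketch: the READ_ONCE/WRITE_ONCE macros are
stand-ins assumed for illustration, and the function names with_tmp and
without_tmp are hypothetical. The second function shows one form the
compiler might reduce the first to by forwarding r1 instead of reloading
tmp; in both, the marked write's address still comes from the marked read.

```c
/* Userspace stand-ins, assumed for illustration only. */
#define READ_ONCE(x)     (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

int *ptr;	/* shared pointer, read with a marked access */
int *tmp;	/* temporary shared variable, plain accesses only */

/* [marked] ; data ; rfi ; addr ; [marked] */
void with_tmp(void)
{
	int *r1 = READ_ONCE(ptr);	/* marked read */
	int *r2;

	tmp = r1;			/* plain write (data dependency) */
	r2 = tmp;			/* plain read of that write (rfi) */
	WRITE_ONCE(*r2, 5);		/* marked write (address dependency) */
}

/* Same chain with the plain accesses to tmp removed. */
void without_tmp(void)
{
	int *r1 = READ_ONCE(ptr);	/* marked read */

	WRITE_ONCE(*r1, 5);		/* marked write (address dependency) */
}
```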