Re: rcu_read_lock lost its compiler barrier
From: Paul E. McKenney
Date: Mon Jun 03 2019 - 04:46:10 EST
On Mon, Jun 03, 2019 at 12:23:39AM -0700, Paul E. McKenney wrote:
> On Mon, Jun 03, 2019 at 12:01:14PM +0800, Herbert Xu wrote:
> > On Sun, Jun 02, 2019 at 08:47:07PM -0700, Paul E. McKenney wrote:
> > >
> > > CPU2: if (b != 1)
> > > CPU2: b = 1;
> > Stop right there. The kernel is full of code that assumes that
> > assignment to an int/long is atomic. If your compiler breaks this
> > assumption then we can kiss the kernel good-bye.
> Here you go:
> TL;DR: On x86, if you are doing a plain store of a 32-bit constant
> that has bits set only in the lower few bits of each 16-bit half of
> that constant, the compiler is plenty happy to use a pair of 16-bit
> store-immediate instructions to carry out that store. This is also
> known as "store tearing".
> The two bugs were filed (and after some back and forth, fixed) because
> someone forgot to exclude C11 atomics and volatile accesses from this
> store tearing.
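For concreteness, here is a minimal user-space sketch of the kind of store described above. The names shared_state, set_state_plain(), set_state_once(), read_state(), and the MY_WRITE_ONCE()/MY_READ_ONCE() macros are all hypothetical; the macros are simplified stand-ins for the kernel's WRITE_ONCE()/READ_ONCE(), reduced to a volatile cast. Whether a given compiler actually tears the plain store depends on its version and flags, so treat this as an illustration, not a reproducer.

```c
#include <stdint.h>

/*
 * Hypothetical stand-ins for the kernel's WRITE_ONCE()/READ_ONCE(),
 * reduced to a volatile cast. Relies on the GCC/Clang __typeof__
 * extension.
 */
#define MY_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define MY_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical variable that another CPU might read concurrently. */
static uint32_t shared_state;

/*
 * Plain store of a constant whose set bits sit in the low bits of
 * each 16-bit half. The compiler is permitted to split this into two
 * 16-bit store-immediates, in which case a concurrent reader starting
 * from zero could observe 0x00000002 or 0x00010000.
 */
void set_state_plain(void)
{
	shared_state = 0x00010002;
}

/*
 * Volatile store: compilers are expected to carry this out as one
 * full-width access for an aligned 32-bit variable.
 */
void set_state_once(void)
{
	MY_WRITE_ONCE(shared_state, 0x00010002);
}

/* Volatile load, so the read is not fused or invented either. */
uint32_t read_state(void)
{
	return MY_READ_ONCE(shared_state);
}
```

In a single thread both setters leave the same final value. The difference only matters to a concurrent reader, which is exactly the case the plain store leaves unprotected.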
I should hasten to add that I have not seen load tearing, nor have I seen
store tearing when storing a value unknown to the compiler. However,
plain C-language loads and stores can be invented and fused, and a few other
"interesting" optimizations can be applied.
On kissing the kernel goodbye, a reasonable strategy might be to
identify the transformations that are actually occurring (like the
stores of certain constants called out above) and fix those. We do
occasionally use READ_ONCE() to prevent load-fusing optimizations that
would otherwise cause the compiler to turn while-loops into if-statements
guarding infinite loops. There is also the possibility of having the
compiler guys give us more command-line arguments.
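The load-fusing case mentioned above can be sketched in user space as follows. Everything here is hypothetical: stop_requested, request_stop(), spin_until_stopped(), and the MY_READ_ONCE()/MY_WRITE_ONCE() macros, which are simplified volatile-cast stand-ins for the kernel's READ_ONCE()/WRITE_ONCE(). The max_spins bound exists only so the sketch terminates when run in a single thread.

```c
/*
 * Hypothetical stand-ins for the kernel's WRITE_ONCE()/READ_ONCE(),
 * reduced to a volatile cast. Relies on the GCC/Clang __typeof__
 * extension.
 */
#define MY_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define MY_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical flag that some other thread or CPU would set. */
static int stop_requested;

/*
 * With a plain load, a loop such as
 *
 *	while (!stop_requested)
 *		;
 *
 * may have its loads fused, so that the compiler in effect emits
 *
 *	if (!stop_requested)
 *		for (;;)
 *			;
 *
 * The volatile load below forces a fresh load on every pass.
 */
unsigned long spin_until_stopped(unsigned long max_spins)
{
	unsigned long spins = 0;

	while (!MY_READ_ONCE(stop_requested) && spins < max_spins)
		spins++;
	return spins;	/* Number of polls that saw the flag clear. */
}

/* Sets the flag with a volatile store. */
void request_stop(void)
{
	MY_WRITE_ONCE(stop_requested, 1);
}
```

Called before request_stop(), spin_until_stopped() runs out its bound; called afterwards, it returns at once because the first load already sees the flag.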