Re: Linux-kernel examples for LKMM recipes

From: Alan Stern
Date: Tue Oct 17 2017 - 17:03:07 EST


On Tue, 17 Oct 2017, Paul E. McKenney wrote:

> On Tue, Oct 17, 2017 at 03:38:23PM -0400, Alan Stern wrote:
> > On Tue, 17 Oct 2017, Paul E. McKenney wrote:
> >
> > > How about this?
> > >
> > > 0. Simple special cases
> > >
> > > If there is only one CPU on the one hand or only one variable
> > > on the other, the code will execute in order. There are (as
> > > usual) some things to be careful of:
> > >
> > > a. There are some aspects of the C language that are
> > > unordered. For example, the compiler can output code
> > > computing arguments of a multi-parameter function in
> > > any order it likes, or even interleaved if it so chooses.
> >
> > That parses a little oddly. I wouldn't agree that the compiler outputs
> > the code in any order it likes!
>
> When was the last time you talked to a compiler writer? ;-)
>
> > In fact, I wouldn't even mention the compiler at all. Just say that
> > (with a few exceptions) the language doesn't specify the order in which
> > the arguments of a function or operation should be evaluated. For
> > example, in the expression "f(x) + g(y)", the order in which f and g
> > are called is not defined; the object code is allowed to use either
> > order or even to interleave the computations.
>
> Nevertheless, I took your suggestion:
>
> a. There are some aspects of the C language that are
> unordered. For example, in the expression "f(x) + g(y)",
> the order in which f and g are called is not defined;
> the object code is allowed to use either order or even
> to interleave the computations.

Good.
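
For what it's worth, a minimal userspace sketch of that point. The
names call_log, n_calls, sum_unordered() and sum_ordered() are made up
for illustration; only f and g come from the text above. The first
helper leaves the call order up to the implementation, the second pins
it down by using separate statements:

```c
#include <stdio.h>

/* Hypothetical bookkeeping: records the order in which f and g ran. */
static char call_log[2];
static int n_calls;

static int f(int x) { call_log[n_calls++] = 'f'; return x + 1; }
static int g(int y) { call_log[n_calls++] = 'g'; return y * 2; }

/* The language does not say whether f or g is called first here. */
static int sum_unordered(int x, int y)
{
	n_calls = 0;
	return f(x) + g(y);
}

/* Separate statements force f to be called before g. */
static int sum_ordered(int x, int y)
{
	int a, b;

	n_calls = 0;
	a = f(x);
	b = g(y);
	return a + b;
}
```

Both helpers return the same sum; only the contents of call_log after
sum_unordered() may differ from one compiler or optimization level to
the next.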

> > > b. Compilers are permitted to use the "as-if" rule.
> > > That is, a compiler can emit whatever code it likes,
> > > as long as the results appear just as if the compiler
> > > had followed all the relevant rules. To see this,
> > > compile with a high level of optimization and run
> > > the debugger on the resulting binary.
> >
> > You might omit the last sentence. Furthermore, if the accesses don't
> > use READ_ONCE/WRITE_ONCE then the code might not get the same result as
> > if it had executed in order (even for a single variable!), and if you
> > do use READ_ONCE/WRITE_ONCE then the compiler can't emit whatever code
> > it likes.
>
> Ah, I omitted an important qualifier:
>
> b. Compilers are permitted to use the "as-if" rule. That is,
> a compiler can emit whatever code it likes, as long as
> the results of a single-threaded execution appear just
> as if the compiler had followed all the relevant rules.
> To see this, compile with a high level of optimization
> and run the debugger on the resulting binary.

That's okay for the single-CPU case. I don't think it covers the
multiple-CPU single-variable case correctly, though. If you don't use
READ_ONCE or WRITE_ONCE, isn't the compiler allowed to tear the loads
and stores? And won't that potentially cause the end result to be
different from what you would get if the code had appeared to execute
in order?
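
To make the distinction concrete, here is a rough userspace sketch.
The two macros below are simplified stand-ins for the kernel's
READ_ONCE/WRITE_ONCE (volatile casts only, assuming the GCC/Clang
__typeof__ extension), not the kernel's actual definitions, and the
variable and function names are invented for the example:

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel macros (an assumption, see above). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

static uint32_t shared;	/* hypothetical variable shared between CPUs */

/*
 * Plain store: the compiler is allowed to split this into smaller
 * stores, merge it with neighboring stores, or issue it more than once.
 */
static void plain_store(uint32_t v)
{
	shared = v;
}

/* Volatile store: emitted once; in practice a single full-sized store. */
static void once_store(uint32_t v)
{
	WRITE_ONCE(shared, v);
}

/* Volatile load: emitted once; in practice a single full-sized load. */
static uint32_t once_load(void)
{
	return READ_ONCE(shared);
}
```

Single-threaded, plain_store() and once_store() are indistinguishable,
which is exactly why the as-if rule alone doesn't settle the
multiple-CPU single-variable case.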

> I have seen people (including kernel hackers) surprised by what optimizers
> do, so I would prefer that the last sentence remain.
>
> > > c. If there is only one variable but multiple CPUs, all
> > > accesses to that variable must be aligned and full sized.
> >
> > I would say that the variable is what needs to be aligned, not the
> > accesses. (Although, if the variable is aligned and all the accesses
> > are full sized, then they must necessarily be aligned as well.)
>
> I was thinking in terms of an unaligned 16-bit access to a 32-bit
> variable.

That wouldn't be full sized.
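
A small userspace sketch of the two separate conditions, with all
names (var, load_full, load_half, is_naturally_aligned) invented for
the example. The first accessor is full sized; the second is the
unaligned 16-bit access to a 32-bit variable mentioned above, which
fails the "full sized" test regardless of how the variable itself is
aligned:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t var;	/* hypothetical 32-bit shared variable */

/* Full-sized access: touches all four bytes of var at once. */
static uint32_t load_full(void)
{
	return *(volatile uint32_t *)&var;
}

/* Undersized access: reads two bytes starting one byte into var. */
static uint16_t load_half(void)
{
	uint16_t h;

	memcpy(&h, (const unsigned char *)&var + 1, sizeof(h));
	return h;
}

/* True if p is aligned to a multiple of size (natural alignment). */
static int is_naturally_aligned(const void *p, size_t size)
{
	return ((uintptr_t)p % size) == 0;
}
```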

> But how about this?
>
> c. If there is only one variable but multiple CPUs, all

Extra "all". Otherwise okay.

> that variable must be properly aligned and all accesses
> to that variable must be full sized.
>
> > > Variables that straddle cachelines or pages void your
> > > full-ordering warranty, as do undersized accesses that
> > > load from or store to only part of the variable.
> >
> > How can a variable straddle pages without also straddling cache lines?
>
> Well, a variable -can- straddle cachelines without straddling pages,
> which justifies the "or". Furthermore, given that cacheline sizes have
> been growing, but pages are still 4KB, it is probably only a matter
> of time. ;-)

By that time, we'll probably be using 64-KB pages. Or even bigger!
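
In the meantime, a quick sketch of the arithmetic. The helper name
straddles() and the sizes used with it (64-byte cache lines, 4096-byte
pages) are assumptions for illustration, not anything taken from a
real architecture header:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns nonzero if an object of the given size starting at addr
 * crosses a boundary that is a multiple of granule bytes.
 * Assumes size is at least 1.
 */
static int straddles(uintptr_t addr, size_t size, size_t granule)
{
	return (addr / granule) != ((addr + size - 1) / granule);
}
```

With those sizes, an 8-byte variable at address 60 straddles a cache
line but not a page, while one at address 4092 straddles both. As long
as the page size is a multiple of the cache-line size, every page
boundary is also a cache-line boundary, so the second case can't occur
without the first.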

Alan