Re: [PATCH 1/1] Fix: trace sched switch start/stop racy updates

From: Paul E. McKenney
Date: Sat Aug 17 2019 - 00:52:44 EST

On Fri, Aug 16, 2019 at 03:57:43PM -0700, Linus Torvalds wrote:

[ . . . ]

> We add READ_ONCE and WRITE_ONCE annotations when they make sense. Not
> because of some theoretical "compiler is free to do garbage"
> arguments. If such garbage happens, we need to fix the compiler, the
> same way we already do with
> -fno-strict-aliasing

Yeah, the compete-with-FORTRAN stuff. :-/

There is some work going on in the C committee on this, where the
theorists would like to restrict strict-alias based optimizations to
enable better analysis tooling. And no, although the theorists are
pushing in the direction we would like them to, as far as I can see
they are not pushing as far as we would like. But it might be that
-fno-strict-aliasing needs some upgrades as well. I expect to learn
more at the next meeting in a few months.

> -fno-delete-null-pointer-checks
> -fno-strict-overflow
> because all those "optimizations" are just fundamentally unsafe and wrong.

I was hoping that -fno-strict-overflow might go into the C++ (not C)
standard, and even thought that it had at one point, but what went into
the standard was that signed integers are twos complement, not that
overflowing them is well defined.

We are pushing to hopefully end but at least to restrict the current
pointer lifetime-end zap behavior in both C and C++, which is similar
to the pointer provenance/alias analysis that -fno-strict-aliasing at
least partially suppresses. The zapping invalidates loading, storing,
comparing, or doing arithmetic on a pointer to an object whose lifetime
has ended. (The WG14 PDF linked to below gives a non-exhaustive list
of problems that this causes.)

The Linux kernel used to avoid this by refusing to tell the compiler about
kmalloc() and friends, but that changed a few years ago. This zapping
rules out a non-trivial class of concurrent algorithms, but for once
RCU is unaffected. Some committee members complained that zapping has
been part of the standard since about 1990, but Maged Michael found a
writeup of one of the algorithms dating back to 1973. ;-)

> I really wish the compiler would never take advantage of "I can prove
> this is undefined behavior" kind of things when it comes to the kernel
> (or any other projects I am involved with, for that matter). If you
> can prove that, then you shouldn't decide to generate random code
> without a big warning. But that's what those optimizations that we
> disable effectively all do.
> I'd love to have a flag that says "all undefined behavior is treated
> as implementation-defined". There's a somewhat subtle - but very
> important - difference there.

It would also be nice to downgrade some of the undefined behavior in
the standard itself. Compiler writers usually hate this because it
would require them to document what their compiler does in each case.
So they would prefer "unspecified" if the could not have "undefined".

> And that's what some hypothetical speculative write optimizations do
> too. I do not believe they are valid for the kernel. If the code says
> if (a)
> global_var = 1
> else
> global_var = 0
> then the compiler had better not turn that into
> global_var = 0
> if (a)
> global_var = 1
> even if there isn't a volatile there. But yes, we've had compiler
> writers that say "if you read the specs, that's ok".
> No, it's not ok. Because reality trumps any weasel-spec-reading.
> And happily, I don't think we've ever really seen a compiler that we
> use that actually does the above kind of speculative write thing (but
> doing it for your own local variables that can't be seen by other
> threads of execution - go wild).

Me, I would still want to use WRITE_ONCE() in this case.

> So in general, we very much expect the compiler to do sane code
> generation, and not (for example) do store tearing on normal
> word-sized things or add writes that weren't there originally etc.
> And yes, reads are different from writes. Reads don't have the same
> kind of "other threads of execution can see them" effects, so a
> compiler turning a single read into multiple reads is much more
> realistic and not the same kind of "we need to expect a certain kind
> of sanity from the compiler" issue.

Though each of those compiler-generated multiple reads might return a
different value, right?

Thanx, Paul