Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()

From: Steven Rostedt
Date: Wed Oct 04 2017 - 10:44:02 EST

On Wed, 4 Oct 2017 07:17:45 -0700
"Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:

> > I'm more worried about other architectures that don't have as strong of
> > a cache coherency.
> >
> > [ Added Paul as he knows a lot about odd architectures ]
> >
> > Is there any architecture that we support that can have the following:
> >
> >     CPU0                            CPU1
> >     ----                            ----
> >                                     early_printk_cpu = 1
> >     for (;;)
> >       old = READ_ONCE(early_printk_cpu);
> >       [ old = 1 ]
> >
> >                                     early_printk_cpu = -1
> >
> >       [...]
> >       cpu_relax();
> >       old = READ_ONCE(early_printk_cpu);
> >
> >       [ but the CPU uses the cache and not the memory? ]
> >
> >       old = 1;
> >
> If you use READ_ONCE(), then all architectures I know of enforce
> full ordering for accesses to a single variable. (If you don't use
> READ_ONCE(), then in theory Itanium can reorder reads.) Me, I would
> argue for WRITE_ONCE() as well to prevent store tearing.
>
> It is only when you have at least two variables and at least two threads
> that things start getting really "interesting". ;-)

My question is not about ordering, but about coherency. Can one CPU
read a variable into its cache and then keep using the cached value
every time the program asks to read it, instead of going out to
memory?

Also, on the other CPU, if a variable is written, and the cache is not
write-through, could that variable be sitting in cache and not go out
to memory until a flush happens?

Do we support architectures whose coherency is not strong enough to
know that one CPU is asking for a memory location that was changed in
the cache of another CPU? Or where a CPU will return old cached data
even though the memory was updated?

I guess my concern is that READ_ONCE() and cpu_relax() don't actually
do a memory barrier. They are mostly compiler barriers. Do we support
poorly coherent architectures that require some kind of flush to make
sure the communication actually happens between the two CPUs?
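
To make that concrete, here is a rough userspace sketch of what I mean by
"mostly compiler barriers". It is not the kernel's implementation: the
names read_once_int(), write_once_int(), cpu_relax_sketch(),
wait_until_free() and early_printk_cpu_sketch are made up for
illustration, and the max_spins bound exists only so the example
terminates when run on one thread. The volatile accesses force the
compiler to emit a real load or store each time, and
atomic_signal_fence() only restrains the compiler. Nothing here emits a
hardware barrier or cache flush, so the loop sees the other CPU's store
only if the hardware keeps the caches coherent.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the owner variable; -1 means "free". */
static int early_printk_cpu_sketch = -1;

/*
 * Approximation of READ_ONCE() for an int: a volatile load that the
 * compiler must perform on every call instead of reusing a register.
 */
static int read_once_int(const int *p)
{
	return *(const volatile int *)p;
}

/* Approximation of WRITE_ONCE() for an int: a single volatile store. */
static void write_once_int(int *p, int val)
{
	*(volatile int *)p = val;
}

/*
 * Approximation of the compiler-barrier part of cpu_relax().
 * atomic_signal_fence() constrains the compiler only; it does not
 * generate a fence instruction.
 */
static void cpu_relax_sketch(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

/*
 * Poll the owner variable until it reads -1. Returns true if it was
 * seen free within max_spins polls, false otherwise.
 */
static bool wait_until_free(const int *owner, unsigned long max_spins)
{
	unsigned long i;

	for (i = 0; i < max_spins; i++) {
		if (read_once_int(owner) == -1)
			return true;
		cpu_relax_sketch();
	}
	return false;
}
```

If another CPU does write_once_int(&early_printk_cpu_sketch, -1), the
poller gets out of the loop only because the hardware propagates that
store; nothing in the code itself asks for it.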

Note, at TimeSys, I had to port Linux to a poorly coherent SMP board
that would trip over this all the time. I don't know if that board is
sufficient to run Linux, as we had to slap in memory barriers all over
the place. But this was a 2.4 kernel at the time.

-- Steve