Re: [PATCH] hwmon: (nzxt-kraken2) mark and order concurrent accesses

From: Jonas Malaco
Date: Mon Mar 29 2021 - 20:22:33 EST


On Mon, Mar 29, 2021 at 02:53:39PM -0700, Guenter Roeck wrote:
> On Mon, Mar 29, 2021 at 05:22:01AM -0300, Jonas Malaco wrote:
> > To avoid a spinlock, the driver explores concurrent memory accesses
> > between _raw_event and _read, having the former updating fields on a
> > data structure while the latter could be reading from them. Because
> > these are "plain" accesses, those are data races according to the Linux
> > kernel memory model (LKMM).
> >
> > Data races are undefined behavior in both C11 and LKMM. In practice,
> > the compiler is free to make optimizations assuming there is no data
> > race, including load tearing, load fusing and many others,[1] most of
> > which could result in corruption of the values reported to user-space.
> >
> > Prevent undesirable optimizations to those concurrent accesses by
> > marking them with READ_ONCE() and WRITE_ONCE(). This also removes the
> > data races, according to the LKMM, because both loads and stores to each
> > location are now "marked" accesses.
> >
> > As a special case, use smp_load_acquire() and smp_load_release() when
> > loading and storing ->updated, as it is used to track the validity of
> > the other values, and thus has to be stored after and loaded before
> > them. These imply READ_ONCE()/WRITE_ONCE() but also ensure the desired
> > order of memory accesses.
> >
> > [1] https://lwn.net/Articles/793253/
> >
>
> I think you lost me a bit there. What out-of-order accesses that would be
> triggered by a compiler optimization are you concerned about here ?
> The only "problem" I can think of is that priv->updated may have been
> written before the actual values. The impact would be ... zero. An
> attribute read would return "stale" data for a few microseconds.
> Why is that a concern, and what difference does it make ?

The impact of out-of-order accesses to priv->updated is indeed minimal.

That said, smp_load_acquire() and smp_store_release() were meant to
prevent reordering at runtime, and only affect architectures other than
x86. READ_ONCE() and WRITE_ONCE() would already prevent reordering from
compiler optimizations, and x86 provides the load-acquire/store-release
semantics by default.

But the reordering issue is not a concern to me, I got carried away when
adding READ_ONCE()/WRITE_ONCE(). While smp_load_acquire() and
smp_store_release() make the code work more like I intend it to, they
are (small) costs we can spare.

I still think that READ_ONCE()/WRITE_ONCE() are necessary, including for
priv->updated. Do you agree?

Thanks,
Jonas

P.S. Architectures other than x86 are admittedly a niche case for this
driver, but I would not rule them out. Not only can the cooler be
adapted to mount on silicon other than mainstream Intel/AMD CPUs (and
there even exists a first-party adapter for graphics cards), but it can
trivially also be controlled by a secondary, possibly non-x86, system.