Re: [PATCH] random: use correct memory barriers for crng_node_pool

From: Eric Biggers
Date: Thu Sep 17 2020 - 12:59:32 EST


On Thu, Sep 17, 2020 at 05:26:44PM +1000, Herbert Xu wrote:
> Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> > From: Eric Biggers <ebiggers@xxxxxxxxxx>
> >
> > When a CPU selects which CRNG to use, it accesses crng_node_pool without
> > a memory barrier. That's wrong, because crng_node_pool can be set by
> > another CPU concurrently. Without a memory barrier, the crng_state that
> > is used might not appear to be fully initialized.
>
> The only architecture that requires a barrier for data dependency
> is Alpha. The correct primitive to ensure that barrier is present
> is smp_read_barrier_depends, or you could just use READ_ONCE.
>
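
To make the comparison concrete, here's a rough sketch of the reader side,
written as if it lived in random.c (names are illustrative only, not the
actual drivers/char/random.c code):

static struct crng_state *node_pool;	/* stand-in for crng_node_pool; another
					   CPU may set it concurrently */

static struct crng_state *get_pool(void)
{
	/*
	 * READ_ONCE() here would only order later accesses that carry an
	 * address dependency on the returned pointer; smp_load_acquire()
	 * orders everything that follows it.
	 */
	return smp_load_acquire(&node_pool);
}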

smp_load_acquire() is obviously correct, whereas READ_ONCE() is an optimization
whose correctness can be hard to verify. For trivial data structures it's
"easy" to tell. But whenever there is a->b where b is an internal
implementation detail of another kernel subsystem, and using it can involve
accesses to global or static data (for example, spin_lock() accessing lockdep
data), a control dependency can slip in -- and control dependencies don't order
loads the way address dependencies do.

The last time I tried to use READ_ONCE(), it started a big controversy
(https://lkml.kernel.org/linux-fsdevel/20200713033330.205104-1-ebiggers@xxxxxxxxxx/T/#u,
https://lkml.kernel.org/linux-fsdevel/20200717044427.68747-1-ebiggers@xxxxxxxxxx/T/#u,
https://lwn.net/Articles/827180/). In the end, people refused to even allow the
READ_ONCE() optimization to be documented, because they felt that
smp_load_acquire() should just be used instead.

So I think we should just go with smp_load_acquire()...

- Eric