Re: [PATCH v2 2/4] cxl/mem: Fix synchronization mechanism for device removal vs ioctl operations

From: Jason Gunthorpe
Date: Tue Mar 30 2021 - 15:52:38 EST


On Tue, Mar 30, 2021 at 12:43:15PM -0700, Dan Williams wrote:

> Ok, so this is the disagreement. Strict adherence to the model where
> it does not matter in practice.

The basic problem is that it is hard to know whether it matters in
practice, because you never know what the compiler might decide to do
with an unmarked access ...

It is probably fine, but then again I've seen a surprising case in the
mm where the compiler did generate double loads and it wasn't fine.

Now with the new data race detector (KCSAN) it feels like marking all
concurrent data accesses with READ_ONCE / WRITE_ONCE (or a stronger
atomic) is the correct thing to do.
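
To make the double-load concern concrete, here is a rough userspace
sketch, not kernel code. The READ_ONCE/WRITE_ONCE macros below are
simplified stand-ins for the kernel ones (they rely on the GCC/clang
__typeof__ extension), and struct dev_state, use_plain(), use_once()
and demo() are names made up for the illustration:

```c
#include <assert.h>

/* Simplified stand-ins for the kernel macros: a volatile access makes
 * the compiler emit exactly one load or one store for the expression. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical object whose flag another thread may clear at any time. */
struct dev_state {
	int state_in_sysfs;
	int payload;
};

/* Plain access: the compiler is allowed to load ->state_in_sysfs twice,
 * so under concurrency the check and the use may see different values. */
static int use_plain(struct dev_state *d)
{
	if (d->state_in_sysfs)
		return d->state_in_sysfs * d->payload;
	return -1;
}

/* Marked access: a single load into a local, so the check and the use
 * are guaranteed to agree on the value. */
static int use_once(struct dev_state *d)
{
	int s = READ_ONCE(d->state_in_sysfs);

	if (s)
		return s * d->payload;
	return -1;
}

/* Single-threaded walk-through; it only shows the calling pattern, it
 * cannot demonstrate the race itself. */
static int demo(void)
{
	struct dev_state d = { .state_in_sysfs = 1, .payload = 7 };
	int before = use_once(&d);

	WRITE_ONCE(d.state_in_sysfs, 0);
	assert(before == 7 && use_once(&d) == -1);
	return 0;
}
```

Both functions behave identically when run single-threaded; the
difference is only in what the compiler is permitted to generate when
a concurrent writer exists.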

> > > There's no race above. The rule is that any possible observation of
> > > ->state_in_sysfs == 1, or rcu_dereference() != NULL, must be
> > > flushed.
> >
> > It is not about the flushing.
>
> Ok, it's not about the flushing it's about whether the store to
> state_in_sysfs can leak past up_write(). If that store can leak then
> neither arrangement of:

up_write() and synchronize_srcu() both order prior stores (up_write()
is at least a release), so the store to state_in_sysfs cannot leak
past either of them.

It is the reader that is the problem. This ordering:

> down_write(...):
> cdev_device_del(...);
> up_write(...);

prevents state_in_sysfs from changing underneath a reader, because the
write lock cannot be taken while any reader holds the lock. No
READ_ONCE needed.

Jason