Re: [PATCH v2 2/4] cxl/mem: Fix synchronization mechanism for device removal vs ioctl operations
From: Dan Williams
Date: Tue Mar 30 2021 - 12:06:41 EST
On Tue, Mar 30, 2021 at 8:47 AM Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
>
> On Tue, Mar 30, 2021 at 08:37:19AM -0700, Dan Williams wrote:
> > On Tue, Mar 30, 2021 at 4:16 AM Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> > >
> > > On Mon, Mar 29, 2021 at 07:47:49PM -0700, Dan Williams wrote:
> > >
> > > > @@ -1155,21 +1175,12 @@ static void cxlmdev_unregister(void *_cxlmd)
> > > > struct cxl_memdev *cxlmd = _cxlmd;
> > > > struct device *dev = &cxlmd->dev;
> > > >
> > > > - percpu_ref_kill(&cxlmd->ops_active);
> > > > cdev_device_del(&cxlmd->cdev, dev);
> > > > - wait_for_completion(&cxlmd->ops_dead);
> > > > + synchronize_srcu(&cxl_memdev_srcu);
> > >
> > > > This needs some kind of rcu protected pointer for SRCU to
> > > > work. The write side has to null the pointer and the read side has to
> > > copy the pointer to the stack and check for NULL.
> > >
> > > Otherwise the read side can't detect when the write side is shutting
> > > down.
> > >
> > > > Basically you must use rcu_dereference(), rcu_assign_pointer(), etc
> > > when working with RCU.
> >
> > If the shutdown path was not using synchronize_rcu() then I would
> > agree with you. This usage of srcu is also used to protect dax device
> > shutdown after talking through rwsem vs srcu in this thread with Jan
> > > and Paul [1]. The synchronize_rcu() guarantees that all read-side
> > critical sections have had at least one chance to quiesce.
> >
> > So this could either use rcu pointer accessors and call_srcu to free
> > the object in a quiescent state, or it can use synchronize_srcu()
> > relative to a condition that aborts usage of the pointer.
>
> synchronize_rcu doesn't stop the read side from running it. It only
> > guarantees that all running or future read sides will see the *write*
> performed prior to synchronize_rcu.
>
> If you can't clearly point to the *data* under RCU protection it is
> being used wrong.

Agree.

The data being protected is the value of dev->kobj.state_in_sysfs. The
read-side is allowed to keep running, and the synchronize_rcu()
guarantees that any read-side that saw state_in_sysfs == 1 has had a
chance to complete. Future reads terminate the ioctl at the
device_is_registered() check.

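
To illustrate the ordering described above, here is a rough userspace
model. It is a sketch under assumptions, not the patch and not real SRCU:
struct model_memdev, model_init(), model_ioctl() and model_unregister()
are hypothetical names, a plain atomic counter stands in for the SRCU
read-side section, and an atomic flag stands in for
dev->kobj.state_in_sysfs. Unlike synchronize_srcu(), the wait loop here
waits for every reader rather than only pre-existing ones.

```c
/* Userspace model only: NOT kernel code and NOT real SRCU. */
#include <errno.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct cxl_memdev. */
struct model_memdev {
	atomic_int registered;	/* models dev->kobj.state_in_sysfs */
	atomic_int readers;	/* models read-side sections in flight */
	atomic_int ops_done;	/* operations that ran to completion */
};

void model_init(struct model_memdev *md)
{
	atomic_init(&md->registered, 1);
	atomic_init(&md->readers, 0);
	atomic_init(&md->ops_done, 0);
}

/* Read side: models the ioctl path. */
int model_ioctl(struct model_memdev *md)
{
	int rc = -ENXIO;

	atomic_fetch_add(&md->readers, 1);	/* ~ srcu_read_lock() */
	if (atomic_load(&md->registered)) {	/* ~ device_is_registered() */
		atomic_fetch_add(&md->ops_done, 1);	/* the protected operation */
		rc = 0;
	}
	atomic_fetch_sub(&md->readers, 1);	/* ~ srcu_read_unlock() */
	return rc;
}

/* Write side: models the unregister path. */
void model_unregister(struct model_memdev *md)
{
	/* ~ cdev_device_del(): publish the write that readers check */
	atomic_store(&md->registered, 0);

	/* ~ synchronize_srcu(): wait out readers that may have seen 1 */
	while (atomic_load(&md->readers))
		sched_yield();

	/* From here on, no reader that saw registered == 1 is running. */
}
```

In this model, a model_ioctl() call that starts before
model_unregister() either completes its operation before the wait loop
exits or observes registered == 0 and returns -ENXIO. A call that starts
afterwards always returns -ENXIO.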
> Same as if you can't point to the *data* being protected by a rwsem it
> is probably being used wrong.
>
> We are not locking code.
Agree.