Re: [PATCH v8 04/20] dlb: add device ioctl layer and first three ioctls

From: Dan Williams
Date: Sat Jan 09 2021 - 16:52:39 EST


On Sat, Jan 9, 2021 at 12:34 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Sat, Jan 09, 2021 at 07:49:24AM +0000, Chen, Mike Ximing wrote:
> > > > +static int dlb_ioctl_arg_size[NUM_DLB_CMD] = {
> > > > + sizeof(struct dlb_get_device_version_args),
> > > > + sizeof(struct dlb_create_sched_domain_args),
> > > > + sizeof(struct dlb_get_num_resources_args)
> > >
> > > That list.
> > >
> > > Ugh, no. that's no way to write maintainable code that you will be able
> > > to understand in 2 years.
> > >
> >
> > dlb_ioctl() was implemented with switch-case and real function calls previously.
> > We changed to the table/list implementation during a code restructure. I will move
> > back to the old implementation.
>
> Who said to change this? And why did they say that? Please go back to
> those developers and point them at this thread so that doesn't happen
> again...
>
> > > > +{
> > > > + struct dlb *dlb;
> > > > + dlb_ioctl_fn_t fn;
> > > > + u32 cmd_nr;
> > > > + void *karg;
> > > > + int size;
> > > > + int ret;
> > > > +
> > > > + dlb = f->private_data;
> > > > +
> > > > + if (!dlb) {
> > > > + pr_err("dlb: [%s()] Invalid DLB data\n", __func__);
> > > > + return -EFAULT;
> > >
> > > This error value is only allowed if you really do have a memory fault.
> > >
> > > Hint, you do not here.
> > >
> > > How can that value ever be NULL?
> > >
> >
> > It is targeted at some very rare cases, in which an ioctl command are called immediately after a device unbind (by a different process/application).
>
> And how can that happen? If it really can happen, where is the lock
> that you are holding to keep that pointer from becoming "stale" right
> after you assign it?
>
> So either this never can happen, or your logic here for this type of
> thing is totally wrong. Please work to determine which it is.

I would have preferred a chance to offer a reviewed-by on this set
before it went out (per the required process) to validate that the
feedback on the lifetime handling was properly addressed, it wasn't,
but lets deal with this on the list now.

The race to handle is the one identified by cdev_del():

* NOTE: This guarantees that cdev device will no longer be able to be
* opened, however any cdevs already open will remain and their fops will
* still be callable even after cdev_del returns.

This means that the dlb->private_data is pointing to a live device, a
dying device, or NULL. Without revalidating to the dlb pointer under a
lock, or some other coordinated reference cout, it can transition
states underneath the running ioctl.

Greg, I'm thinking of taking a shot at a document, "botching up device
lifetimes", in the same spirit as
Documentation/process/botching-up-ioctls.rst to lay out the different
schemes for revalidating driver private data in ioctls.

It strikes me that a helper like this might address many of the common patterns:

+/**
+ * get_live_device() - increment reference count for device iff !dead
+ * @dev: device.
+ *
+ * Forward the call to get_device() if the device is still alive. If
+ * this is called with the device_lock() held then the device is
+ * guaranteed to not die until the device_lock() is dropped.
+ */
+struct device *get_live_device(struct device *dev)
+{
+ return dev && !dev->p->dead ? get_device(dev) : NULL;
+}
+EXPORT_SYMBOL_GPL(get_live_device);

Alternatively, if device_lock() is too awkward for a driver it could
use its own lock and kill_device().

...am I off base thinking that cdev_del vs fops liveness is a
widespread problem worth documenting new design patterns?