RE: [PATCH v2] mdev: Send uevents around parent device registration
From: Parav Pandit
Date: Tue Jul 02 2019 - 03:14:04 EST
> -----Original Message-----
> From: linux-kernel-owner@xxxxxxxxxxxxxxx <linux-kernel-owner@xxxxxxxxxxxxxxx>
> On Behalf Of Alex Williamson
> Sent: Tuesday, July 2, 2019 11:12 AM
> To: Kirti Wankhede <kwankhede@xxxxxxxxxx>
> Cc: cohuck@xxxxxxxxxx; kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v2] mdev: Send uevents around parent device registration
>
> On Tue, 2 Jul 2019 10:25:04 +0530
> Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote:
>
> > On 7/2/2019 1:34 AM, Alex Williamson wrote:
> > > On Mon, 1 Jul 2019 23:20:35 +0530
> > > Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote:
> > >
> > >> On 7/1/2019 10:54 PM, Alex Williamson wrote:
> > >>> On Mon, 1 Jul 2019 22:43:10 +0530
> > >>> Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote:
> > >>>
> > >>>> On 7/1/2019 8:24 PM, Alex Williamson wrote:
> > >>>>> This allows udev to trigger rules when a parent device is
> > >>>>> registered or unregistered from mdev.
> > >>>>>
> > >>>>> Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > >>>>> ---
> > >>>>>
> > >>>>> v2: Don't remove the dev_info(), Kirti requested they stay and
> > >>>>> removing them is only tangential to the goal of this change.
> > >>>>>
> > >>>>
> > >>>> Thanks.
> > >>>>
> > >>>>
> > >>>>> drivers/vfio/mdev/mdev_core.c | 8 ++++++++
> > >>>>> 1 file changed, 8 insertions(+)
> > >>>>>
> > > >>>>> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> > > >>>>> index ae23151442cb..7fb268136c62 100644
> > >>>>> --- a/drivers/vfio/mdev/mdev_core.c
> > >>>>> +++ b/drivers/vfio/mdev/mdev_core.c
> > >>>>> @@ -146,6 +146,8 @@ int mdev_register_device(struct device *dev,
> > >>>>> const struct mdev_parent_ops *ops) {
> > >>>>> int ret;
> > >>>>> struct mdev_parent *parent;
> > >>>>> + char *env_string = "MDEV_STATE=registered";
> > >>>>> + char *envp[] = { env_string, NULL };
> > >>>>>
> > >>>>> /* check for mandatory ops */
> > >>>>> if (!ops || !ops->create || !ops->remove ||
> > > >>>>> !ops->supported_type_groups)
> > > >>>>> @@ -197,6 +199,8 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
> > >>>>> mutex_unlock(&parent_list_lock);
> > >>>>>
> > >>>>> dev_info(dev, "MDEV: Registered\n");
> > >>>>> + kobject_uevent_env(&dev->kobj, KOBJ_CHANGE, envp);
> > >>>>> +
> > >>>>> return 0;
> > >>>>>
> > >>>>> add_dev_err:
> > >>>>> @@ -220,6 +224,8 @@ EXPORT_SYMBOL(mdev_register_device);
> > >>>>> void mdev_unregister_device(struct device *dev) {
> > >>>>> struct mdev_parent *parent;
> > >>>>> + char *env_string = "MDEV_STATE=unregistered";
> > >>>>> + char *envp[] = { env_string, NULL };
> > >>>>>
> > >>>>> mutex_lock(&parent_list_lock);
> > > >>>>> parent = __find_parent_device(dev);
> > > >>>>> @@ -243,6 +249,8 @@ void mdev_unregister_device(struct device *dev)
> > >>>>> up_write(&parent->unreg_sem);
> > >>>>>
> > >>>>> mdev_put_parent(parent);
> > >>>>> +
> > >>>>> + kobject_uevent_env(&dev->kobj, KOBJ_CHANGE, envp);
> > >>>>
> > >>>> mdev_put_parent() calls put_device(dev). If this is the last
> > >>>> instance holding device, then on put_device(dev) dev would get freed.
> > >>>>
> > >>>> This event should be before mdev_put_parent()
> > >>>
> > >>> So you're suggesting the vendor driver is calling
> > >>> mdev_unregister_device() without a reference to the struct device
> > >>> that it's passing to unregister? Sounds bogus to me. We take a
> > >>> reference to the device so that it can't disappear out from under
> > >>> us, the caller cannot rely on our reference and the caller
> > >>> provided the struct device. Thanks,
> > >>>
> > >>
> > >> 1. Register uevent is sent after mdev holding reference to device,
> > >> then ideally, unregister path should be mirror of register path,
> > >> send uevent and then release the reference to device.
> > >
> > > I don't see the relevance here. We're marking an event, not
> > > unwinding state of the device from the registration process.
> > > Additionally, the event we're trying to mark is the completion of
> > > > each process, so the notion that we need to mirror the ordering
> > > > between the two is invalid.
> > >
> > >> 2. I agree that vendor driver shouldn't call
> > >> mdev_unregister_device() without holding reference to device. But
> > >> to be on safer side, if ever such case occur, to avoid any
> > > >> segmentation fault in kernel, better to send event before mdev
> > > >> release the reference to device.
> > >
> > > I know that get_device() and put_device() are GPL symbols and that's
> > > a bit of an issue, but I don't think we should be kludging the code
> > > for a vendor driver that might have problems with that. A) we're
> > > using the caller provided device for the uevent, B) we're only
> > > releasing our own reference to the device that was acquired during
> > > registration, the vendor driver must have other references,
> >
> > Are you going to assume that someone/vendor driver is always going to
> > do right thing?
>
> mdev is a kernel driver, we make reasonable assumptions that other drivers
> interact with it correctly.
>
That is right.
Vendor drivers must invoke mdev_register_device() and mdev_unregister_device() only once, and they must hold a valid reference to the device they pass in.
This is basic programming practice that any driver has to follow.
mdev_register_device() has a loop to check for this. It should WARN_ON() there if a duplicate registration is found.
Similarly, mdev_unregister_device() should WARN_ON() if the device is not found.
Submitting those patches was on my TODO list.
I was also still considering whether mdev_register_device() should return an mdev_parent and mdev_unregister_device() should accept an mdev_parent pointer, instead of adding a WARN_ON() in unregister.
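
To illustrate the handle-based idea, here is a rough userspace sketch, not a kernel patch. Every name in it (model_device, model_parent, model_register(), model_unregister(), warn_count) is hypothetical, and warn_count only stands in for where a WARN_ON() would fire. It assumes registration returns a handle that unregistration later consumes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins only; these are not the real kernel types. */
struct model_device { int id; };

struct model_parent {
	struct model_device *dev;
	struct model_parent *next;
};

static struct model_parent *parent_list;  /* models mdev's list of parents */
static int warn_count;                    /* counts where WARN_ON() would fire */

/*
 * Register dev and return a handle for it.
 * Returns NULL on duplicate registration or allocation failure.
 */
struct model_parent *model_register(struct model_device *dev)
{
	struct model_parent *p;

	assert(dev != NULL);  /* the caller must pass a device it owns */

	for (p = parent_list; p; p = p->next) {
		if (p->dev == dev) {
			warn_count++;  /* duplicate registration is a driver bug */
			return NULL;
		}
	}

	p = malloc(sizeof(*p));
	if (!p)
		return NULL;
	p->dev = dev;
	p->next = parent_list;
	parent_list = p;
	return p;
}

/* Unregister by handle; a handle that is not on the list is flagged. */
void model_unregister(struct model_parent *parent)
{
	struct model_parent **pp;

	for (pp = &parent_list; *pp; pp = &(*pp)->next) {
		if (*pp == parent) {
			*pp = parent->next;  /* unlink, then free the entry */
			free(parent);
			return;
		}
	}
	warn_count++;  /* unknown handle: the caller never registered it */
}
```

With a handle, the unregister path no longer has to search by struct device, so the "device not found" case reduces to a caller passing a handle it never got from registration.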
> > > C) the parent device
> > > generally lives on a bus, with a vendor driver, there's an entire
> > > ecosystem of references to the device below mdev. Is this a
> > > paranoia request or are you really concerned that your PCI device suddenly
> > > disappears when mdev's reference to it disappears.
> >
> > mdev infrastructure is not always used by PCI devices. It is designed
> > to be generic, so that other devices (other than PCI devices) can also
> > use this framework.
>
> Obviously mdev is not PCI specific, I only mention it because I'm asking if you
> have a specific concern in mind. If you did, I'd assume it's related to a PCI
> backed vGPU. Any physical parent device of an mdev is likely to have some sort
> of bus infrastructure behind it holding references to the device (ie. a probe and
> release where an implicit reference is held between these points). A virtual
> device would be similar, it's created as part of a module init and destroyed as
> part of a module exit, where mdev registration would exist between these
> points.
>
> > If there is a assumption that user of mdev framework or vendor drivers
> > are always going to use mdev in right way, then there is no need for
> > mdev core to held reference of the device?
> > This is not a "paranoia request". This is more of a ideal scenario,
> > mdev should use device by holding its reference rather than assuming
> > (or relying on) someone else holding the reference of device.
>
> In fact, at one point Parav was proposing removing these references entirely,
> but Connie and I both felt uncomfortable about that. I think it's good practice
> that mdev indicates the use of the parent device by incrementing the reference
> count, with each child mdev device also taking a reference, but those
> references balance out within the mdev core. Their purpose is not to maintain
> the device for outside callers, nor should outside callers assume mdev's use of
> references to release their own. I don't think it's unreasonable to assume that
> the caller should have a legitimate reference to the object it's providing to this
> function and therefore we should be able to use it after mdev's internal
> references are balanced out. Thanks,
>
Yes, I also agree with Alex's comment here: mdev should hold and release its reference to the parent device in the reg/unreg routines.
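
A small userspace model of the reference balance being discussed, as a sketch only. The names (ref_device, ref_get(), ref_put(), ref_register(), ref_unregister()) are hypothetical and merely mimic get_device()/put_device() counting. It assumes the caller already holds its own reference before registering.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for a refcounted device; not the kernel's struct device. */
struct ref_device {
	int refcount;
	bool released;  /* set once the last reference is dropped */
};

void ref_get(struct ref_device *d)
{
	d->refcount++;
}

void ref_put(struct ref_device *d)
{
	assert(d->refcount > 0);  /* dropping more than was taken is a bug */
	if (--d->refcount == 0)
		d->released = true;
}

/* Models registration: mdev takes its own reference on the parent. */
void ref_register(struct ref_device *d)
{
	ref_get(d);
}

/*
 * Models unregistration: mdev drops only the reference it took.
 * Returns true if the device is still alive afterwards, i.e. it
 * would still be safe to send a uevent on it at this point.
 */
bool ref_unregister(struct ref_device *d)
{
	ref_put(d);
	return !d->released;
}
```

If the caller starts with one reference of its own, the count goes 1, 2, 1 across register and unregister, so the device is still valid after mdev's put. It would only be released there if the caller had no reference of its own, which is the misuse case.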
> Alex