Re: mechanism to allow a driver to bind to any device

From: Kim Phillips
Date: Mon Mar 31 2014 - 18:36:47 EST


On Fri, 28 Mar 2014 11:10:23 -0600
Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:

> On Fri, 2014-03-28 at 12:58 -0400, Konrad Rzeszutek Wilk wrote:
> > On Wed, Mar 26, 2014 at 04:09:21PM -0600, Alex Williamson wrote:
> > > On Wed, 2014-03-26 at 10:21 -0600, Alex Williamson wrote:
> > > > On Wed, 2014-03-26 at 23:06 +0800, Alexander Graf wrote:
> > > > >
> > > > > > Am 26.03.2014 um 22:40 schrieb Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>:
> > > > > >
> > > > > >> On Wed, Mar 26, 2014 at 01:40:32AM +0000, Stuart Yoder wrote:
> > > > > >> Hi Greg,
> > > > > >>
> > > > > > >> We (Linaro, Freescale, Virtual Open Systems) are trying to get an issue
> > > > > > >> closed that has been percolating for a while around creating a mechanism
> > > > > > >> that will allow kernel drivers like vfio to bind to devices of any type.
> > > > > >>
> > > > > >> This thread with you:
> > > > > >> http://www.spinics.net/lists/kvm-arm/msg08370.html
> > > > > >> ...seems to have died out, so am trying to get your response
> > > > > >> and will summarize again. Vfio drivers in the kernel (regardless of
> > > > > >> bus type) need to bind to devices of any type. The driver's function
> > > > > >> is to simply export hardware resources of any type to user space.
> > > > > >>
> > > > > >> There are several approaches that have been proposed:
> > > > > >
> > > > > > You seem to have missed the one I proposed.
> > > > > >>
> > > > > >> 1. new_id -- (current approach) the user explicitly registers
> > > > > >> each new device type with the vfio driver using the new_id
> > > > > >> mechanism.
> > > > > >>
> > > > > >> Problem: multiple drivers will be resident that handle the
> > > > > >> same device type...and there is nothing user space hotplug
> > > > > >> infrastructure can do to help.
> > > > > >>
> > > > > >> 2. "any id" -- the vfio driver could specify a wildcard match
> > > > > >> of some kind in its ID match table which would allow it to
> > > > > >> match and bind to any possible device id. However,
> > > > > >> we don't want the vfio driver grabbing _all_ devices...just the ones we
> > > > > >> explicitly want to pass to user space.
> > > > > >>
> > > > > >> The proposed patch to support this was to create a new flag
> > > > > >> "sysfs_bind_only" in struct device_driver. When this flag
> > > > > >> is set, the driver can only bind to devices via the sysfs
> > > > > >> bind file. This would allow the wildcard match to work.
> > > > > >>
> > > > > >> Patch is here:
> > > > > >> https://lkml.org/lkml/2013/12/3/253
> > > > > >>
> > > > > >> 3. "Driver initiated explicit bind" -- with this approach the
> > > > > >> vfio driver would create a private 'bind' sysfs object
> > > > > >> and the user would echo the requested device into it:
> > > > > >>
> > > > > >> echo 0001:03:00.0 > /sys/bus/pci/drivers/vfio-pci/vfio_bind
> > > > > >>
> > > > > >> In order to make that work, the driver would need to call
> > > > > >> driver_probe_device() and thus we need this patch:
> > > > > >> https://lkml.org/lkml/2014/2/8/175
> > > > > >
> > > > > > 4). Use the 'unbind' (from the original device) and 'bind' to vfio driver.
> > > > >
> > > > > This is approach 2, no?
> > > > >
> > > > > >
> > > > > > Which I think is what is currently being done. Why is that not sufficient?
> > > > >
> > > > > What would 'bind to vfio driver' look like?
> > > > >
> > > > > > The only thing I see in the URL is " That works, but it is ugly."
> > > > > > There is some mention of race but I don't see how - if you do the 'unbind'
> > > > > > on the original driver and then bind the BDF to the VFIO how would you get
> > > > > > a race?
> > > > >
> > > > > Typically on PCI, you do a
> > > > >
> > > > > - add wildcard (pci id) match to vfio driver
> > > > > - unbind driver
> > > > > -> reprobe
> > > > > -> device attaches to vfio driver because it is the least recent match
> > > > > - remove wildcard match from vfio driver
> > > > >
> > > > > If in between you hotplug add a card of the same type, it gets attached to vfio - even though the logical "default driver" would be the device specific driver.
> > > >
> > > > I've mentioned drivers_autoprobe in the past, but I'm not sure we're
> > > > really factoring it into the discussion. drivers_autoprobe allows us to
> > > > toggle two points:
> > > >
> > > > a) When a new device is added whether we automatically give drivers a
> > > > try at binding to it
> > > >
> > > > b) When a new driver is added whether it gets to try to bind to anything
> > > > in the system
> > > >
> > > > So we do have a mechanism to avoid the race, but the problem is that it
> > > > becomes the responsibility of userspace to:
> > > >
> > > > 1) turn off drivers_autoprobe
> > > > 2) unbind/new_id/bind/remove_id
> > > > 3) turn on drivers_autoprobe
> > > > 4) call drivers_probe for anything added between 1) & 3)
> > > >
> > > > Is the question about the ugliness of the current solution whether it's
> > > > unreasonable to ask userspace to do this?
> > > > What we seem to be asking for above is more like an autoprobe flag per
> > > > driver where there's some way for this special driver to opt out of auto
> > > > probing. Option 2 in Stuart's list does this by short-cutting ID
> > > > matching so that a "match" is only found when using the sysfs bind path;
> > > > option 3 gives a driver a way to expose its own sysfs entry
> > > > point for binding. The latter feels particularly chaotic since drivers
> > > > get to make up their own bind mechanism.

agreed - so far, option 2 looks the most sane.
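
For reference, the four userspace steps quoted above could be scripted
roughly as follows. This is only a sketch under stated assumptions: the
function name rebind_to_vfio and the SYSFS_ROOT variable are made up for
illustration (SYSFS_ROOT exists so the function can be tried against a
scratch directory instead of the real /sys), error handling is minimal,
and step 4 simply re-probes every driverless PCI device, which is a
superset of "anything added between 1) & 3)".

```shell
#!/bin/sh
# Sketch only. SYSFS_ROOT and rebind_to_vfio are illustrative names, not
# part of any existing tool.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# rebind_to_vfio BDF "VENDOR DEVICE"
rebind_to_vfio() {
    dev="$1"
    ids="$2"
    bus="$SYSFS_ROOT/bus/pci"
    drv="$bus/drivers/vfio-pci"

    # 1) turn off drivers_autoprobe
    echo 0 > "$bus/drivers_autoprobe" || return 1

    # 2) unbind/new_id/bind/remove_id
    if [ -e "$bus/devices/$dev/driver" ]; then
        echo "$dev" > "$bus/devices/$dev/driver/unbind"
    fi
    echo "$ids" > "$drv/new_id"
    # The new_id write may already have bound the device, in which case
    # this explicit bind can fail; that failure is ignored here.
    echo "$dev" > "$drv/bind" 2>/dev/null || true
    echo "$ids" > "$drv/remove_id"

    # 3) turn drivers_autoprobe back on
    echo 1 > "$bus/drivers_autoprobe"

    # 4) re-probe every device that is still driverless (assumption: this
    #    over-approximates the set of devices added during steps 1-3)
    for d in "$bus"/devices/*; do
        [ -e "$d" ] || continue
        [ -e "$d/driver" ] || echo "${d##*/}" > "$bus/drivers_probe"
    done
}
```

A call would look like `rebind_to_vfio 0000:03:00.0 "8086 10d3"`, where
the vendor/device pair is a placeholder. Note that if anything in step 2
aborts the script, autoprobe stays disabled, which is part of why pushing
this sequence onto userspace is unattractive.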

> > > > Another twist I'll throw in is that devices can be hot added to IOMMU
> > > > groups that are in-use by userspace. When that happens we'd like to be
> > > > able to disable driver autoprobe of the device to avoid a host driver
> > > > automatically binding to the device. I wonder whether, if we looked at
> > > > the problem from the device perspective instead of the driver
> > > > perspective, we might find a solution that would address both. For
> > > > instance, if devices had a driver_probe_id property
> > > > that was by default set to their bus specific ID match ("$VENDOR
> > > > $DEVICE" on PCI) could we use that to write new match IDs so that a
> > > > device could only bind to a given driver? Effectively we could then
> > > > bind either using the current method of adding to the list of IDs a
> > > > driver will match or by changing the ID that a device would match. Does
> > > > that get us anywhere? Thanks,

How does this compare to Scott's device->sysfs_bind_only, used in
addition to the driver->sysfs_bind_only of option 2 above? Quoting Scott:

"What it looks like we do still want from the driver core is the ability
for a driver to say that it should not be bound to a device except via
explicit sysfs bind, and the ability for a user to say that a device
should not be bound to a driver except via explicit sysfs bind. This is
a separate issue from making driver_match_device() happy (in some
earlier e-mails in the thread these two issues were not properly
separated)." [1]
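
The two rules in that quote can be modelled as a small predicate. This
is a toy model, not kernel code: may_bind and its three 0/1 arguments
are hypothetical, and it assumes the ID match (driver_match_device())
has already succeeded, since the quote treats that as a separate issue.

```shell
#!/bin/sh
# Toy model only; may_bind is a made-up name.
# may_bind VIA_SYSFS DRV_BIND_ONLY DEV_BIND_ONLY
#   VIA_SYSFS      1 if the bind was requested through the sysfs bind file
#   DRV_BIND_ONLY  1 if the driver has sysfs_bind_only set
#   DEV_BIND_ONLY  1 if the device has sysfs_bind_only set
# Exit status 0 means the bind is allowed.
may_bind() {
    via_sysfs="$1"
    drv_only="$2"
    dev_only="$3"

    # An explicit sysfs bind is always allowed.
    if [ "$via_sysfs" = 1 ]; then
        return 0
    fi
    # Automatic probing is allowed only if neither side opted out.
    [ "$drv_only" = 0 ] && [ "$dev_only" = 0 ]
}
```

For example, `may_bind 0 1 0` (automatic probe, driver flagged) fails,
while `may_bind 1 1 1` (explicit sysfs bind) succeeds.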

> > > Here's one way this might work for PCI; note that we can do this
> > > entirely in the bus driver for PCI. Bind/unbind would go like this:
> > >
> > > # bind device to vfio-pci
> > > echo vfio-pci > /sys/bus/pci/devices/0000\:03\:00.0/preferred_driver
> > > echo 0000:03:00.0 > /sys/bus/pci/devices/0000\:03\:00.0/driver/unbind
> > > echo 0000:03:00.0 > /sys/bus/pci/drivers_probe
> > >
> > > # bind device back to host driver
> > > echo > /sys/bus/pci/devices/0000\:03\:00.0/preferred_driver
> > > echo 0000:03:00.0 > /sys/bus/pci/devices/0000\:03\:00.0/driver/unbind
> > > echo 0000:03:00.0 > /sys/bus/pci/drivers_probe

With the null-write to preferred_driver, it's not crystal clear (to
me at least) what would happen in the above command sequence, given
that multiple drivers may match. It seems like
{device,driver}->sysfs_bind_only would give more control over binding
when multiple drivers match.

Kim

[1] last paragraph:
http://www.spinics.net/lists/kvm/msg96906.html