Re: [PATCH 8/9] vfio/pci: use x86 naming instead of igd

From: Jason Gunthorpe
Date: Tue Feb 02 2021 - 18:07:12 EST


On Tue, Feb 02, 2021 at 02:30:13PM -0700, Alex Williamson wrote:

> The first set of users already fail this specification though, we can't
> base it strictly on device and vendor IDs, we need wildcards, class
> codes, revision IDs, etc., just like any other PCI driver. We're not
> going to maintain a set of specific device IDs for the IGD
> extension,

The Intel GPU driver already has an include/drm/i915_pciids.h that
organizes all the PCI match table entries, so there is no reason why
VFIO IGD couldn't include that too and use the same match table as the
real GPU driver. Same HW, right?

Also how sure are you that this loose detection is going to work with
future Intel discrete GPUs that likely won't need vfio_igd?
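
To make the concern concrete, here is a rough userspace model of PCI
match-table semantics (not kernel code). The device IDs 0x1111 and
0x2222 are made up for illustration, and the "loose" entry is only my
assumption of what a vendor-plus-class match looks like. The matching
rule follows the usual pci_device_id convention of PCI_ANY_ID wildcards
plus a class/class_mask pair.

```python
from dataclasses import dataclass

PCI_ANY_ID = 0xFFFFFFFF  # wildcard value used in pci_device_id tables


@dataclass(frozen=True)
class PciId:
    """One match-table entry; unset fields act as wildcards."""
    vendor: int = PCI_ANY_ID
    device: int = PCI_ANY_ID
    class_: int = 0
    class_mask: int = 0


@dataclass(frozen=True)
class PciDev:
    vendor: int
    device: int
    class_: int  # 24-bit class code: base class, subclass, prog-if


def match_one(entry, dev):
    """True if the entry matches the device (IDs, then masked class)."""
    return (entry.vendor in (PCI_ANY_ID, dev.vendor)
            and entry.device in (PCI_ANY_ID, dev.device)
            and not ((entry.class_ ^ dev.class_) & entry.class_mask))


def match_table(table, dev):
    return any(match_one(entry, dev) for entry in table)


INTEL = 0x8086
VGA_CLASS = 0x030000  # display controller, VGA compatible

# Hypothetical device IDs, for illustration only.
IGD_DEV = PciDev(INTEL, 0x1111, VGA_CLASS)   # integrated GPU
DGPU_DEV = PciDev(INTEL, 0x2222, VGA_CLASS)  # future discrete GPU

# Loose detection: any Intel VGA-class device.
loose_table = [PciId(vendor=INTEL, class_=VGA_CLASS, class_mask=0xFFFF00)]
# Exact detection: only the listed device IDs.
exact_table = [PciId(vendor=INTEL, device=0x1111)]
```

With these tables the loose entry claims both devices, while the exact
table claims only the integrated one.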

> nor I suspect the NVLINK support as that would require a kernel update
> every time a new GPU is released that makes use of the same interface.

The nvlink device that required this special vfio code was a one
off. Current devices do not use it. Not having an exact PCI ID match
in this case is a bug.

> As I understand Jason's reply, these vendor drivers would have an ids
> table and a user could look at modalias for the device to compare to
> the driver supported aliases for a match. Does kmod already have this
> as a utility outside of modprobe?

I think this is worth exploring.
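
As a sketch of what such a lookup amounts to: a device's modalias is
compared against each module's alias patterns using glob matching. The
module names and the device ID below are hypothetical, and Python's
fnmatch only approximates what kmod does internally.

```python
from fnmatch import fnmatchcase


def pci_modalias(vendor, device, subvendor, subdevice,
                 base_class, sub_class, prog_if):
    """Build a PCI modalias string in the format sysfs exposes."""
    return "pci:v%08Xd%08Xsv%08Xsd%08Xbc%02Xsc%02Xi%02X" % (
        vendor, device, subvendor, subdevice,
        base_class, sub_class, prog_if)


def drivers_for(modalias, alias_table):
    """Return the modules whose alias patterns match the modalias."""
    return [name for name, patterns in alias_table.items()
            if any(fnmatchcase(modalias, p) for p in patterns)]


# Hypothetical modules and alias patterns, for illustration only.
ALIASES = {
    "gpu_native": ["pci:v00008086d00001111sv*sd*bc03sc*i*"],
    "nic_native": ["pci:v000015B3d*sv*sd*bc*sc*i*"],
}

modalias = pci_modalias(0x8086, 0x1111, 0, 0, 0x03, 0x00, 0x00)
matches = drivers_for(modalias, ALIASES)
```

On a real system the modalias would be read from the device's
'modalias' file in sysfs instead of being built by hand.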

One idea that fits nicely with the existing infrastructure is to add
to driver core a 'device mode' string. It would be "default" or "vfio":

devices in vfio mode only match vfio mode device_drivers.

devices in vfio mode generate a unique modalias string that includes
some additional 'mode=vfio' identifier

drivers that run in vfio mode generate a module table string that
includes the same mode=vfio

The driver core can trigger driver auto loading solely based on the
mode string, so this happens naturally.

All the existing udev, depmod/etc tooling will transparently work.

Like driver_override, but doesn't bypass all the ID and module loading
parts of the driver core.
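
A toy model of the idea, to show why existing tooling would keep
working: if the mode is folded into both the device modalias and the
module alias, plain glob matching already keeps the two worlds apart.
The "mode=vfio:" prefix, the module names and the device ID are all
invented here; nothing like this exists in the driver core today.

```python
from fnmatch import fnmatchcase


def tag(string, mode="default"):
    """Fold a (hypothetical) device mode into a modalias or alias.

    Default mode leaves the string unchanged, so existing aliases and
    existing devices keep matching exactly as they do now.
    """
    if mode == "default":
        return string
    return "mode=%s:%s" % (mode, string)


def candidates(device_modalias, module_aliases):
    """Modules that would be auto-loaded for this modalias."""
    return [name for name, patterns in module_aliases.items()
            if any(fnmatchcase(device_modalias, p) for p in patterns)]


# Hypothetical device and modules, for illustration only.
BASE = "pci:v000015B3d00001111sv00000000sd00000000bc02sc00i00"
PATTERN = "pci:v000015B3d00001111sv*sd*bc*sc*i*"

MODULES = {
    "native_drv": [tag(PATTERN)],
    "vendor_vfio_drv": [tag(PATTERN, "vfio")],
}

default_match = candidates(tag(BASE), MODULES)
vfio_match = candidates(tag(BASE, "vfio"), MODULES)
```

The same device resolves to the native driver in default mode and to
the vendor vfio driver in vfio mode, with no special casing in the
matcher itself.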

(But let's not get too far down this path until we can agree that
embracing the driver core, as the RFC contemplates, is the direction
to take)

> Seems like it would be embedded in the aliases for the module, with
> this explicit binding flag being the significant difference that
> prevents auto loading the device. We still have one of the races that
> driver_override resolves though, the proposed explicit bind flag is on
> the driver not the device, so a native host driver being loaded due to
> a hotplug operation or independent actions of different admins could
> usurp the device between unbind of old driver and bind to new driver.

This is because sysfs doesn't have an atomic way to bind and
rebind a device. Teaching 'bind' how to do that would also solve
this problem.
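
For reference, the non-atomic sequence used today looks roughly like
the sketch below. It relies on the existing driver_override, unbind and
drivers_probe sysfs files. The helper names are mine, and this is the
current workaround, not the atomic 'bind' suggested above.

```python
def plan_rebind(bdf, new_driver, sysfs_root="/sys"):
    """Return the ordered sysfs writes that move a PCI device to new_driver.

    driver_override is written first so that, once the old driver is
    unbound, no other driver can claim the device in the gap.
    """
    dev = f"{sysfs_root}/bus/pci/devices/{bdf}"
    return [
        (f"{dev}/driver_override", new_driver),        # pin the target driver
        (f"{dev}/driver/unbind", bdf),                 # detach the old driver
        (f"{sysfs_root}/bus/pci/drivers_probe", bdf),  # re-run driver matching
    ]


def apply_plan(plan):
    """Perform the writes; needs root and a real device to do anything."""
    for path, value in plan:
        with open(path, "w") as f:
            f.write(value)


# Example plan for a device at 0000:00:02.0 (address chosen arbitrarily).
plan = plan_rebind("0000:00:02.0", "vfio-pci")
```

These are three separate writes, so the sequence is only safe because
the override is set before the unbind.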

> This seems unpredictable from a user perspective. In either the igd or
> nvlink cases, if the platform features aren't available, the feature
> set of the device is reduced. That's not apparent until the user tries
> to start interacting with the device if the device specific driver
> doesn't fail the probe. Userspace policy would need to decide if a
> fallback driver is acceptable or the vendor specific driver failure is
> fatal. Thanks,

It matches today's behavior. If it is a good idea to preserve it, then
that can be done without much effort.

I do prefer the explicitness because I think most use cases have a
user that requires the special driver to run. Explicitly binding
the required driver seems preferable.

Certainly nvlink and mlx5 should fail probe and not fall back to plain
vfio-pci. If the user wants plain vfio-pci, the user should ask for it
explicitly. At least for the mlx5 cases this is a completely reasonable
thing to do. I like that we can support this choice.
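
The two policies can be put side by side in a small model. The driver
name "vendor-vfio", the platform_ok flag and the allow_fallback switch
are invented for illustration; the point is only that fallback becomes
an explicit decision by the caller instead of something that happens
silently.

```python
class ProbeError(Exception):
    """Raised when a driver refuses to bind to the device."""


def probe(driver, platform_ok):
    """Model a probe: the vendor driver fails without platform support."""
    if driver == "vendor-vfio" and not platform_ok:
        raise ProbeError("platform feature missing")
    return driver


def bind(requested, platform_ok, allow_fallback=False):
    """Bind the requested driver; fall back to vfio-pci only if allowed."""
    try:
        return probe(requested, platform_ok)
    except ProbeError:
        if allow_fallback and requested != "vfio-pci":
            return probe("vfio-pci", platform_ok)
        raise
```

With allow_fallback left at False, a failed vendor probe surfaces as an
error to whoever asked for that driver, which is the explicit behavior
argued for above.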

I'm not so clear on IGD, especially how it would interact with future
discrete cards that probably don't need it. IMHO, it would be fine if
it was different for some good reason.

Jason