Re: [PATCH 09/11] PCI: add matching checks for driver_override binding

From: Max Gurtovoy
Date: Mon Jun 14 2021 - 19:14:44 EST



On 6/14/2021 9:42 PM, Alex Williamson wrote:
On Sun, 13 Jun 2021 11:19:46 +0300
Max Gurtovoy <mgurtovoy@xxxxxxxxxx> wrote:

On 6/9/2021 4:27 AM, Alex Williamson wrote:
On Tue, 8 Jun 2021 19:45:17 -0300
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
On Tue, Jun 08, 2021 at 03:26:43PM -0600, Alex Williamson wrote:
drivers that specifically opt into this feature and the driver now has
the opportunity to provide a proper match table that indicates what HW
it can properly support. vfio-pci continues to support everything.
In doing so, this also breaks the new_id method for vfio-pci.
Does it? How? The driver_override flag is per match entry not for the
entire device so new_id added things will work the same as before as
their new match entry's flags will be zero.
Hmm, that might have been a testing issue; combining driverctl with
manual new_id testing might have left a driver_override in place.
Sorry, with so many userspace regressions, crippling the
driver_override interface with an assumption of such a narrow focus,
creating a vfio specific match flag, I don't see where this can go.
Thanks,
On the other hand it overcomes all the objections from the last go
round: how userspace figures out which driver to use with
driver_override and integrating the universal driver into the scheme.

pci_stub could be delt with by marking it for driver_override like
vfio_pci.
By marking it a "vfio driver override"? :-\
But driverctl as a general tool working with any module is not really
addressable.

Is the only issue the blocking of the arbitary binding? That is not a
critical peice of this, IIRC
We can't break userspace, which means new_id and driver_override need
to work as they do now. There are scads of driver binding scripts in
the wild, for vfio-pci and other drivers. We can't assume such a
narrow scope. Thanks,
what about the following code ?

@@ -152,12 +152,28 @@ static const struct pci_device_id
*pci_match_device(struct pci_driver *drv,
        }
        spin_unlock(&drv->dynids.lock);

-       if (!found_id)
-               found_id = pci_match_id(drv->id_table, dev);
+       if (found_id)
+               return found_id;
a) A dynamic ID match always works regardless of driver override...

-       /* driver_override will always match, send a dummy id */
-       if (!found_id && dev->driver_override)
+       found_id = pci_match_id(drv->id_table, dev);
+       if (found_id) {
+               /*
+                * if we found id in the static table, we must fulfill the
+                * matching flags (i.e. if PCI_ID_F_DRIVER_OVERRIDE flag is
+                * set, driver_override should be provided).
+                */
+               bool is_driver_override =
+                       (found_id->flags & PCI_ID_F_DRIVER_OVERRIDE) != 0;
+               if ((is_driver_override && !dev->driver_override) ||
b) A static ID match fails if the driver provides an override flag and
the device does not have an override set, or...

+                   (dev->driver_override && !is_driver_override))
c) The device has an override set and the driver does not support the
override flag.

+                       return NULL;
+       } else if (dev->driver_override) {
+               /*
+                * if we didn't find suitable id in the static table,
+                * driver_override will still , send a dummy id
+                */
                found_id = &pci_device_id_any;
+       }

        return found_id;
 }


dynamic ids (new_id) works as before.

Old driver_override works as before.
This is deceptively complicated, but no, I don't believe it does. By
my understanding of c) an "old" driver can no longer use
driver_override for binding a known device. It seems that if we have a
static ID match, then we cannot have a driver_override set for the
device in such a case. This is a userspace regression.

If I'll remove condition c) everyone will be happy ?

I really would like to end this ongoing discussion and finally have a clear idea of what we want.

By clear I mean C code.

If we'll continue raising ideas we'll never reach our goal. And my goal is the next merge window.


For "new" driver_override we must fulfill the new rules.
For override'able drivers, the static table is almost useless other
than using it for modules.alias support and potentially to provide
driver_data. As above, I find this all pretty confusing and I'd advise
trying to write a concise set of rules outlining the behavior of
driver_override vs dynamic IDs vs static IDs vs "override'able" driver
flags. I tried, I can't, it's convoluted and full of exceptions.
Thanks,

Alex