Re: [RFC PATCH] s390: vfio-ap: Register the vfio_ap module for the "ap" parent bus

From: Tony Krowiak
Date: Thu Jan 27 2022 - 10:04:58 EST




On 12/15/21 18:02, Halil Pasic wrote:
On Wed, 15 Dec 2021 13:51:02 +0100
Cornelia Huck <cohuck@xxxxxxxxxx> wrote:

On Wed, Dec 15 2021, Thomas Huth <thuth@xxxxxxxxxx> wrote:

On 14/12/2021 22.55, Tony Krowiak wrote:

On 12/13/21 11:11, Cornelia Huck wrote:
One possibility is simply blocking autoload of the module in userspace by
default, and only allow it to be loaded automatically when e.g. qemu-kvm
is installed on the system. This is obviously something that needs to be
decided by the distros.

(kvm might actually be autoloaded already, so autoloading vfio-ap would
not really make it worse.)
Of the vfio_ccw module is automatically loaded, then the kvm
module will also get loaded. I startup up a RHEL8.3 system and
sure enough, the vfio_ccw module is loaded along with the
kvm, vfio and mdev modules. If this is true for all distros, then
it wouldn't make much difference if the vfio_ap module is
autoloaded too.
I think I don't mind too much if we auto-load vfio-ap or not - but I think
we should make it consistent with vfio-ccw. So either auto-load both modules
(if the corresponding devices are available), or remove the
MODULE_DEVICE_TABLE() entries from both modules?
I think we really need to take a step back and think about the purpose
of adding a MODULE_DEVICE_TABLE()... basically, it declares which types
of devices on a certain bus a driver supports, in a way that can be
consumed by userspace (after file2alias.c worked on it).
I did a quick search to locate where this semantic was codified. But
I didn't find the place neither Documentation/ nor in the header where
MODULE_DEVICE_TABLE is defined.

Userspace typically uses this to match devices it is notified about to
drivers that could possibly drive those devices. In general, the
assumption is that you will want to have the drivers for your devices
loaded. In some cases (drivers only used in special cases, like here),
it might be a better idea to autoload the drivers only under certain
circumstances (e.g. if you know you're going to run KVM guests).
Does RHEL do this, or would RHEL do this out of the box? I.e.
would we end up preserving old behavior when this fix hits the distro,
or would the end user end up with kvm and vfio_ap loaded (out of the
box)?

What would be the mechanism of choice to implement if kvm loaded and
APs present/hotplugged load vfio_ap, otherwise don't in the userspace?

Sorry I'm not very familiar with this whole modules auto-loading
business, so I may be asking obvious questions. But a quick google
search did not help me.

My main point, however, is that we're talking about policy here: whether
a potentially useful driver should be loaded or not is a decision that
should be made by userspace. Not providing a MODULE_DEVICE_TABLE does
not look like the right solution, as it deprives userspace of the
information to autoload the driver, if it actually wants to do so.

I'm sympathetic to this reading of the situation, but I'm not sure
it is as black and white as stated.

I think the current state of affairs in the vfio_ap module is clearly a
bug.

One can argue that not auto-loading vfio_ap and kvm per default out of
the box is not a bug. I mean the tooling (chzdev) seems to do fine
without this and just assuming that both kvm and vfio_ap will be needed
just because we have APs seems wrong.

One of the more important guiding principles of Linux kernel development
is no userspace regressions. And I think suddenly getting vfio_ap and kvm
loaded just because we have AP devices can be thought of as a regression.

So I'm sympathetic to Harald's view as well.

Of course there is the solution that the distros should really make sure
the old behavior is preserved, or some smart behavior is introduced. But
regarding smart, I believe "if you have devices that are configured for
vfio_ap pass-through (with chzdev), then the vfio_ap module should get
loaded" is pretty much as smart as it gets. So blacklisting the module
by default in the distros looks like a viable option. If that is what
we want, we should probably ask the distros, because I don't think
it is just obviously their job to figure out that they have to do so.

Disclaimer: I might be wrong about the current behavior, I didn't verify
my claims

Also what does vfio-pci do?

From vfio_pci.c:

static const struct pci_device_id vfio_pci_table[] = {
    { PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_ANY_ID, PCI_ANY_ID) }, /* match all by default */
    {}
};

MODULE_DEVICE_TABLE(pci, vfio_pci_table);

As far as I can tell vfio-pci does not
participate in module auto loading just because there are pci devices.
The have some smart override I don't quite understand:
https://patchwork.ozlabs.org/project/linux-pci/patch/20210826103912.128972-11-yishaih@xxxxxxxxxx/
Before, I don't think they had a MODULE_DEVICE_TABLE:
https://elixir.bootlin.com/linux/v5.8.18/source/drivers/vfio/pci/vfio_pci.c

Regards,
Halil