Re: [PATCH] vfio/pci: Propagate ACPI notifications to the user-space

From: Alex Williamson
Date: Wed Mar 08 2023 - 12:50:54 EST


[Cc +libvir-list]

On Wed, 8 Mar 2023 12:41:24 +0100
Grzegorz Jaszczyk <jaz@xxxxxxxxxxxx> wrote:

> Wed, 8 Mar 2023 at 00:42 Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:
> >
> > On Tue, 7 Mar 2023 22:05:53 +0000
> > Grzegorz Jaszczyk <jaz@xxxxxxxxxxxx> wrote:
> >
> > > From: Dominik Behr <dbehr@xxxxxxxxxxxx>
> > >
> > > Hitherto there was no support for propagating ACPI notifications to
> > > guest drivers. In order to provide such support, install a handler for
> > > notifications on an ACPI device during vfio-pci device registration. The
> > > handler's role is to propagate such ACPI notifications to user space
> > > via ACPI netlink events, which allows the VMM to receive them and
> > > propagate them further to the VMs.
> > >
> > > Thanks to the above, the actual driver for the pass-through device,
> > > which belongs to the guest, can receive and react to device-specific
> > > notifications.
>
> > What consumes these events?
>
> Those events are consumed by the VMM, which can have a built-in ACPI
> event listener.
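>
> For reference, a minimal sketch of such a listener (built on libnl-3;
> note that the acpi_genl_event layout and the attribute id below mirror
> drivers/acpi/event.c, they are not exported through uapi headers, so
> they are open-coded here as an assumption):
>
> /* cc listener.c $(pkg-config --cflags --libs libnl-genl-3.0) */
> #include <stdint.h>
> #include <stdio.h>
> #include <netlink/netlink.h>
> #include <netlink/genl/genl.h>
> #include <netlink/genl/ctrl.h>
>
> /* Mirrors struct acpi_genl_event from drivers/acpi/event.c. */
> struct acpi_genl_event {
>         char device_class[20];
>         char bus_id[15];
>         uint32_t type;
>         uint32_t data;
> };
> #define ACPI_GENL_ATTR_EVENT 1
>
> static int handle_event(struct nl_msg *msg, void *arg)
> {
>         struct nlattr *attrs[ACPI_GENL_ATTR_EVENT + 1];
>         struct acpi_genl_event *ev;
>
>         if (genlmsg_parse(nlmsg_hdr(msg), 0, attrs, ACPI_GENL_ATTR_EVENT, NULL))
>                 return NL_SKIP;
>         if (!attrs[ACPI_GENL_ATTR_EVENT])
>                 return NL_SKIP;
>
>         ev = nla_data(attrs[ACPI_GENL_ATTR_EVENT]);
>         /* device_class is "vfio_pci", bus_id is the host dev_name() */
>         printf("class=%s dev=%s notify=0x%x\n",
>                ev->device_class, ev->bus_id, ev->type);
>         return NL_OK;
> }
>
> int main(void)
> {
>         struct nl_sock *sk = nl_socket_alloc();
>         int grp;
>
>         nl_socket_disable_seq_check(sk);
>         nl_socket_modify_cb(sk, NL_CB_VALID, NL_CB_CUSTOM, handle_event, NULL);
>         genl_connect(sk);
>         grp = genl_ctrl_resolve_grp(sk, "acpi_event", "acpi_mc_group");
>         if (grp < 0)
>                 return 1;
>         nl_socket_add_membership(sk, grp);
>         for (;;)
>                 nl_recvmsgs_default(sk);
> }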
>
> > Has this been proposed to any VM management tools like libvirt?
>
> This patch was evaluated and tested with the crosvm VMM (but since the
> kernel part is not in the tree, the implementation is marked as WIP).

Adding libvirt folks. This intentionally designs the interface in a
way that requires a privileged intermediary to monitor netlink on the
host, associate messages to VMs based on an attached device, and
re-inject the event to the VMM. Why wouldn't we use a channel
associated with the device for such events, such that the VMM has
direct access? The netlink path seems like it has more moving pieces,
possibly scalability issues, and maybe security issues?

> > What sort of ACPI events are we expecting to see here and what does user space do with them?
>
> With this patch we expect to see and propagate any device-specific
> notifications, which are meant to notify the proper device (driver)
> belonging to the guest.
>
> Here is a description of how propagating such notifications could be
> implemented by the VMM:
>
> 1) The VMM could generate upfront a proper virtual ACPI description for
> the guest per vfio-pci device (more precisely, it could be e.g. an ACPI
> GPE handler whose only aim is to notify the relevant device):

The proposed interface really has no introspection; how does the VMM
know which devices need ACPI tables added "upfront"? How do these
events factor into hotplug device support, where we may not be able to
dynamically inject ACPI code into the VM?

>
> Scope (_GPE)
> {
>     Method (_E00, 0, NotSerialized)  // _Exx: Edge-Triggered GPE, xx=0x00-0xFF
>     {
>         Local0 = \_SB.PC00.PE08.NOTY
>         Notify (\_SB.PC00.PE08, Local0)
>     }
> }
>
> 2) Now, when the VMM receives an ACPI netlink event (thanks to the
> VMM's built-in ACPI event listener, which is able to receive any event
> generated through acpi_bus_generate_netlink_event), the VMM classifies
> it based on device_class ("vfio_pci" in this case) and parses it further
> to get the device name and the notification value for it. This
> notification value is stored in a virtual register and the VMM triggers
> the GPE associated with the vfio-pci device.
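>
> To make this step more concrete, the VMM side could be sketched roughly
> as below (vmm_find_vfio_dev(), vmm_assert_gpe() and struct
> passthrough_dev are purely illustrative VMM internals, not an existing
> API; ev is the acpi_genl_event from the listener sketch above):
>
> /* Called for every ACPI netlink event the VMM's listener receives. */
> static void vmm_handle_acpi_event(struct vmm *vmm,
>                                   const struct acpi_genl_event *ev)
> {
>         struct passthrough_dev *pt;
>
>         if (strcmp(ev->device_class, "vfio_pci"))
>                 return;                           /* not a vfio-pci event */
>
>         pt = vmm_find_vfio_dev(vmm, ev->bus_id);  /* match host dev_name() */
>         if (!pt)
>                 return;                           /* device of another VM */
>
>         pt->noty = ev->type;      /* latch value behind \_SB.PC00.PE08.NOTY */
>         vmm_assert_gpe(vmm, pt->gpe);   /* raise the GPE wired to this dev */
> }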

Each VMM is listening for netlink events and sees all the netlink
traffic from the host, including events destined for other VMMs? This
doesn't seem terribly acceptable from a security perspective.

> 3) The guest kernel, upon handling the GPE, thanks to the generated AML
> (from 1.), triggers Notify on the required pass-through device and
> therefore replicates the ACPI notification on the guest side (accessing
> \_SB.PC00.PE08.NOTY from the above example results in a trap to the
> VMM, which returns the previously stored notify value).
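>
> The trap handler backing that virtual register is then trivial (again,
> names are illustrative):
>
> /* Read handler for the emulated register behind \_SB.PC00.PE08.NOTY. */
> static uint32_t vmm_read_noty(struct passthrough_dev *pt)
> {
>         return pt->noty;  /* guest AML passes this value to Notify() */
> }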

The acpi_bus_generate_netlink_event() below really only seems to form a
u8 event type from the u32 event. Is this something that could be
provided directly from the vfio device uAPI with an ioeventfd, thus
providing introspection that a device supports ACPI event notifications
and the ability for the VMM to exclusively monitor those events, and
only those events for the device, without additional privileges?
Thanks,

Alex

> With the above, the ACPI notifications are actually replicated on the
> guest side and, from a guest driver's perspective, they don't differ
> from native ones.
>
> >
> > > Signed-off-by: Dominik Behr <dbehr@xxxxxxxxxxxx>
> > > Co-developed-by: Grzegorz Jaszczyk <jaz@xxxxxxxxxxxx>
> > > Signed-off-by: Grzegorz Jaszczyk <jaz@xxxxxxxxxxxx>
> > > ---
> > > drivers/vfio/pci/vfio_pci_core.c | 33 ++++++++++++++++++++++++++++++++
> > > 1 file changed, 33 insertions(+)
> > >
> > > diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> > > index a5ab416cf476..92b8ed8d087c 100644
> > > --- a/drivers/vfio/pci/vfio_pci_core.c
> > > +++ b/drivers/vfio/pci/vfio_pci_core.c
> > > @@ -10,6 +10,7 @@
> > >
> > > #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > >
> > > +#include <linux/acpi.h>
> > > #include <linux/aperture.h>
> > > #include <linux/device.h>
> > > #include <linux/eventfd.h>
> > > @@ -2120,10 +2121,20 @@ void vfio_pci_core_release_dev(struct vfio_device *core_vdev)
> > > }
> > > EXPORT_SYMBOL_GPL(vfio_pci_core_release_dev);
> > >
> > > +static void vfio_pci_core_acpi_notify(acpi_handle handle, u32 event, void *data)
> > > +{
> > > +        struct vfio_pci_core_device *vdev = (struct vfio_pci_core_device *)data;
> > > +        struct device *dev = &vdev->pdev->dev;
> > > +
> > > +        acpi_bus_generate_netlink_event("vfio_pci", dev_name(dev), event, 0);
> >
> > Who listens to this? Should there be an in-band means to provide
> > notifies related to the device? How does a userspace driver know to
> > look for netlink events for a particular device?
>
> The VMM, which has implemented logic responsible for listening for ACPI
> netlink events. The netlink message already carries the device name, so
> the VMM can associate it with a particular device. I've elaborated a bit
> more in my previous answer.
>
> >
> > > +}
> > > +
> > > int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev)
> > > {
> > > +        acpi_status status;
> > >          struct pci_dev *pdev = vdev->pdev;
> > >          struct device *dev = &pdev->dev;
> > > +        struct acpi_device *adev = ACPI_COMPANION(&pdev->dev);
> > >          int ret;
> > >
> > > /* Drivers must set the vfio_pci_core_device to their drvdata */
> > > @@ -2201,8 +2212,24 @@ int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev)
> > >          ret = vfio_register_group_dev(&vdev->vdev);
> > >          if (ret)
> > >                  goto out_power;
> > > +
> > > +        if (!adev) {
> > > +                pci_info(pdev, "No ACPI companion");
> >
> > This would be a log message generated for 99.99% of devices.
>
> Sure - I will remove that.
>
> >
> > > +                return 0;
> > > +        }
> > > +
> > > +        status = acpi_install_notify_handler(adev->handle, ACPI_DEVICE_NOTIFY,
> > > +                                             vfio_pci_core_acpi_notify, (void *)vdev);
> >
> > vfio-pci supports non-ACPI platforms; I don't see any !CONFIG_ACPI
> > prototypes for this function. Thanks,
>
> Good point, I will address this in the next version.
>
> Thank you,
> Grzegorz
>
> >
> > Alex
> >
> > > +
> > > +        if (ACPI_FAILURE(status)) {
> > > +                pci_err(pdev, "Failed to install notify handler");
> > > +                goto out_group_register;
> > > +        }
> > > +
> > >          return 0;
> > >
> > > +out_group_register:
> > > +        vfio_unregister_group_dev(&vdev->vdev);
> > >  out_power:
> > >          if (!disable_idle_d3)
> > >                  pm_runtime_get_noresume(dev);
> > > @@ -2216,6 +2243,12 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_register_device);
> > >
> > > void vfio_pci_core_unregister_device(struct vfio_pci_core_device *vdev)
> > > {
> > > +        struct acpi_device *adev = ACPI_COMPANION(&vdev->pdev->dev);
> > > +
> > > +        if (adev)
> > > +                acpi_remove_notify_handler(adev->handle, ACPI_DEVICE_NOTIFY,
> > > +                                           vfio_pci_core_acpi_notify);
> > > +
> > >          vfio_pci_core_sriov_configure(vdev, 0);
> > >
> > >          vfio_unregister_group_dev(&vdev->vdev);
> >
>