Re: [RFC PATCH v4 07/10] vfio/pci: introduce a new irq type VFIO_IRQ_TYPE_REMAP_BAR_REGION

From: Alex Williamson
Date: Fri May 29 2020 - 17:46:01 EST


On Sun, 17 May 2020 22:52:45 -0400
Yan Zhao <yan.y.zhao@xxxxxxxxx> wrote:

> This is a virtual irq type.
> vendor driver triggers this irq when it wants to notify userspace to
> remap PCI BARs.
>
> 1. vendor driver triggers this irq and packs the target bar number in
> the ctx count. i.e. "1 << bar_number".
> if a bit is set, the corresponding bar is to be remapped.
>
> 2. userspace requery the specified PCI BAR from kernel and if flags of
> the bar regions are changed, it removes the old subregions and attaches
> subregions according to the new flags.
>
> 3. userspace notifies back to kernel by writing one to the eventfd of
> this irq.
>
> Please check the corresponding qemu implementation from the reply of this
> patch, and a sample usage in vendor driver in patch [10/10].
>
> Cc: Kevin Tian <kevin.tian@xxxxxxxxx>
> Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
> ---
> include/uapi/linux/vfio.h | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 2d0d85c7c4d4..55895f75d720 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -704,6 +704,17 @@ struct vfio_irq_info_cap_type {
> __u32 subtype; /* type specific */
> };
>
> +/* Bar Region Query IRQ TYPE */
> +#define VFIO_IRQ_TYPE_REMAP_BAR_REGION (1)
> +
> +/* sub-types for VFIO_IRQ_TYPE_REMAP_BAR_REGION */
> +/*
> + * This irq notifies userspace to re-query BAR region and remaps the
> + * subregions.
> + */
> +#define VFIO_IRQ_SUBTYPE_REMAP_BAR_REGION (0)

Hi Yan,

How do we do this in a way that's backwards compatible? Or maybe, how
do we perform a handshake between the vendor driver and userspace to
indicate this support? Would the vendor driver refuse to change
device_state in the migration region if the user has not enabled this
IRQ?

Everything you've described in the commit log needs to be in this
header; we can't have the usage protocol buried in a commit log. It
also seems like this is unnecessarily PCI specific. Can't the count
bitmap simply indicate the region index to re-evaluate? Maybe you were
worried about running out of bits in the ctx count? An IRQ per region
could resolve that, but maybe we could also just add another IRQ for
the next bitmap of regions. I assume that the bitmap can indicate
multiple regions to re-evaluate, but that should be documented.
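
As a rough sketch of what I mean (regions_from_count is a hypothetical
helper, not an existing vfio interface): if the count is treated as a
bitmap of region indexes, the userspace decoding is generic rather than
PCI specific. This assumes each bit is signaled at most once between
reads, since eventfd counts accumulate by addition:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: collect the region indexes whose bit is set in
 * the count read from the eventfd. Returns the number of indexes
 * stored in out[], never more than max.
 */
static size_t regions_from_count(uint64_t count, unsigned int *out,
				 size_t max)
{
	size_t n = 0;
	unsigned int idx;

	for (idx = 0; idx < 64 && n < max; idx++) {
		if (count & (UINT64_C(1) << idx))
			out[n++] = idx;
	}

	return n;
}
```

A count of 0x24 would then yield region indexes 2 and 5, i.e. multiple
regions to re-evaluate from a single notification.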

Also, what sort of service requirements does this imply? Would the
vendor driver send this IRQ when the user tries to set the device_state
to _SAVING? If so, we'd require the user to accept the IRQ, implement
the mapping change, and acknowledge the IRQ, all while waiting for the
write to device_state to return. That implies quite a lot of
asynchronous support in the userspace driver. Thanks,

Alex

> +
> +
> /**
> * VFIO_DEVICE_SET_IRQS - _IOW(VFIO_TYPE, VFIO_BASE + 10, struct vfio_irq_set)
> *