RE: [PATCH v2] vfio/cdx: add support for CDX bus
From: Gupta, Nipun
Date: Mon Apr 17 2023 - 01:13:36 EST
> -----Original Message-----
> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Sent: Saturday, April 15, 2023 3:06 AM
> To: Gupta, Nipun <Nipun.Gupta@xxxxxxx>
> Cc: jgg@xxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; kvm@xxxxxxxxxxxxxxx;
> masahiroy@xxxxxxxxxx; nathan@xxxxxxxxxx; ndesaulniers@xxxxxxxxxx;
> nicolas@xxxxxxxxx; git (AMD-Xilinx) <git@xxxxxxx>; Anand, Harpreet
> <harpreet.anand@xxxxxxx>; Jansen Van Vuuren, Pieter <pieter.jansen-van-
> vuuren@xxxxxxx>; Agarwal, Nikhil <nikhil.agarwal@xxxxxxx>; Simek, Michal
> <michal.simek@xxxxxxx>
> Subject: Re: [PATCH v2] vfio/cdx: add support for CDX bus
>
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
>
>
> On Fri, 14 Apr 2023 18:04:14 +0530
> Nipun Gupta <nipun.gupta@xxxxxxx> wrote:
>
> > vfio-cdx driver enables IOCTLs for user space to query
> > MMIO regions for CDX devices and mmap them. This change
> > also adds support for reset of CDX devices.
> >
> > This change adds the VFIO CDX driver and enables the following
> > ioctls for CDX devices:
> > - VFIO_DEVICE_GET_INFO:
> > - VFIO_DEVICE_GET_REGION_INFO
> > - VFIO_DEVICE_RESET
> >
> > Signed-off-by: Nipun Gupta <nipun.gupta@xxxxxxx>
> > Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@xxxxxxx>
> > Tested-by: Nikhil Agarwal <nikhil.agarwal@xxxxxxx>
> > ---
> >
> > Changes v1->v2:
> > - Updated file2alias to support vfio_cdx
> > - removed some un-necessary checks in mmap
> > - removed vfio reset wrapper API
> > - converted complex macros to static APIs
> > - used pgprot_device and io_remap_pfn_range
> >
> > MAINTAINERS | 7 +
> > drivers/vfio/Kconfig | 1 +
> > drivers/vfio/Makefile | 1 +
> > drivers/vfio/cdx/Kconfig | 17 ++
> > drivers/vfio/cdx/Makefile | 8 +
> > drivers/vfio/cdx/vfio_cdx.c | 290 ++++++++++++++++++++++++++++
> > drivers/vfio/cdx/vfio_cdx_private.h | 31 +++
> > include/linux/cdx/cdx_bus.h | 1 -
> > include/linux/mod_devicetable.h | 6 +
> > scripts/mod/devicetable-offsets.c | 1 +
> > scripts/mod/file2alias.c | 17 +-
> > 11 files changed, 378 insertions(+), 2 deletions(-)
> > create mode 100644 drivers/vfio/cdx/Kconfig
> > create mode 100644 drivers/vfio/cdx/Makefile
> > create mode 100644 drivers/vfio/cdx/vfio_cdx.c
> > create mode 100644 drivers/vfio/cdx/vfio_cdx_private.h
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 7f74d8571ac9..c4fd42ba8f46 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -22064,6 +22064,13 @@ F: Documentation/filesystems/vfat.rst
> > F: fs/fat/
> > F: tools/testing/selftests/filesystems/fat/
> >
> > +VFIO CDX DRIVER
> > +M: Nipun Gupta <nipun.gupta@xxxxxxx>
> > +M: Nikhil Agarwal <nikhil.agarwal@xxxxxxx>
> > +L: kvm@xxxxxxxxxxxxxxx
> > +S: Maintained
> > +F: drivers/vfio/cdx/*
> > +
> > VFIO DRIVER
> > M: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > L: kvm@xxxxxxxxxxxxxxx
> > diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
> > index 89e06c981e43..aba36f5be4ec 100644
> > --- a/drivers/vfio/Kconfig
> > +++ b/drivers/vfio/Kconfig
> > @@ -57,6 +57,7 @@ source "drivers/vfio/pci/Kconfig"
> > source "drivers/vfio/platform/Kconfig"
> > source "drivers/vfio/mdev/Kconfig"
> > source "drivers/vfio/fsl-mc/Kconfig"
> > +source "drivers/vfio/cdx/Kconfig"
> > endif
> >
> > source "virt/lib/Kconfig"
> > diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
> > index 70e7dcb302ef..1a27b2516612 100644
> > --- a/drivers/vfio/Makefile
> > +++ b/drivers/vfio/Makefile
> > @@ -14,3 +14,4 @@ obj-$(CONFIG_VFIO_PCI) += pci/
> > obj-$(CONFIG_VFIO_PLATFORM) += platform/
> > obj-$(CONFIG_VFIO_MDEV) += mdev/
> > obj-$(CONFIG_VFIO_FSL_MC) += fsl-mc/
> > +obj-$(CONFIG_VFIO_CDX) += cdx/
> > diff --git a/drivers/vfio/cdx/Kconfig b/drivers/vfio/cdx/Kconfig
> > new file mode 100644
> > index 000000000000..e6de0a0caa32
> > --- /dev/null
> > +++ b/drivers/vfio/cdx/Kconfig
> > @@ -0,0 +1,17 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# VFIO CDX configuration
> > +#
> > +# Copyright (C) 2022-2023, Advanced Micro Devices, Inc.
> > +#
> > +
> > +config VFIO_CDX
> > + tristate "VFIO support for CDX bus devices"
> > + depends on CDX_BUS
> > + select EVENTFD
> > + help
> > + Driver to enable VFIO support for the devices on CDX bus.
> > + This is required to make use of CDX devices present in
> > + the system using the VFIO framework.
> > +
> > + If you don't know what to do here, say N.
> > diff --git a/drivers/vfio/cdx/Makefile b/drivers/vfio/cdx/Makefile
> > new file mode 100644
> > index 000000000000..82e4ef412c0f
> > --- /dev/null
> > +++ b/drivers/vfio/cdx/Makefile
> > @@ -0,0 +1,8 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# Copyright (C) 2022-2023, Advanced Micro Devices, Inc.
> > +#
> > +
> > +obj-$(CONFIG_VFIO_CDX) += vfio-cdx.o
> > +
> > +vfio-cdx-objs := vfio_cdx.o
> > diff --git a/drivers/vfio/cdx/vfio_cdx.c b/drivers/vfio/cdx/vfio_cdx.c
> > new file mode 100644
> > index 000000000000..c19062ce7680
> > --- /dev/null
> > +++ b/drivers/vfio/cdx/vfio_cdx.c
> > @@ -0,0 +1,290 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2022-2023, Advanced Micro Devices, Inc.
> > + */
> > +
> > +#include <linux/vfio.h>
> > +#include <linux/cdx/cdx_bus.h>
> > +
> > +#include "vfio_cdx_private.h"
> > +
> > +static struct cdx_driver vfio_cdx_driver;
> > +
> > +static int vfio_cdx_init_device(struct vfio_device *core_vdev)
> > +{
> > + struct vfio_cdx_device *vdev =
> > + container_of(core_vdev, struct vfio_cdx_device, vdev);
> > + struct cdx_device *cdx_dev = to_cdx_device(core_vdev->dev);
> > +
> > + vdev->cdx_dev = cdx_dev;
> > + vdev->dev = &cdx_dev->dev;
>
> Both of these seem trivial to factor out of this patch, cdx_device is
> always available via to_cdx_device(core_vdev->dev) and the struct
> device is always available via core_vdev->dev. vdev->dev isn't even
> used anywhere yet. Both the init and release functions here could be
> dropped afaict.
Agree, these functions can be removed. Will update, test, and send in next spin.
>
> > +
> > + return 0;
> > +}
> > +
> > +static void vfio_cdx_release_device(struct vfio_device *core_vdev)
> > +{
> > + struct vfio_cdx_device *vdev =
> > + container_of(core_vdev, struct vfio_cdx_device, vdev);
> > +
> > + vdev->cdx_dev = NULL;
> > + vdev->dev = NULL;
> > +}
> > +
> > +/**
> > + * CDX_DRIVER_OVERRIDE_DEVICE_VFIO - macro used to describe a VFIO
> > + * "driver_override" CDX device.
> > + * @vend: the 16 bit CDX Vendor ID
> > + * @dev: the 16 bit CDX Device ID
> > + *
> > + * This macro is used to create a struct cdx_device_id that matches a
> > + * specific device. driver_override will be set to
> > + * CDX_ID_F_VFIO_DRIVER_OVERRIDE.
> > + */
> > +#define CDX_DRIVER_OVERRIDE_DEVICE_VFIO(vend, dev) \
> > + CDX_DEVICE_DRIVER_OVERRIDE(vend, dev,
> CDX_ID_F_VFIO_DRIVER_OVERRIDE)
> > +
> > +static int vfio_cdx_open_device(struct vfio_device *core_vdev)
> > +{
> > + struct vfio_cdx_device *vdev =
> > + container_of(core_vdev, struct vfio_cdx_device, vdev);
> > + struct cdx_device *cdx_dev = vdev->cdx_dev;
> > + int count = cdx_dev->res_count;
> > + int i;
> > +
> > + vdev->regions = kcalloc(count, sizeof(struct vfio_cdx_region),
> > + GFP_KERNEL);
> > + if (!vdev->regions)
> > + return -ENOMEM;
> > +
> > + for (i = 0; i < count; i++) {
> > + struct resource *res = &cdx_dev->res[i];
> > +
> > + vdev->regions[i].addr = res->start;
> > + vdev->regions[i].size = resource_size(res);
> > + vdev->regions[i].type = res->flags;
> > + /*
> > + * Only regions addressed with PAGE granularity may be
> > + * MMAP'ed securely.
> > + */
> > + if (!(vdev->regions[i].addr & ~PAGE_MASK) &&
> > + !(vdev->regions[i].size & ~PAGE_MASK))
> > + vdev->regions[i].flags |=
> > + VFIO_REGION_INFO_FLAG_MMAP;
> > + vdev->regions[i].flags |= VFIO_REGION_INFO_FLAG_READ;
> > + if (!(cdx_dev->res[i].flags & IORESOURCE_READONLY))
> > + vdev->regions[i].flags |= VFIO_REGION_INFO_FLAG_WRITE;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static void vfio_cdx_regions_cleanup(struct vfio_cdx_device *vdev)
> > +{
> > + kfree(vdev->regions);
> > +}
> > +
> > +static void vfio_cdx_close_device(struct vfio_device *core_vdev)
> > +{
> > + struct vfio_cdx_device *vdev =
> > + container_of(core_vdev, struct vfio_cdx_device, vdev);
> > + int ret;
> > +
> > + vfio_cdx_regions_cleanup(vdev);
> > +
> > + /* reset the device before cleaning up the interrupts */
> > + ret = cdx_dev_reset(&vdev->cdx_dev->dev);
> > + if (WARN_ON(ret))
> > + dev_warn(core_vdev->dev,
> > + "VFIO_CDX: reset device has failed (%d)\n", ret);
>
> WARN_ON() + dev_warn(), do we need both? Curious we're not even using
> vdev->dev here ;)
WARN_ON will be removed and we will keep using core_vdev->dev as vdev->dev
will be removed :)
>
> > +}
> > +
> > +static long vfio_cdx_ioctl(struct vfio_device *core_vdev,
> > + unsigned int cmd, unsigned long arg)
> > +{
> > + struct vfio_cdx_device *vdev =
> > + container_of(core_vdev, struct vfio_cdx_device, vdev);
> > + struct cdx_device *cdx_dev = vdev->cdx_dev;
> > + unsigned long minsz;
> > +
> > + switch (cmd) {
> > + case VFIO_DEVICE_GET_INFO:
> > + {
> > + struct vfio_device_info info;
> > +
> > + minsz = offsetofend(struct vfio_device_info, num_irqs);
> > +
> > + if (copy_from_user(&info, (void __user *)arg, minsz))
> > + return -EFAULT;
> > +
> > + if (info.argsz < minsz)
> > + return -EINVAL;
> > +
> > + info.flags = VFIO_DEVICE_FLAGS_RESET;
>
> There ought to be a flag here indicating the vfio device exposed is
> vfio-cdx.
Sure, will add.
>
> > +
> > + info.num_regions = cdx_dev->res_count;
> > + info.num_irqs = 0;
> > +
> > + return copy_to_user((void __user *)arg, &info, minsz) ?
> > + -EFAULT : 0;
> > + }
> > + case VFIO_DEVICE_GET_REGION_INFO:
> > + {
> > + struct vfio_region_info info;
> > +
> > + minsz = offsetofend(struct vfio_region_info, offset);
> > +
> > + if (copy_from_user(&info, (void __user *)arg, minsz))
> > + return -EFAULT;
> > +
> > + if (info.argsz < minsz)
> > + return -EINVAL;
> > +
> > + if (info.index >= cdx_dev->res_count)
> > + return -EINVAL;
> > +
> > + /* map offset to the physical address */
> > + info.offset = vfio_cdx_index_to_offset(info.index);
> > + info.size = vdev->regions[info.index].size;
> > + info.flags = vdev->regions[info.index].flags;
> > +
> > + if (copy_to_user((void __user *)arg, &info, minsz))
> > + return -EFAULT;
> > + return 0;
> > + }
> > + case VFIO_DEVICE_RESET:
> > + {
> > + return cdx_dev_reset(&vdev->cdx_dev->dev);
> > + }
> > + default:
> > + return -ENOTTY;
> > + }
> > +}
> > +
> > +static int vfio_cdx_mmap_mmio(struct vfio_cdx_region region,
> > + struct vm_area_struct *vma)
> > +{
> > + u64 size = vma->vm_end - vma->vm_start;
> > + u64 pgoff, base;
> > +
> > + pgoff = vma->vm_pgoff &
> > + ((1U << (VFIO_CDX_OFFSET_SHIFT - PAGE_SHIFT)) - 1);
> > + base = pgoff << PAGE_SHIFT;
> > +
> > + if (region.size < PAGE_SIZE || base + size > region.size)
> > + return -EINVAL;
> > +
> > + vma->vm_pgoff = (region.addr >> PAGE_SHIFT) + pgoff;
> > + vma->vm_page_prot = pgprot_device(vma->vm_page_prot);
> > +
> > + return io_remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
> > + size, vma->vm_page_prot);
> > +}
> > +
> > +static int vfio_cdx_mmap(struct vfio_device *core_vdev,
> > + struct vm_area_struct *vma)
> > +{
> > + struct vfio_cdx_device *vdev =
> > + container_of(core_vdev, struct vfio_cdx_device, vdev);
> > + struct cdx_device *cdx_dev = vdev->cdx_dev;
> > + unsigned int index;
> > +
> > + index = vma->vm_pgoff >> (VFIO_CDX_OFFSET_SHIFT - PAGE_SHIFT);
> > +
> > + if (index >= cdx_dev->res_count)
> > + return -EINVAL;
> > +
> > + if (!(vdev->regions[index].flags & VFIO_REGION_INFO_FLAG_MMAP))
> > + return -EINVAL;
> > +
> > + if (!(vdev->regions[index].flags & VFIO_REGION_INFO_FLAG_READ) &&
> > + (vma->vm_flags & VM_READ))
> > + return -EINVAL;
> > +
> > + if (!(vdev->regions[index].flags & VFIO_REGION_INFO_FLAG_WRITE) &&
> > + (vma->vm_flags & VM_WRITE))
> > + return -EINVAL;
> > +
> > + return vfio_cdx_mmap_mmio(vdev->regions[index], vma);
> > +}
> > +
> > +static const struct vfio_device_ops vfio_cdx_ops = {
> > + .name = "vfio-cdx",
> > + .init = vfio_cdx_init_device,
> > + .release = vfio_cdx_release_device,
> > + .open_device = vfio_cdx_open_device,
> > + .close_device = vfio_cdx_close_device,
> > + .ioctl = vfio_cdx_ioctl,
> > + .mmap = vfio_cdx_mmap,
>
> Missing bind_iommufd, unbind_iommufd, and attach_ioas, which will be
> necessary for iommufd support.
Sure, will add these too.
>
> > +};
> > +
> > +static int vfio_cdx_probe(struct cdx_device *cdx_dev)
> > +{
> > + struct vfio_cdx_device *vdev = NULL;
> > + struct device *dev = &cdx_dev->dev;
> > + int ret;
> > +
> > + vdev = vfio_alloc_device(vfio_cdx_device, vdev, dev,
> > + &vfio_cdx_ops);
> > + if (IS_ERR(vdev))
> > + return PTR_ERR(vdev);
> > +
> > + ret = vfio_register_group_dev(&vdev->vdev);
> > + if (ret) {
> > + dev_err(dev, "VFIO_CDX: Failed to add to vfio group\n");
> > + goto out_uninit;
> > + }
> > +
> > + dev_set_drvdata(dev, vdev);
> > + return 0;
> > +
> > +out_uninit:
> > + vfio_put_device(&vdev->vdev);
> > + return ret;
> > +}
> > +
> > +static int vfio_cdx_remove(struct cdx_device *cdx_dev)
> > +{
> > + struct device *dev = &cdx_dev->dev;
> > + struct vfio_cdx_device *vdev;
> > +
> > + vdev = dev_get_drvdata(dev);
> > + vfio_unregister_group_dev(&vdev->vdev);
> > + vfio_put_device(&vdev->vdev);
> > +
> > + return 0;
> > +}
> > +
> > +static const struct cdx_device_id vfio_cdx_table[] = {
> > + { CDX_DRIVER_OVERRIDE_DEVICE_VFIO(CDX_ANY_ID, CDX_ANY_ID) }, /*
> match all by default */
> > + {}
> > +};
> > +
> > +MODULE_DEVICE_TABLE(cdx, vfio_cdx_table);
> > +
> > +static struct cdx_driver vfio_cdx_driver = {
> > + .probe = vfio_cdx_probe,
> > + .remove = vfio_cdx_remove,
> > + .match_id_table = vfio_cdx_table,
> > + .driver = {
> > + .name = "vfio-cdx",
> > + .owner = THIS_MODULE,
> > + },
> > + .driver_managed_dma = true,
> > +};
> > +
> > +static int __init vfio_cdx_driver_init(void)
> > +{
> > + return cdx_driver_register(&vfio_cdx_driver);
> > +}
> > +
> > +static void __exit vfio_cdx_driver_exit(void)
> > +{
> > + cdx_driver_unregister(&vfio_cdx_driver);
> > +}
> > +
> > +module_init(vfio_cdx_driver_init);
> > +module_exit(vfio_cdx_driver_exit);
> > +
> > +MODULE_LICENSE("GPL");
> > +MODULE_DESCRIPTION("VFIO for CDX devices - User Level meta-driver");
> > diff --git a/drivers/vfio/cdx/vfio_cdx_private.h
> b/drivers/vfio/cdx/vfio_cdx_private.h
> > new file mode 100644
> > index 000000000000..5f143736eb7a
> > --- /dev/null
> > +++ b/drivers/vfio/cdx/vfio_cdx_private.h
> > @@ -0,0 +1,31 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2022-2023, Advanced Micro Devices, Inc.
> > + */
> > +
> > +#ifndef VFIO_CDX_PRIVATE_H
> > +#define VFIO_CDX_PRIVATE_H
> > +
> > +#define VFIO_CDX_OFFSET_SHIFT 40
> > +
> > +static inline u64 vfio_cdx_index_to_offset(u32 index)
> > +{
> > + return ((u64)(index) << VFIO_CDX_OFFSET_SHIFT);
> > +}
> > +
> > +struct vfio_cdx_region {
> > + u32 flags;
> > + u32 type;
> > + u64 addr;
> > + resource_size_t size;
> > + void __iomem *ioaddr;
>
> I don't see that ioaddr is used in this patch. Future r/w support?
Agree, will remove this from this patch.
Thanks,
Nipun
> Thanks,
>
> Alex
>
> > +};
> > +
> > +struct vfio_cdx_device {
> > + struct vfio_device vdev;
> > + struct cdx_device *cdx_dev;
> > + struct device *dev;
> > + struct vfio_cdx_region *regions;
> > +};
> > +
> > +#endif /* VFIO_CDX_PRIVATE_H */
> > diff --git a/include/linux/cdx/cdx_bus.h b/include/linux/cdx/cdx_bus.h
> > index 35ef41d8a61a..bead71b7bc73 100644
> > --- a/include/linux/cdx/cdx_bus.h
> > +++ b/include/linux/cdx/cdx_bus.h
> > @@ -14,7 +14,6 @@
> > #include <linux/mod_devicetable.h>
> >
> > #define MAX_CDX_DEV_RESOURCES 4
> > -#define CDX_ANY_ID (0xFFFF)
> > #define CDX_CONTROLLER_ID_SHIFT 4
> > #define CDX_BUS_NUM_MASK 0xF
> >
> > diff --git a/include/linux/mod_devicetable.h
> b/include/linux/mod_devicetable.h
> > index ccaaeda792c0..ccf017353bb6 100644
> > --- a/include/linux/mod_devicetable.h
> > +++ b/include/linux/mod_devicetable.h
> > @@ -912,6 +912,12 @@ struct ishtp_device_id {
> > kernel_ulong_t driver_data;
> > };
> >
> > +#define CDX_ANY_ID (0xFFFF)
> > +
> > +enum {
> > + CDX_ID_F_VFIO_DRIVER_OVERRIDE = 1,
> > +};
> > +
> > /**
> > * struct cdx_device_id - CDX device identifier
> > * @vendor: Vendor ID
> > diff --git a/scripts/mod/devicetable-offsets.c b/scripts/mod/devicetable-
> offsets.c
> > index 62dc988df84d..abe65f8968dd 100644
> > --- a/scripts/mod/devicetable-offsets.c
> > +++ b/scripts/mod/devicetable-offsets.c
> > @@ -265,6 +265,7 @@ int main(void)
> > DEVID(cdx_device_id);
> > DEVID_FIELD(cdx_device_id, vendor);
> > DEVID_FIELD(cdx_device_id, device);
> > + DEVID_FIELD(cdx_device_id, override_only);
> >
> > return 0;
> > }
> > diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
> > index 28da34ba4359..38120f932b0d 100644
> > --- a/scripts/mod/file2alias.c
> > +++ b/scripts/mod/file2alias.c
> > @@ -1458,8 +1458,23 @@ static int do_cdx_entry(const char *filename, void
> *symval,
> > {
> > DEF_FIELD(symval, cdx_device_id, vendor);
> > DEF_FIELD(symval, cdx_device_id, device);
> > + DEF_FIELD(symval, cdx_device_id, override_only);
> >
> > - sprintf(alias, "cdx:v%08Xd%08Xd", vendor, device);
> > + switch (override_only) {
> > + case 0:
> > + strcpy(alias, "cdx:");
> > + break;
> > + case CDX_ID_F_VFIO_DRIVER_OVERRIDE:
> > + strcpy(alias, "vfio_cdx:");
> > + break;
> > + default:
> > + warn("Unknown CDX driver_override alias %08X\n",
> > + override_only);
> > + return 0;
> > + }
> > +
> > + ADD(alias, "v", vendor != CDX_ANY_ID, vendor);
> > + ADD(alias, "d", device != CDX_ANY_ID, device);
> > return 1;
> > }
> >