Re: [PATCH v3 11/18] dmaengine: idxd: ims setup for the vdcm

From: Thomas Gleixner
Date: Wed Sep 30 2020 - 15:57:27 EST


On Tue, Sep 15 2020 at 16:28, Dave Jiang wrote:
> diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
> index a39392157dc2..115a8f49aab3 100644
> --- a/drivers/dma/Kconfig
> +++ b/drivers/dma/Kconfig
> @@ -301,6 +301,7 @@ config INTEL_IDXD_MDEV
> depends on INTEL_IDXD
> depends on VFIO_MDEV
> depends on VFIO_MDEV_DEVICE
> + depends on IMS_MSI_ARRAY

select?

> int idxd_mdev_host_init(struct idxd_device *idxd)
> {
> struct device *dev = &idxd->pdev->dev;

> + ims_info.max_slots = idxd->ims_size;
> + ims_info.slots = idxd->reg_base + idxd->ims_offset;
> + dev->msi_domain =
> pci_ims_array_create_msi_irq_domain(idxd->pdev, &ims_info);

1) creating the domain can fail and checking the return code is overrated

2) dev_set_msi_domain() exists for a reason. If we change any of this in
struct device then we can chase all the open coded mess in drivers
like this.
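
For points 1) and 2), a minimal userspace sketch of the pattern. The types and create_ims_domain() are stand-ins, ims_domain_init() is a hypothetical helper, and returning NULL on failure is an assumption about the domain create function. Only dev_set_msi_domain() mirrors an accessor that exists in the kernel. Which device should carry the domain is a separate question.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the shape matters here. */
struct irq_domain { int id; };
struct device { struct irq_domain *msi_domain; };

/* Mirrors the kernel accessor of the same name. */
static inline void dev_set_msi_domain(struct device *dev, struct irq_domain *d)
{
	dev->msi_domain = d;
}

/*
 * Stand-in for pci_ims_array_create_msi_irq_domain().
 * Assumption: failure is reported as NULL.
 */
static struct irq_domain *create_ims_domain(int max_slots)
{
	static struct irq_domain domain = { .id = 1 };

	return max_slots > 0 ? &domain : NULL;
}

/* Hypothetical init helper: check the result, then use the accessor. */
static int ims_domain_init(struct device *dev, int max_slots)
{
	struct irq_domain *domain = create_ims_domain(max_slots);

	if (!domain)
		return -ENOMEM;

	dev_set_msi_domain(dev, domain);
	return 0;
}
```

With the accessor in place, a later change to how struct device stores the domain does not require touching the driver.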

Also can you please explain how this is supposed to work?

idxd->pdev is the host PCI device. So why are you overwriting the MSI
domain of the underlying host device? This works by chance because you
allocate the regular MSIX interrupts for the host device _before_
invoking this.

IIRC, I provided you ASCII art to show how all of this is supposed to be
structured...

> int vidxd_send_interrupt(struct vdcm_idxd *vidxd, int msix_idx)
> {
> int rc = -1;
> @@ -44,15 +46,63 @@ int vidxd_send_interrupt(struct vdcm_idxd *vidxd, int msix_idx)
> return rc;
> }
>
> +#define IMS_PASID_ENABLE 0x8
> int vidxd_disable_host_ims_pasid(struct vdcm_idxd *vidxd, int ims_idx)

Yet more unreadable glue. The coding style of this stuff is horrible.

> {
> - /* PLACEHOLDER */
> + struct mdev_device *mdev = vidxd->vdev.mdev;
> + struct device *dev = mdev_dev(mdev);
> + unsigned int ims_offset;
> + struct idxd_device *idxd = vidxd->idxd;
> + u32 val;
> +
> + /*
> + * Current implementation limits to 1 WQ for the vdev and therefore
> + * also only 1 IMS interrupt for that vdev.
> + */
> + if (ims_idx >= VIDXD_MAX_WQS) {
> + dev_warn(dev, "ims_idx greater than vidxd allowed: %d\n", ims_idx);

This warning text makes no sense whatsoever.

> + return -EINVAL;
> + }
> +
> + ims_offset = idxd->ims_offset + vidxd->ims_index[ims_idx] * 0x10;
> + val = ioread32(idxd->reg_base + ims_offset + 12);
> + val &= ~IMS_PASID_ENABLE;
> + iowrite32(val, idxd->reg_base + ims_offset + 12);

*0x10 + 12 !?!?

Reusing struct ims_slot from the irq chip driver would not be convoluted
enough, right?

Aside from that, this is fiddling in the IMS storage array behind the irq
chip's back without any comment here and without a big fat comment about
the shared usage of ims_slot::ctrl in the irq chip driver.

This is kernel programming, not the obfuscated C code contest.
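
A rough userspace sketch of what reusing the slot structure could look like. The four-word layout of struct ims_slot is an assumption inferred from the 0x10 stride and the +12 offset in the patch, and the IMS_CTRL_* names and both helpers are hypothetical. The bit values (0x8, shift by 12) are taken from the quoted code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed slot layout: four 32-bit words, 16 bytes per slot. */
struct ims_slot {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
	uint32_t ctrl;
};

/* The struct encodes what "* 0x10 + 12" spells out by hand. */
static_assert(sizeof(struct ims_slot) == 0x10, "slot stride");
static_assert(offsetof(struct ims_slot, ctrl) == 12, "ctrl offset");

/* Hypothetical names for the bits the patch open codes in ctrl. */
#define IMS_CTRL_PASID_ENABLE	0x8u
#define IMS_CTRL_PASID_SHIFT	12

/* Set the PASID and the enable bit; all other ctrl bits are preserved. */
static inline uint32_t ims_ctrl_set_pasid(uint32_t ctrl, uint32_t pasid)
{
	return ctrl | IMS_CTRL_PASID_ENABLE | (pasid << IMS_CTRL_PASID_SHIFT);
}

/* Clear only the enable bit. */
static inline uint32_t ims_ctrl_clear_pasid_enable(uint32_t ctrl)
{
	return ctrl & ~IMS_CTRL_PASID_ENABLE;
}
```

In the driver the read and write would go through ioread32()/iowrite32() on &slot->ctrl, with a comment at both ends stating that the ctrl word is shared with the irq chip.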

> + /* Setup the PASID filtering */
> + pasid = idxd_get_mdev_pasid(mdev);
> +
> + if (pasid >= 0) {
> + ims_offset = idxd->ims_offset + vidxd->ims_index[ims_idx] * 0x10;
> + val = ioread32(idxd->reg_base + ims_offset + 12);
> + val |= IMS_PASID_ENABLE | (pasid << 12) | (val & 0x7);
> + iowrite32(val, idxd->reg_base + ims_offset + 12);

More magic numbers and more fiddling in the IMS slot.

> + } else {
> + dev_warn(dev, "pasid setup failed for ims entry %lld\n", vidxd->ims_index[ims_idx]);
> + return -ENXIO;
> + }
> +
> return 0;
> }
>
> @@ -839,12 +889,43 @@ static void vidxd_wq_disable(struct vdcm_idxd *vidxd, int wq_id_mask)
>
> void vidxd_free_ims_entries(struct vdcm_idxd *vidxd)
> {
> - /* PLACEHOLDER */
> + struct irq_domain *irq_domain;
> + struct mdev_device *mdev = vidxd->vdev.mdev;
> + struct device *dev = mdev_dev(mdev);
> + int i;
> +
> + for (i = 0; i < VIDXD_MAX_MSIX_VECS - 1; i++)
> + vidxd->ims_index[i] = -1;
> +
> + irq_domain = vidxd->idxd->pdev->dev.msi_domain;

See above.

> + msi_domain_free_irqs(irq_domain, dev);

> int vidxd_setup_ims_entries(struct vdcm_idxd *vidxd)
> {
> - /* PLACEHOLDER */
> + struct irq_domain *irq_domain;
> + struct idxd_device *idxd = vidxd->idxd;
> + struct mdev_device *mdev = vidxd->vdev.mdev;
> + struct device *dev = mdev_dev(mdev);
> + int vecs = VIDXD_MAX_MSIX_VECS - 1;
> + struct msi_desc *entry;
> + struct ims_irq_entry *irq_entry;
> + int rc, i = 0;
> +
> + irq_domain = idxd->pdev->dev.msi_domain;

Ditto.

> + rc = msi_domain_alloc_irqs(irq_domain, dev, vecs);
> + if (rc < 0)
> + return rc;
> +
> + for_each_msi_entry(entry, dev) {
> + irq_entry = &vidxd->irq_entries[i];
> + irq_entry->vidxd = vidxd;
> + irq_entry->int_src = i;

Redundant information because it's the index in the array. What for?

> + irq_entry->irq = entry->irq;
> + vidxd->ims_index[i] = entry->device_msi.hwirq;

The point of having two arrays to store related information is?
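
One possible shape, as a userspace sketch only: keep the hardware slot in the entry itself and derive the source index from the entry's position. VIDXD_MAX_IMS, the ims_index member and ims_entry_int_src() are hypothetical, and struct vdcm_idxd is cut down to the one array.

```c
#include <stddef.h>

#define VIDXD_MAX_IMS	4	/* hypothetical array size */

struct vdcm_idxd;

struct ims_irq_entry {
	struct vdcm_idxd *vidxd;
	int irq;		/* Linux interrupt number */
	long ims_index;		/* hardware slot, replaces the parallel array */
};

struct vdcm_idxd {
	struct ims_irq_entry irq_entries[VIDXD_MAX_IMS];
};

/* The interrupt source is the entry's position, so it needs no field. */
static inline ptrdiff_t ims_entry_int_src(const struct ims_irq_entry *e)
{
	return e - e->vidxd->irq_entries;
}
```

This leaves one array to initialize and one to invalidate on free.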

It's at least orders of magnitude better than the previous trainwreck,
but oh well...

Thanks,

tglx