On Wed, May 26, 2021 at 05:22:22PM -0700, Dave Jiang wrote:
On 5/23/2021 4:50 PM, Jason Gunthorpe wrote:I suppose it depends entirely on how the HW works.
On Fri, May 21, 2021 at 05:20:37PM -0700, Dave Jiang wrote:Hi Jason, thank you for the review of the series.
@@ -77,8 +80,18 @@ int idxd_mdev_host_init(struct idxd_device *idxd, struct mdev_driver *drv)I'm quite surprised that every mdev doesn't create its own ims_domain
return rc;
}
+ ims_info.max_slots = idxd->ims_size;
+ ims_info.slots = idxd->reg_base + idxd->ims_offset;
+ idxd->ims_domain = pci_ims_array_create_msi_irq_domain(idxd->pdev, &ims_info);
+ if (!idxd->ims_domain) {
+ dev_warn(dev, "Fail to acquire IMS domain\n");
+ iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_AUX);
+ return -ENODEV;
+ }
in its probe function.
This places a global total limit on the # of vectors which makes me
ask what was the point of using IMS in the first place ?
The entire idea for IMS was to make the whole allocation system fully
dynamic based on demand.
My understanding is that the driver creates a single IMS domain for the
device and provides the address base and IMS numbers for the domain based on
device IMS resources. So the IMS region needs to be contiguous. Each mdev
can call msi_domain_alloc_irqs() and acquire the number of IMS vectors it
desires and the DEV MSI core code will keep track of which vectors are being
used. This allows the mdev devices to dynamically allocate based on demand.
If the driver allocates a domain per mdev, it'll needs to do internal
accounting of the base and vector numbers for each of those domains that the
MSI core already provides. Isn't that what we are trying to avoid? As mdevs
come and go, that partitioning will become fragmented.
If the HW has a fixed number of interrupt vectors organized in a
single table then by all means allocate a single domain that spans the
entire fixed HW vector space. But then why do we have a ims_size
variable here??
However, that really begs the question of why the HW is using IMS at
all? I'd expect needing 2x-10x the max MSI-X vector size before
reaching for IMS.
So does IDXD really have like a 4k - 40k entry linear IMS vector table
to wrap a shared domain around?
Basically, that isn't really "scalable" it is just "bigger".
Fully scalable would be for every mdev to point to its own 2k entry
IMS table that is allocated on the fly. Every mdev gets a domain and
every domain is fully utilized by the mdev in emulating
MSI-X. Basically for a device like idxd every PASID would have to map
to a IMS vector table array.
I suppose that was not what was done?
Jason