Re: [PATCH v2 1/1] vfio/type1: Respect IOMMU reserved regions in vfio_test_domain_fgsp()
From: Alex Williamson
Date: Fri Jan 06 2023 - 12:26:07 EST
On Wed, 4 Jan 2023 16:42:02 +0100
Niklas Schnelle <schnelle@xxxxxxxxxxxxx> wrote:
> Since commit cbf7827bc5dc ("iommu/s390: Fix potential s390_domain
> aperture shrinking") the s390 IOMMU driver uses reserved regions for the
Are you asking for this in v6.2? Seems like the above was introduced
in v6.2 and I can't tell if this is sufficiently prevalent that we need
a fix in the same release.
> system-provided DMA ranges of PCI devices. Previously it reduced the
> size of the IOMMU aperture and checked it on each mapping operation.
> On current machines the system denies use of DMA addresses below 2^32 for
> all PCI devices.
>
> Usually, mapping IOVAs in a reserved region is harmless until a DMA
> actually tries to utilize the mapping. However on s390 there is
> a virtual PCI device called ISM which is implemented in firmware and
> used for cross-LPAR communication. Unlike real PCI devices, this device
> does not use the hardware IOMMU but inspects IOMMU translation tables
> directly on IOTLB flush (s390 RPCIT instruction). If it detects IOVA
> mappings outside the allowed ranges it goes into an error state. This
> error state then causes the device to be unavailable to the KVM guest.
>
> Analysing this we found that vfio_test_domain_fgsp() maps 2 pages at DMA
> address 0 irrespective of the IOMMU's reserved regions. Even if usually
> harmless, this seems wrong in the general case, so instead go through the
> freshly updated IOVA list and try to find a range that isn't reserved,
> fits 2 pages, and is PAGE_SIZE * 2 aligned. If found, use that range for
> testing for fine-grained super pages.
>
> Fixes: 6fe1010d6d9c ("vfio/type1: DMA unmap chunking")
Nit, the above patch pre-dates any notion of reserved regions, so isn't
this actually fixing the implementation of reserved regions in type1 to
include this test? i.e.
Fixes: af029169b8fd ("vfio/type1: Check reserved region conflict and update iovalist")
> Reported-by: Matthew Rosato <mjrosato@xxxxxxxxxxxxx>
> Signed-off-by: Niklas Schnelle <schnelle@xxxxxxxxxxxxx>
> ---
> v1 -> v2:
> - Reworded commit message to hopefully explain things a bit better and
>   highlight that usually just mapping but not issuing DMAs for IOVAs in
>   a reserved region is harmless but still breaks things with ISM devices.
> - Added a check for PAGE_SIZE * 2 alignment (Jason)
>
> drivers/vfio/vfio_iommu_type1.c | 30 +++++++++++++++++++-----------
> 1 file changed, 19 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 23c24fe98c00..87b27ffb93d0 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1856,24 +1856,32 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu,
>   * significantly boosts non-hugetlbfs mappings and doesn't seem to hurt when
>   * hugetlbfs is in use.
>   */
> -static void vfio_test_domain_fgsp(struct vfio_domain *domain)
> +static void vfio_test_domain_fgsp(struct vfio_domain *domain, struct list_head *regions)
>  {
> -        struct page *pages;
>          int ret, order = get_order(PAGE_SIZE * 2);
> +        struct vfio_iova *region;
> +        struct page *pages;
>
>          pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
>          if (!pages)
>                  return;
>
> -        ret = iommu_map(domain->domain, 0, page_to_phys(pages), PAGE_SIZE * 2,
> -                        IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
> -        if (!ret) {
> -                size_t unmapped = iommu_unmap(domain->domain, 0, PAGE_SIZE);
> +        list_for_each_entry(region, regions, list) {
> +                if (region->end - region->start < PAGE_SIZE * 2 ||
> +                    region->start % (PAGE_SIZE*2))
Maybe this falls into the noise, but we don't care if region->start is
aligned to a double page, so long as we can map an aligned double page
within the region. Maybe something like:
        dma_addr_t start = ALIGN(region->start, PAGE_SIZE * 2);

        if (start >= region->end || (region->end - start < PAGE_SIZE * 2))
                continue;
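
Applied to your v2 loop, the whole thing would then be something like
this (untested sketch, only to illustrate the idea):

        list_for_each_entry(region, regions, list) {
                /* Map an aligned double page somewhere inside the
                 * region, not necessarily at its start.
                 */
                dma_addr_t start = ALIGN(region->start, PAGE_SIZE * 2);

                if (start >= region->end || (region->end - start < PAGE_SIZE * 2))
                        continue;

                ret = iommu_map(domain->domain, start, page_to_phys(pages), PAGE_SIZE * 2,
                                IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
                if (!ret) {
                        size_t unmapped = iommu_unmap(domain->domain, start, PAGE_SIZE);

                        if (unmapped == PAGE_SIZE)
                                iommu_unmap(domain->domain, start + PAGE_SIZE, PAGE_SIZE);
                        else
                                domain->fgsp = true;
                }
                break;
        }

That way a region whose start is only page aligned, but which is large
enough to hold an aligned double page, is still usable for the test
instead of being skipped.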
s/region->// for below if so. Thanks,
Alex
> +                        continue;
>
> -                if (unmapped == PAGE_SIZE)
> -                        iommu_unmap(domain->domain, PAGE_SIZE, PAGE_SIZE);
> -                else
> -                        domain->fgsp = true;
> +                ret = iommu_map(domain->domain, region->start, page_to_phys(pages), PAGE_SIZE * 2,
> +                                IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
> +                if (!ret) {
> +                        size_t unmapped = iommu_unmap(domain->domain, region->start, PAGE_SIZE);
> +
> +                        if (unmapped == PAGE_SIZE)
> +                                iommu_unmap(domain->domain, region->start + PAGE_SIZE, PAGE_SIZE);
> +                        else
> +                                domain->fgsp = true;
> +                }
> +                break;
>          }
>
>          __free_pages(pages, order);
> @@ -2326,7 +2334,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>                  }
>          }
>
> -        vfio_test_domain_fgsp(domain);
> +        vfio_test_domain_fgsp(domain, &iova_copy);
>
>          /* replay mappings on new domains */
>          ret = vfio_iommu_replay(iommu, domain);