Re: [RFC PATCH 2/3] iommu/dma-iommu: Support iovad->granule > PAGE_SIZE

From: Robin Murphy
Date: Tue Aug 10 2021 - 05:52:02 EST


On 2021-08-09 21:45, Sven Peter wrote:


On Mon, Aug 9, 2021, at 19:41, Robin Murphy wrote:
On 2021-08-07 12:47, Sven Peter via iommu wrote:


On Fri, Aug 6, 2021, at 20:04, Robin Murphy wrote:
On 2021-08-06 16:55, Sven Peter via iommu wrote:
@@ -1006,6 +1019,31 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
if (dev_is_untrusted(dev))
return iommu_dma_map_sg_swiotlb(dev, sg, nents, dir, attrs);
+ /*
+ * If the IOMMU pagesize is larger than the CPU pagesize we will
+ * very likely run into sgs with a physical address that is not aligned
+ * to an IOMMU page boundary. Fall back to just mapping every entry
+ * independently with __iommu_dma_map then.

Scatterlist segments often don't have nicely aligned ends, which is why
we already align things to IOVA granules in main loop here. I think in
principle we'd just need to move the non-IOVA-aligned part of the
address from sg->page to sg->offset in the temporary transformation for
the rest of the assumptions to hold. I don't blame you for being timid
about touching that, though - it took me 3 tries to get right when I
first wrote it...



I've spent some time with that code now and I think we cannot use it
but have to fall back to iommu_dma_map_sg_swiotlb (even though that swiotlb
part is a lie then):

When we have sg_phys(s) = 0x802e65000 with s->offset = 0 the paddr
is aligned to PAGE_SIZE but has an offset of 0x1000 from something
the IOMMU can map.
Now this would result in s->offset = -0x1000 which is already weird
enough.
Offset is unsigned (and 32bit) so this will actually look like
s->offset = 0xfffff000 then, which isn't much better.
And then sg_phys(s) = 0x902e64000 (instead of 0x802e64000) and
we'll map some random memory in iommu_map_sg_atomic and a little bit later
everything explodes.

Now I could probably adjust the phys addr backwards and make sure offset is
always positive (and possibly larger than PAGE_SIZE) and later restore it
in __finalise_sg then but I feel like that's pushing this a little bit too far.

Yes, that's what I meant. At a quick guess, something like the
completely untested diff below.

That unfortunately results in unaligned mappings

You mean it even compiles!? :D

[ 9.630334] iommu: unaligned: iova 0xbff40000 pa 0x0000000801a3b000 size 0x4000 min_pagesz 0x4000

I'll take a closer look later this week and see if I can fix it.

On reflection, "s->offset ^ s_iova_off" is definitely wrong, that more likely wants to be "s->offset & ~s_iova_off".

Robin.

It really comes down to what we want to
achieve here - if it's just to make this thing work at all, then I'd
favour bolting on the absolute minimum changes, possibly even cheating
by tainting the kernel and saying all bets are off instead of trying to
handle the more involved corners really properly. However if you want to
work towards this being a properly-supported thing, then I think it's
worth generalising the existing assumptions of page alignment from the
beginning.

I'd like to try and see if we can make this a properly-supported thing.

That will likely take a few iterations but realistically the rest of the drivers
required to make this platform actually useful (and especially the display controller
and GPU drivers) won't be ready for a few more months anyway. And even on 4KB PAGE_SIZE
kernels half the USB ports and NVMe will work fine, which should be enough to install
a distro and some third-party package that just ships the distro kernel with 16KB
pages.




Sven