Re: [PATCH v4 1/2] iommu/sva: Tighten SVA bind API with explicit flags

From: Jason Gunthorpe
Date: Thu May 13 2021 - 13:33:09 EST


On Thu, May 13, 2021 at 04:44:14PM +0000, Luck, Tony wrote:
> > For shared workqueue, it can only generate DMA request with PASID. The
> > submission is done by ENQCMDS (S for supervisor) instruction.
> >
> > If we were not to share page tables with init_mm, we need a system PASID
> > that doing the same direct mapping in IOMMU page tables.
>
> Note that for the currently envisioned kernel use cases for accelerators it
> would be OK for this system PASID to just provide either:
>
> 1) A 1:1 mapping for physical addresses. Kernel users of the accelerators
> would provide physical addresses in descriptors.
> 2) The same mapping that the kernel uses for its "1:1" map of all physical
> memory. Users would use kernel virtual addresses in that "1:1" range
> (e.g. those obtained from page_to_virt(struct page *p);)

Well, no, neither of those are OK.

The page table under the kernel PASID should behave the same way that
the kernel would operate the page table assigned to a kernel RID.

If the kernel has security off then the PASID should map to all
physical memory, just like the RID does.

If security is on then every DMA map needs to be loaded into the
PASID's io page table no different than a RID page table.

"kernel SVA" is, IMHO, not a desirable thing, it completely destroys
the kernel's DMA security model.

> If people want to use an accelerator on memory allocated by vmalloc()
> things will get more complicated. But maybe we can delay solving that
> problem until someone comes up with a real use case that needs to
> do this?

If you have a HW limitation that the device can only issue TLPs
with a PASID, even for kernel users, then I think the proper thing is
to tell the IOMMU layer than a certain 'struct device' enters
PASID-only mode and the IOMMU layer should construct an appropriate
PASID and flow the dma operations through it.

Pretending the DMA layer doesn't exist and that PASID gets a free pass
is not OK in the kernel.

Jason