Re: [PATCH v1 14/14] iommu/arm-smmu-v3: Add arm_smmu_cache_invalidate_user

From: Jason Gunthorpe
Date: Fri Mar 10 2023 - 11:24:28 EST


On Thu, Mar 09, 2023 at 08:20:03PM -0800, Nicolin Chen wrote:
> On Thu, Mar 09, 2023 at 11:31:04AM -0400, Jason Gunthorpe wrote:
> > On Thu, Mar 09, 2023 at 02:49:14PM +0000, Robin Murphy wrote:
> >
> > > If the design here is that user_data is so deeply driver-specific and
> > > special to the point that it can't possibly be passed as a type-checked
> > > union of the known and publicly-visible UAPI types that it is, wouldn't it
> > > make sense to just encode the whole thing in the expected format and not
> > > have to make these kinds of niggling little conversions at both ends?
> >
> > Yes, I suspect the design for ARM should have the input be the entire
> > actual command work queue entry. There is no reason to burn CPU cycles
> > in userspace marshalling it to something else and then decode it again
> > in the kernel. Organize things to point the ioctl directly at the
> > queue entry, and the kernel can do a single memcpy from guest
> > controlled pages to kernel memory then parse it?
>
> There still can be complications to do something straightforward
> like that.

> Firstly, the consumer and producer indexes might need
> to be synced between the host and kernel?

No, qemu would handle this. The kernel would just read the command
entries that qemu tells it to read, which qemu has already sorted out.
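
As a rough illustration of that split (every name below is hypothetical,
none of it is existing qemu or kernel code): the VMM turns the guest's
PROD/CONS values into at most two contiguous runs of 16-byte entries,
and the kernel side only copies the runs it is handed. The sketch
assumes the SMMUv3 convention of a wrap flag in the bit just above the
index bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* An SMMUv3 command queue entry is 16 bytes (two 64-bit words). */
struct cmd_ent {
	uint64_t dw[2];
};

/* Hypothetical: one contiguous run of entries handed to the kernel. */
struct cmd_span {
	uint32_t first;		/* ring index of the first entry */
	uint32_t count;		/* number of entries in the run */
};

/*
 * VMM side (hypothetical helper). prod and cons hold the index in the low
 * log2size bits and the wrap flag in bit log2size. Returns the number of
 * spans written to out[] (0, 1 or 2), or -1 if the guest state is invalid.
 */
static int vmm_pending_spans(uint32_t prod, uint32_t cons,
			     unsigned int log2size, struct cmd_span out[2])
{
	assert(log2size < 31);

	uint32_t nents = 1u << log2size;
	uint32_t mask = nents - 1;
	uint32_t p = prod & mask, c = cons & mask;
	int wrapped = ((prod ^ cons) >> log2size) & 1;

	if (!wrapped) {
		if (p == c)
			return 0;		/* queue is empty */
		if (p < c)
			return -1;		/* producer behind consumer */
		out[0] = (struct cmd_span){ c, p - c };
		return 1;
	}

	if (p > c)
		return -1;			/* more than a full queue */
	out[0] = (struct cmd_span){ c, nents - c };	/* c .. end of ring */
	if (p == 0)
		return 1;
	out[1] = (struct cmd_span){ 0, p };		/* start of ring .. p */
	return 2;
}

/*
 * Kernel side stand-in: a single copy of the span it was told about.
 * In a real kernel this would be a copy_from_user() followed by parsing.
 */
static uint32_t kernel_copy_span(struct cmd_ent *dst,
				 const struct cmd_ent *ring,
				 const struct cmd_span *s)
{
	memcpy(dst, ring + s->first, (size_t)s->count * sizeof(*dst));
	return s->count;
}
```

With this shape the kernel never sees the indexes at all, only
(first, count) pairs the VMM has already validated.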

> Secondly, things like SID and VMID fields in the commands need to
> be replaced manually when the host kernel reads commands out, which
> means that there need to be a translation table(s) in the host
> kernel to replace those fields. These actually are parts of the
> features of VCMDQ hardware itself.

VMID should be ignored in a guest request.

SID translation is a good point. Can qemu do this? How does SID
translation work with VCMDQ in HW? (Jean, this is exactly the sort of
tiny detail that the generic interface ignored.)

What I'm broadly thinking is that if we have to build the infrastructure
for VCMDQ HW accelerated invalidation, then it is not a big step to also
have the kernel SW path use the same infrastructure, just with a CPU
wake up instead of an MMIO poke.

I.e., we have a SW version of VCMDQ to speed up SMMUv3 cases without HW
support.
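
A minimal sketch of that idea, assuming a common submission path whose
only backend-specific step is the notification. All names are
hypothetical; the "doorbell" is a plain variable standing in for an MMIO
register and the "worker" flag stands in for waking a kernel thread.

```c
#include <assert.h>
#include <stdint.h>

struct vqueue;

/* Hypothetical backend hook: how the consumer learns about new entries. */
struct vqueue_ops {
	void (*kick)(struct vqueue *q);
};

struct vqueue {
	const struct vqueue_ops *ops;
	uint32_t prod;			/* producer index, used by both paths */
	volatile uint32_t *doorbell;	/* HW path: stand-in for an MMIO register */
	int worker_pending;		/* SW path: stand-in for a thread wake up */
};

/* HW-accelerated path: poke the (simulated) doorbell with the new index. */
static void hw_kick(struct vqueue *q)
{
	*q->doorbell = q->prod;
}

/* SW path: mark the worker as having work instead of touching hardware. */
static void sw_kick(struct vqueue *q)
{
	q->worker_pending = 1;
}

static const struct vqueue_ops hw_ops = { .kick = hw_kick };
static const struct vqueue_ops sw_ops = { .kick = sw_kick };

/* Common submission path: publish n new entries, then notify the consumer. */
static void vqueue_submit(struct vqueue *q, uint32_t n)
{
	assert(q->ops && q->ops->kick);
	q->prod += n;
	q->ops->kick(q);
}
```

Everything above the kick (queue layout, index handling, validation)
would be shared, so the SW path costs little once the HW one exists.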

I suspect the answer to Robin's question on how to handle errors is
the most important deciding factor. If we have to capture and relay
actual HW errors back to userspace that really suggests we should do
something different than a synchronous ioctl.

Jason