Re: [RFC 11/20] iommu/iommufd: Add IOMMU_IOASID_ALLOC/FREE
From: Jason Gunthorpe
Date: Thu Sep 23 2021 - 09:31:40 EST
On Thu, Sep 23, 2021 at 01:20:55PM +0000, Tian, Kevin wrote:
> > > this is not a flow for mdev. It's also required for pdev on Intel platform,
> > > because the pasid table is in HPA space thus must be managed by host
> > > kernel. Even with no translation we still need the user to provide the pasid info.
> >
> > There should be no mandatory vPASID stuff in most of these flows, that
> > is just a special thing ENQCMD virtualization needs. If userspace
> > isn't doing ENQCMD virtualization it shouldn't need to touch this
> > stuff.
>
> No. For one, we also support SVA w/o using ENQCMD. For two, the key
> is that the PASID table cannot be delegated to the userspace like ARM
> or AMD. This implies that for any pasid that the userspace wants to
> enable, it must be configured via the kernel.

Yes, configured through the kernel, but the simplified flow should
have the kernel handle everything and just emit a PASID for userspace
to use.

> just for a short summary of PASID model from previous design RFC:
>
> for arm/amd:
> - pasid space delegated to userspace
> - pasid table delegated to userspace
> - just one call to bind pasid_table() then pasids are fully managed by user
>
> for intel:
> - pasid table is always managed by kernel
> - for pdev,
>   - pasid space is delegated to userspace
>   - attach_ioasid(dev, ioasid, pasid) so the kernel can set up the pasid entry
> - for mdev,
>   - pasid space is managed by userspace
>   - attach_ioasid(dev, ioasid, vpasid). vfio converts vpasid to ppasid. iommufd sets up the ppasid entry
>   - additionally, a contract with kvm to set up CPU pasid translation if enqcmd is used
> - to unify pdev/mdev, just always call it vpasid in attach_ioasid() and let the underlying driver figure out whether vpasid should be translated.
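
For illustration only, the quoted Intel model can be sketched as a small
userspace mock in C. Every name below (sketch_dev, sketch_pasid_table,
sketch_attach_ioasid, the table sizes) is hypothetical and not part of
any kernel API; the only thing taken from the summary above is that the
caller always passes a vpasid and the underlying driver decides whether
to translate it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_VPASID 8    /* size of the per-mdev vPASID map (arbitrary) */
#define SKETCH_MAX_PPASID 64   /* size of the physical PASID table (arbitrary) */
#define SKETCH_PASID_INVALID UINT32_MAX
#define SKETCH_IOASID_NONE (-1)

/* Hypothetical device model; these are not kernel structures. */
struct sketch_dev {
	bool is_mdev;
	uint32_t v2p[SKETCH_MAX_VPASID]; /* vPASID -> pPASID, used for mdev only */
};

/* Hypothetical kernel-managed PASID table: pPASID -> ioasid. */
struct sketch_pasid_table {
	int ioasid[SKETCH_MAX_PPASID];
};

static void sketch_dev_init(struct sketch_dev *dev, bool is_mdev)
{
	dev->is_mdev = is_mdev;
	for (size_t i = 0; i < SKETCH_MAX_VPASID; i++)
		dev->v2p[i] = SKETCH_PASID_INVALID;
}

static void sketch_table_init(struct sketch_pasid_table *tbl)
{
	for (size_t i = 0; i < SKETCH_MAX_PPASID; i++)
		tbl->ioasid[i] = SKETCH_IOASID_NONE;
}

/* The driver, not the caller, decides whether the vPASID is translated. */
static uint32_t sketch_resolve_pasid(const struct sketch_dev *dev,
				     uint32_t vpasid)
{
	if (!dev->is_mdev)
		return vpasid;            /* pdev: value is used as-is */
	if (vpasid >= SKETCH_MAX_VPASID)
		return SKETCH_PASID_INVALID;
	return dev->v2p[vpasid];          /* mdev: vPASID -> pPASID lookup */
}

/* Returns the pPASID whose entry was set up, or -1 on failure. */
static int sketch_attach_ioasid(struct sketch_pasid_table *tbl,
				const struct sketch_dev *dev,
				int ioasid, uint32_t vpasid)
{
	uint32_t ppasid = sketch_resolve_pasid(dev, vpasid);

	if (ppasid >= SKETCH_MAX_PPASID)
		return -1;
	tbl->ioasid[ppasid] = ioasid;
	return (int)ppasid;
}
```

In this mock a pdev attach with vpasid 5 programs entry 5, while an mdev
attach only succeeds once a vPASID to pPASID mapping has been installed.
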

All cases should support a kernel owned ioas associated with a
PASID. This is the universal basic API that all PASID supporting
IOMMUs need to implement.

I should not need to write generic userspace that has to know how to
set up architecture specific nested userspace page tables just to use
PASID!
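
As a rough sketch of what that basic flow could look like from the
caller's side, here is a self-contained mock in C. It is not the
iommufd uAPI: sketch_iommu, sketch_attach_ioas_alloc_pasid and
sketch_detach_pasid are made-up names, and skipping PASID 0 is just an
assumption of the mock. The point is only that the caller hands over a
kernel owned ioas and gets a PASID back, with no arch specific page
table setup in sight.

```c
#include <stdint.h>

#define SKETCH_NR_PASIDS 16     /* arbitrary size for the mock */
#define SKETCH_NO_IOAS (-1)     /* marks a free PASID slot */

/* Hypothetical stand-in for kernel-side state. */
struct sketch_iommu {
	int pasid_to_ioas[SKETCH_NR_PASIDS];
};

static void sketch_iommu_init(struct sketch_iommu *iommu)
{
	for (uint32_t i = 0; i < SKETCH_NR_PASIDS; i++)
		iommu->pasid_to_ioas[i] = SKETCH_NO_IOAS;
}

/*
 * Associate a kernel owned ioas with a PASID chosen by the "kernel".
 * Returns 0 and stores the PASID in *out_pasid, or -1 if the ioas id is
 * invalid or no PASID is free.
 */
static int sketch_attach_ioas_alloc_pasid(struct sketch_iommu *iommu,
					  int ioas_id, uint32_t *out_pasid)
{
	if (ioas_id < 0)
		return -1;

	/* Assumption of this mock: PASID 0 is kept reserved. */
	for (uint32_t pasid = 1; pasid < SKETCH_NR_PASIDS; pasid++) {
		if (iommu->pasid_to_ioas[pasid] == SKETCH_NO_IOAS) {
			iommu->pasid_to_ioas[pasid] = ioas_id;
			*out_pasid = pasid;
			return 0;
		}
	}
	return -1;
}

/* Release a PASID so it can be handed out again. */
static void sketch_detach_pasid(struct sketch_iommu *iommu, uint32_t pasid)
{
	if (pasid > 0 && pasid < SKETCH_NR_PASIDS)
		iommu->pasid_to_ioas[pasid] = SKETCH_NO_IOAS;
}
```

The caller never chooses the PASID value here; it only consumes what the
allocator emits and releases it when done.
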

All of the above is qemu accelerated vIOMMU stuff. It is a good idea
to keep the two areas separate as it greatly informs what is general
code and what is HW specific code.

Jason