Re: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and allocation APIs

From: Jason Gunthorpe
Date: Tue May 11 2021 - 10:38:47 EST


On Tue, May 11, 2021 at 09:10:03AM +0000, Tian, Kevin wrote:

> 3) SRIOV, ENQCMD (Intel):
> - "PASID global" with host-allocated PASIDs;
> - PASID table managed by host (in HPA space);
> - all RIDs bound to this ioasid_fd use the global pool;
> - however, exposing global PASID into guest breaks migration;
> - hybrid scheme: split local PASID range and global PASID range;
> - force guest to use only local PASID range (through vIOMMU);
> - for ENQCMD, configure CPU to translate local->global;
> - for non-ENQCMD, setup both local/global pasid entries;
> - uAPI for range split and CPU pasid mapping:
>
> // set to "PASID global"
> ioctl(ioasid_fd, IOASID_SET_HWID_MODE, IOASID_HWID_GLOBAL);
>
> // split local/global range, applying to all RIDs in this fd
> // Example: local [0, 1024), global [1024, max)
> // local PASID range is managed by guest and migrated as VM state
> // global PASIDs are re-allocated and mapped to local PASIDs post migration
> ioctl(ioasid_fd, IOASID_HWID_SET_GLOBAL_MIN, 1024);

I'm still not sold that ranges are the best idea here; they just add
more state that has to match during migration. Keeping the
global/local split per RID seems much cleaner to me.

This is also why I don't really like making the global/local choice a
property of the whole ioasid either. It would be better to specify
global/local as part of each VFIO_ATTACH_IOASID so each device is moved
to the correct allocator.
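
A minimal sketch of what "global/local per attach" could look like,
as plain userspace C rather than kernel code. Every name here
(pasid_alloc_mode, attached_dev, attach_dev, dev_alloc_pasid) is
hypothetical and only illustrates the idea; none of it is an existing
or proposed uAPI. The point is that the mode is recorded per device at
attach time and selects the allocator, so no range split has to be
agreed on up front.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PASID_MAX     16u          /* tiny PASID space, for illustration only */
#define PASID_INVALID UINT32_MAX

/* Hypothetical per-attach choice; not an existing uAPI flag. */
enum pasid_alloc_mode { PASID_ALLOC_LOCAL, PASID_ALLOC_GLOBAL };

struct pasid_allocator {
	bool used[PASID_MAX];
};

/* One pool shared by every device attached in "global" mode. */
static struct pasid_allocator global_pool;

struct attached_dev {
	enum pasid_alloc_mode mode;
	struct pasid_allocator local;  /* private per-RID space */
};

/* Stand-in for what a VFIO_ATTACH_IOASID carrying a mode would record. */
static void attach_dev(struct attached_dev *dev, enum pasid_alloc_mode mode)
{
	assert(dev != NULL);
	*dev = (struct attached_dev){ .mode = mode };
}

/* Return the lowest free PASID in the allocator, or PASID_INVALID if full. */
static uint32_t pasid_alloc_from(struct pasid_allocator *a)
{
	for (uint32_t i = 0; i < PASID_MAX; i++) {
		if (!a->used[i]) {
			a->used[i] = true;
			return i;
		}
	}
	return PASID_INVALID;
}

/* The device's mode, fixed at attach time, picks the allocator. */
static uint32_t dev_alloc_pasid(struct attached_dev *dev)
{
	struct pasid_allocator *a =
		dev->mode == PASID_ALLOC_GLOBAL ? &global_pool : &dev->local;

	return pasid_alloc_from(a);
}
```

With this shape, two devices attached as local can both hold the same
PASID value independently, while devices attached as global always draw
distinct values from the shared pool.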

> When considering SIOV/mdev there is no change to above uAPI sequence.
> It's n/a for 1) as SIOV requires PASID table in HPA space, nor does it
> cause any change to 3) regarding to the split range scheme. The only
> conceptual change is in 2), where although it's still "PASID per RID" the
> PASIDs must be managed by host because the parent driver also allocates
> PASIDs from per-RID space to mark mdev (RID+PASID). But this difference
> doesn't change the uAPI flow - just treat user-provisioned PASID as 'virtual'
> and then allocate a 'real' PASID at IOASID_SET_HWID. Later always use the
> real one when programming PASID entry (IOASID_BIND_PGTABLE) or device
> PASID register (converted in the mediation path).

It does need some user-visible difference, because SIOV/mdev is not
migratable. Only the kernel can select a PASID; userspace (and hence
the guest) shouldn't have the option to force a specific PASID, as the
PASID space is shared across the entire RID by all VMs using the mdev.
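
As a rough illustration of that constraint, again in plain userspace C
with made-up names (rid_pasid_space, mdev, mdev_alloc_pasid are all
hypothetical): the allocator belongs to the parent RID, every mdev
draws from it, and a request that tries to name a specific PASID is
simply refused.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define PASID_MAX 16u  /* tiny PASID space, for illustration only */

/* PASID space of one parent RID, shared by every mdev carved from it. */
struct rid_pasid_space {
	bool used[PASID_MAX];
};

struct mdev {
	struct rid_pasid_space *parent;
};

/*
 * Hypothetical request: wanted < 0 means "kernel picks".
 * Any attempt to force a specific PASID is rejected, because the value
 * could collide with one handed to another VM on the same RID.
 * Returns 0 and stores the PASID in *out, or a negative errno.
 */
static int mdev_alloc_pasid(struct mdev *m, int64_t wanted, uint32_t *out)
{
	assert(m && m->parent && out);

	if (wanted >= 0)
		return -EINVAL;

	for (uint32_t i = 0; i < PASID_MAX; i++) {
		if (!m->parent->used[i]) {
			m->parent->used[i] = true;
			*out = i;
			return 0;
		}
	}
	return -ENOSPC;
}
```

Two mdevs on the same parent therefore never see the same PASID, and
neither can predict which value it will be given, which is also why the
value cannot be carried across a migration.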

I don't see any alternative to telling every part whether the PASID is
going to be used by ENQCMD or not; too many important decisions rest
on this detail.

Jason