Re: [PATCH v3 3/8] iommufd: Add fault and response message definitions

From: Jason Gunthorpe
Date: Fri Mar 22 2024 - 13:04:20 EST


On Thu, Mar 14, 2024 at 09:41:45PM +0800, Baolu Lu wrote:
> On 2024/3/9 1:50, Jason Gunthorpe wrote:
> > On Mon, Jan 22, 2024 at 03:38:58PM +0800, Lu Baolu wrote:
> >
> > > +/**
> > > + * enum iommu_hwpt_pgfault_flags - flags for struct iommu_hwpt_pgfault
> > > + * @IOMMU_PGFAULT_FLAGS_PASID_VALID: The pasid field of the fault data is
> > > + * valid.
> > > + * @IOMMU_PGFAULT_FLAGS_LAST_PAGE: It's the last fault of a fault group.
> > > + */
> > > +enum iommu_hwpt_pgfault_flags {
> > > + IOMMU_PGFAULT_FLAGS_PASID_VALID = (1 << 0),
> > > + IOMMU_PGFAULT_FLAGS_LAST_PAGE = (1 << 1),
> > > +};
> > > +
> > > +/**
> > > + * enum iommu_hwpt_pgfault_perm - perm bits for struct iommu_hwpt_pgfault
> > > + * @IOMMU_PGFAULT_PERM_READ: request for read permission
> > > + * @IOMMU_PGFAULT_PERM_WRITE: request for write permission
> > > + * @IOMMU_PGFAULT_PERM_EXEC: request for execute permission
> > > + * @IOMMU_PGFAULT_PERM_PRIV: request for privileged permission
> >
> > You are going to have to elaborate what PRIV is for.. We don't have
> > any concept of this in the UAPI for iommufd so what is a userspace
> > supposed to do if it hits this? EXEC is similar, we can't actually
> > enable exec permissions from userspace IIRC..
>
> The PCIe spec, section "10.4.1 Page Request Message" and "6.20.2 PASID
> Information Layout":
>
> The PCI PASID TLP Prefix defines "Execute Requested" and "Privileged
> Mode Requested" bits.
>
> PERM_EXEC indicates a page request with a PASID that has the "Execute
> Requested" bit set. Similarly, PERM_PRIV indicates a page request with a
> PASID that has "Privileged Mode Requested" bit set.

Oh, I see! OK. Maybe just add a note that it follows PCIe 10.4.1.

> > > +struct iommu_hwpt_pgfault {
> > > + __u32 size;
> > > + __u32 flags;
> > > + __u32 dev_id;
> > > + __u32 pasid;
> > > + __u32 grpid;
> > > + __u32 perm;
> > > + __u64 addr;
> > > +};
> >
> > Do we need an addr + size here? I've seen a few things where I wonder
> > > if that might become an enhancement someday.
>
> I am not sure. The page size is not part of ATS/PRI. Can you please
> elaborate a bit about how the size could be used? Perhaps I
> misunderstood here?

size would be a hint about how much data the requestor is expecting to
fetch. E.g. if the PRI initiator knows it is going to do a 10MB transfer
it could fill in 10MB and the OS could pre-fault in 10MB of IOVA.

It is not in the spec, it may never be in the spec, but it seems like
it would be good to consider it, or at least make sure we have the
compatibility to add it later.
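
A rough userspace-style sketch of what "compatibility to add it later"
could look like, relying on the leading size field the struct already
has. This is only an illustration: the "length" member, the _v1/_v2
struct names and pgfault_read() are all hypothetical and appear nowhere
in the patch or the spec, and fixed-width types stand in for
__u32/__u64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout as posted in the patch. */
struct pgfault_v1 {
	uint32_t size;
	uint32_t flags;
	uint32_t dev_id;
	uint32_t pasid;
	uint32_t grpid;
	uint32_t perm;
	uint64_t addr;
};

/* HYPOTHETICAL later layout: one new trailing member. */
struct pgfault_v2 {
	uint32_t size;
	uint32_t flags;
	uint32_t dev_id;
	uint32_t pasid;
	uint32_t grpid;
	uint32_t perm;
	uint64_t addr;
	uint64_t length;	/* hint: bytes the requestor expects to fetch */
};

/*
 * Read a fault message of either vintage into the newer layout.
 * Members the sender did not provide stay zero, so length == 0
 * simply means "no hint". Returns 0 on success, -1 if malformed.
 */
int pgfault_read(const void *buf, size_t buf_len, struct pgfault_v2 *out)
{
	uint32_t size;

	assert(buf != NULL && out != NULL);

	if (buf_len < sizeof(struct pgfault_v1))
		return -1;

	/* The first member tells us how much the sender actually wrote. */
	memcpy(&size, buf, sizeof(size));
	if (size < sizeof(struct pgfault_v1) || size > buf_len)
		return -1;

	memset(out, 0, sizeof(*out));
	memcpy(out, buf, size < sizeof(*out) ? size : sizeof(*out));
	return 0;
}
```

As long as new members are only appended and zero means "not
provided", an old producer and a new consumer (or the reverse) keep
working without a flag day.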

> > > + * @addr: The fault address. Must match the addr field of the
> > > + * last iommu_hwpt_pgfault of a reported iopf group.
> > > + */
> > > +struct iommu_hwpt_page_response {
> > > + __u32 size;
> > > + __u32 flags;
> > > + __u32 dev_id;
> > > + __u32 pasid;
> > > + __u32 grpid;
> > > + __u32 code;
> > > + __u64 addr;
> > > +};
> >
> > Do we want some kind of opaque ID value from the kernel here to match
> > request with response exactly? Or is the plan to search on the addr?
>
> I am using the "addr" as the opaque data to search request in this
> series. Is it enough?

I'm not sure, the other discussion about grpid seems to be the main
question so let's see there.
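
To make the two options concrete, here is a minimal sketch of both
lookups over a table of outstanding fault groups. It is not how the
series implements it: struct pending_group, the cookie member and both
helper functions are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one outstanding fault group. */
struct pending_group {
	uint32_t dev_id;
	uint32_t grpid;
	uint64_t last_addr;	/* addr of the last fault in the group */
	uint64_t cookie;	/* HYPOTHETICAL opaque ID, not in the patch */
	int in_use;
};

/* Option A: search on fields the response already carries. */
struct pending_group *match_by_addr(struct pending_group *tbl, size_t n,
				    uint32_t dev_id, uint32_t grpid,
				    uint64_t addr)
{
	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].in_use && tbl[i].dev_id == dev_id &&
		    tbl[i].grpid == grpid && tbl[i].last_addr == addr)
			return &tbl[i];	/* first match wins */
	}
	return NULL;
}

/* Option B: search on an opaque ID handed out with the request. */
struct pending_group *match_by_cookie(struct pending_group *tbl, size_t n,
				      uint64_t cookie)
{
	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].in_use && tbl[i].cookie == cookie)
			return &tbl[i];
	}
	return NULL;
}
```

Option A is only exact if no two outstanding groups can ever share the
same (dev_id, grpid, addr); option B is exact by construction but adds
a field to the uAPI.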

Jason