Re: [PATCH v7 07/10] iommufd: Fault-capable hwpt attach/detach/replace

From: Jason Gunthorpe
Date: Tue Jul 09 2024 - 13:37:03 EST


On Mon, Jul 01, 2024 at 01:55:12PM +0800, Baolu Lu wrote:
> On 2024/6/29 5:17, Jason Gunthorpe wrote:
> > On Sun, Jun 16, 2024 at 02:11:52PM +0800, Lu Baolu wrote:
> > > +static int iommufd_fault_iopf_enable(struct iommufd_device *idev)
> > > +{
> > > + struct device *dev = idev->dev;
> > > + int ret;
> > > +
> > > + /*
> > > + * Once we turn on PCI/PRI support for VF, the response failure code
> > > + * should not be forwarded to the hardware due to PRI being a shared
> > > + * resource between PF and VFs. There is no coordination for this
> > > + * shared capability. This waits for a vPRI reset to recover.
> > > + */
> > > + if (dev_is_pci(dev) && to_pci_dev(dev)->is_virtfn)
> > > + return -EINVAL;
> > I don't quite get this remark, isn't not supporting PRI on VFs kind of
> > useless? What is the story here?
>
> This remark is trying to explain why attaching an iopf-capable hwpt to a
> VF is not supported for now. The PCI spec (section 10.4.2.1) states that
> a response failure will disable the PRI on the function. But for PF/VF
> case, the PRI is a shared resource, therefore a response failure on a VF
> might cause iopf on other VFs to malfunction. So, we start from simple
> by not allowing it.

You are talking about IOMMU_PAGE_RESP_FAILURE ?

But this is already bad: something like SVA could trigger
IOMMU_PAGE_RESP_FAILURE on a VF without iommufd today, due to a memory
allocation failure in iommu_report_device_fault().

And then we pass in code from userspace and blindly cast it to
enum iommu_page_response_code ?

Probably we should only support IOMMU_PAGE_RESP_SUCCESS/INVALID
from userspace and block FAILURE entirely. Probably the VMM should
emulate FAILURE by disabling PRI, i.e. by changing to a non-PRI domain.

And this subtle uABI leak needs a fix:

iopf_group_response(group, response.code);

response.code and enum iommu_page_response_code are different
enums, and there is no range check. This needs a static assert at
least, and a range check. Please send a followup patch.
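
Roughly what I have in mind, as a standalone sketch rather than the
actual patch. The two enums below are stand-ins with assumed values,
and fault_resp_code_from_user() is a hypothetical helper name, not an
existing function. It pins the uAPI values to the in-kernel ones at
build time and only lets SUCCESS/INVALID through; FAILURE and anything
out of range gets -EINVAL:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-in for the in-kernel enum; values are assumed for this sketch. */
enum iommu_page_response_code {
	IOMMU_PAGE_RESP_SUCCESS = 0,
	IOMMU_PAGE_RESP_INVALID,
	IOMMU_PAGE_RESP_FAILURE,
};

/* Stand-in for the userspace-visible enum; values are assumed as well. */
enum iommufd_page_response_code {
	IOMMUFD_PAGE_RESP_SUCCESS = 0,
	IOMMUFD_PAGE_RESP_INVALID,
	IOMMUFD_PAGE_RESP_FAILURE,
};

/* Break the build if the two enums ever drift apart. */
static_assert((int)IOMMUFD_PAGE_RESP_SUCCESS == (int)IOMMU_PAGE_RESP_SUCCESS,
	      "uAPI SUCCESS must match the in-kernel value");
static_assert((int)IOMMUFD_PAGE_RESP_INVALID == (int)IOMMU_PAGE_RESP_INVALID,
	      "uAPI INVALID must match the in-kernel value");
static_assert((int)IOMMUFD_PAGE_RESP_FAILURE == (int)IOMMU_PAGE_RESP_FAILURE,
	      "uAPI FAILURE must match the in-kernel value");

/*
 * Hypothetical helper: translate a response code supplied by userspace
 * into the in-kernel enum. Only SUCCESS and INVALID are accepted.
 * FAILURE is rejected on purpose, and so is any value outside the enum.
 * Returns 0 and fills *out on success, -EINVAL otherwise.
 */
int fault_resp_code_from_user(uint32_t ucode,
			      enum iommu_page_response_code *out)
{
	switch (ucode) {
	case IOMMUFD_PAGE_RESP_SUCCESS:
		*out = IOMMU_PAGE_RESP_SUCCESS;
		return 0;
	case IOMMUFD_PAGE_RESP_INVALID:
		*out = IOMMU_PAGE_RESP_INVALID;
		return 0;
	default:
		/* Covers FAILURE and every out-of-range value. */
		return -EINVAL;
	}
}
```

The caller would then check the return value before handing the
translated code to iopf_group_response(), instead of passing
response.code straight through.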

Jason