RE: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
From: Tian, Kevin
Date: Tue Apr 07 2020 - 00:26:47 EST
> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Sent: Saturday, April 4, 2020 1:26 AM
[...]
> > > > +	if (!pasid_cap.control_reg.paside) {
> > > > +		pr_debug("%s: its PF's PASID capability is not enabled\n",
> > > > +			dev_name(&vdev->pdev->dev));
> > > > +		ret = 0;
> > > > +		goto out;
> > > > +	}
> > >
> > > What happens if the PF's PASID gets disabled while we're using it??
> >
> > This is actually the open I highlighted in cover letter. Per the reply
> > from Baolu, this seems to be an open for bare-metal all the same.
> > https://lkml.org/lkml/2020/3/31/95
>
> Seems that needs to get sorted out before we can expose this. Maybe
> some sort of registration with the PF driver that PASID is being used
> by a VF so it cannot be disabled?
I guess we may do vSVA for PF first, and then add VF vSVA later,
given the additional need above. It's not necessary to enable both
in one step.
[...]
> > > > @@ -1604,6 +1901,18 @@ static int vfio_ecap_init(struct vfio_pci_device *vdev)
> > > > 	if (!ecaps)
> > > > 		*(u32 *)&vdev->vconfig[PCI_CFG_SPACE_SIZE] = 0;
> > > >
> > > > +#ifdef CONFIG_PCI_ATS
> > > > +	if (pdev->is_virtfn) {
> > > > +		struct pci_dev *physfn = pdev->physfn;
> > > > +
> > > > +		ret = vfio_pci_add_emulated_cap_for_vf(vdev,
> > > > +				physfn, epos_max, prev);
> > > > +		if (ret)
> > > > +			pr_info("%s, failed to add special caps for VF %s\n",
> > > > +				__func__, dev_name(&vdev->pdev->dev));
> > > > +	}
> > > > +#endif
> > >
> > > I can only imagine that we should place the caps at the same location
> > > they exist on the PF, we don't know what hidden registers might be
> > > hiding in config space.
Is there any vendor guarantee that hidden registers will be located at
the same offset in both PF and VF config space?
> >
> > but we are not sure whether the same location is available on VF. In
> > this patch, it actually places the emulated cap physically behind the
> > cap which lays farthest (its offset is largest) within VF's config space
> > as the PCIe caps are linked in a chain.
>
> But, as we've found on Broadcom NICs (iirc), hardware developers have a
> nasty habit of hiding random registers in PCI config space, outside of
> defined capabilities. I feel like IGD might even do this too, is that
> true? So I don't think we can guarantee that just because a section of
> config space isn't part of a defined capability that it's unused. It
> only means that it's unused by common code, but it might have device
> specific purposes. So if the PCIe spec indicates that VFs cannot
> include these capabilities and virtualization software needs to
> emulate them, we need somewhere safe to place them in config space, and
> simply placing them off the end of known capabilities doesn't give me
> any confidence. Also, hardware has no requirement to make compact use
> of extended config space. The first capability must be at 0x100, the
> very next capability could consume all the way to the last byte of the
> 4K extended range, and the next link in the chain could be somewhere in
> the middle. Thanks,
>
Then what would be a viable option? A vendor's nasty habit implies
there is no standard, so I don't see how VFIO can find a safe location
by itself. I'm also curious how those hidden registers are identified
by VFIO today and handled with a proper r/w policy. If some sort of
quirk is used, could that quirk mechanism be extended to also carry
information about a vendor-specific safe location? When no such quirk
info is provided (the majority case), VFIO would then find a free
location to carry the new cap.
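
To make the point about the non-compact chain concrete, here is a minimal
userspace sketch (not kernel code and not part of the patch) that walks an
extended capability chain in a 4K config space image. The helper names
read_le32(), write_le32() and walk_ext_caps() are made up for illustration.
The header layout (ID in bits 15:0, next pointer in bits 31:20) and the
iteration bound are assumed to match what the common PCI code uses. Note that
the header carries no length, so a walk like this only yields where each
capability starts, not which bytes between them are truly free.

```c
#include <stdint.h>
#include <string.h>

#define CFG_SPACE_SIZE     0x100   /* first extended capability lives here */
#define CFG_SPACE_EXP_SIZE 0x1000  /* end of the 4K extended config space */

/* Extended capability header: ID in bits 15:0, next pointer in bits 31:20. */
#define EXT_CAP_ID(h)   ((h) & 0xffff)
#define EXT_CAP_NEXT(h) (((h) >> 20) & 0xffc)

/* Config space is little-endian; read/write byte-wise to stay portable. */
static uint32_t read_le32(const uint8_t *cfg, unsigned int pos)
{
	return (uint32_t)cfg[pos] |
	       ((uint32_t)cfg[pos + 1] << 8) |
	       ((uint32_t)cfg[pos + 2] << 16) |
	       ((uint32_t)cfg[pos + 3] << 24);
}

static void write_le32(uint8_t *cfg, unsigned int pos, uint32_t val)
{
	cfg[pos] = val & 0xff;
	cfg[pos + 1] = (val >> 8) & 0xff;
	cfg[pos + 2] = (val >> 16) & 0xff;
	cfg[pos + 3] = (val >> 24) & 0xff;
}

/*
 * Walk the chain in link order and record each header offset in offs[]
 * (up to max entries). Returns the number of capabilities visited, or -1
 * if a next pointer leaves the extended range.
 */
static int walk_ext_caps(const uint8_t *cfg, uint16_t *offs, int max)
{
	unsigned int pos = CFG_SPACE_SIZE;
	/* Each capability needs at least 8 bytes; this bounds looping chains. */
	int ttl = (CFG_SPACE_EXP_SIZE - CFG_SPACE_SIZE) / 8;
	int n = 0;

	while (pos && ttl-- > 0) {
		uint32_t hdr;

		if (pos < CFG_SPACE_SIZE || pos > CFG_SPACE_EXP_SIZE - 4)
			return -1;

		hdr = read_le32(cfg, pos);
		if (hdr == 0 || hdr == 0xffffffff)
			break;	/* no (more) extended capabilities */

		if (n < max)
			offs[n] = (uint16_t)pos;
		n++;
		pos = EXT_CAP_NEXT(hdr);
	}
	return n;
}
```

With a chain linked 0x100 -> 0xf00 -> 0x800, the last link in the chain is
not the capability at the largest offset, and nothing in the headers says
whether the bytes past 0xf00 or between 0x100 and 0x800 are unused.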
Thanks
Kevin