RE: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs

From: Liu, Yi L
Date: Fri Apr 03 2020 - 07:42:25 EST


Hi Alex,

> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Sent: Friday, April 3, 2020 7:00 AM
> To: Liu, Yi L <yi.l.liu@xxxxxxxxx>
> Subject: Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
>
> On Sun, 22 Mar 2020 05:33:14 -0700
> "Liu, Yi L" <yi.l.liu@xxxxxxxxx> wrote:
>
> > From: Liu Yi L <yi.l.liu@xxxxxxxxx>
> >
> > Per PCIe r5.0, sec 9.3.7.14, if a PF implements the PASID Capability, the
> > PF PASID configuration is shared by its VFs. VFs must not implement their
> > own PASID Capability.
> >
> > Per PCIe r5.0, sec 9.3.7.11, VFs must not implement the PRI Capability. If
> > the PF implements PRI, it is shared by the VFs.
> >
> > On bare metal, it has been fixed by below efforts.
> > to PASID/PRI are
> > https://lkml.org/lkml/2019/9/5/996
> > https://lkml.org/lkml/2019/9/5/995
> >
> > This patch adds emulated PASID/PRI capabilities for VFs when assigned to
> > VMs via vfio-pci driver. This is required for enabling vSVA on pass-through
> > VFs as VFs have no PASID/PRI capability structure in its configure space.
> >
> > Cc: Kevin Tian <kevin.tian@xxxxxxxxx>
> > CC: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx>
> > Cc: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > Cc: Eric Auger <eric.auger@xxxxxxxxxx>
> > Cc: Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx>
> > Signed-off-by: Liu Yi L <yi.l.liu@xxxxxxxxx>
> > ---
> > drivers/vfio/pci/vfio_pci_config.c | 325
> ++++++++++++++++++++++++++++++++++++-
> > 1 file changed, 323 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
> > index 4b9af99..84b4ea0 100644
> > --- a/drivers/vfio/pci/vfio_pci_config.c
> > +++ b/drivers/vfio/pci/vfio_pci_config.c
> > @@ -1509,11 +1509,304 @@ static int vfio_cap_init(struct vfio_pci_device *vdev)
> > return 0;
> > }
> >
> > +static int vfio_fill_custom_vconfig_bytes(struct vfio_pci_device *vdev,
> > + int offset, uint8_t *data, int size)
> > +{
> > + int ret = 0, data_offset = 0;
> > +
> > + while (size) {
> > + int filled;
> > +
> > + if (size >= 4 && !(offset % 4)) {
> > + __le32 *dwordp = (__le32 *)&vdev->vconfig[offset];
> > + u32 dword;
> > +
> > + memcpy(&dword, data + data_offset, 4);
> > + *dwordp = cpu_to_le32(dword);
>
> Why wouldn't we do:
>
> *dwordp = cpu_to_le32(*(u32 *)(data + data_offset));
>
> or better yet, increment data on each iteration for:
>
> *dwordp = cpu_to_le32(*(u32 *)data);
>
> vfio_fill_vconfig_bytes() does almost this same thing, getting the data
> from config space rather than a buffer, so please figure out how to
> avoid duplicating the logic.

Got another alternative. I may use the vfio_fill_vconfig_bytes()
to fill the cap data from PF's config space into VF's vconfig.
And after that, I can further modify the data in vconfig. e.g.
zero the control reg of pasid cap. would it make more sense?

> > + filled = 4;
> > + } else if (size >= 2 && !(offset % 2)) {
> > + __le16 *wordp = (__le16 *)&vdev->vconfig[offset];
> > + u16 word;
> > +
> > + memcpy(&word, data + data_offset, 2);
> > + *wordp = cpu_to_le16(word);
> > + filled = 2;
> > + } else {
> > + u8 *byte = &vdev->vconfig[offset];
> > +
> > + memcpy(byte, data + data_offset, 1);
[...]
>
> > +
> > + memset(map + epos, vpasid_cap.id, len);
>
> See below.
>
> > + ret = vfio_fill_custom_vconfig_bytes(vdev, epos,
> > + (u8 *)&vpasid_cap, len);
> > + if (!ret) {
> > + /*
> > + * Successfully filled in PASID cap, update
> > + * the next offset in previous cap header,
> > + * and also update caller about the offset
> > + * of next cap if any.
> > + */
> > + u32 val = epos;
> > + **prevp &= cpu_to_le32(~(0xffcU << 20));
> > + **prevp |= cpu_to_le32(val << 20);
> > + *prevp = (__le32 *)&vdev->vconfig[epos];
> > + *next = epos + len;
>
> Could we make this any more complicated?

I'm not sure if adding comments addressed this comment. After adding
new cap in vconfig, it needs to update the cap.next field of prior cap.
And in case of further add other cap, this function needs to update the
prevp pointer to the address of the newly added cap.

Regards,
Yi Liu