Re: [PATCH v3 14/30] vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV

From: Cornelia Huck
Date: Thu Feb 10 2022 - 05:07:55 EST


On Mon, Feb 07 2022, Matthew Rosato <mjrosato@xxxxxxxxxxxxx> wrote:

> On 2/7/22 12:59 PM, Cornelia Huck wrote:
>> On Mon, Feb 07 2022, Matthew Rosato <mjrosato@xxxxxxxxxxxxx> wrote:
>>
>>> On 2/7/22 3:35 AM, Cornelia Huck wrote:
>>>> On Fri, Feb 04 2022, Matthew Rosato <mjrosato@xxxxxxxxxxxxx> wrote:
>>>>
>>>>> This was previously removed as unnecessary; while that was true, subsequent
>>>>> changes will make KVM an additional required component for vfio-pci-zdev.
>>>>> Let's re-introduce CONFIG_VFIO_PCI_ZDEV as now there is actually a reason
>>>>> to say 'n' for it (when not planning to CONFIG_KVM).
>>>>
>>>> Hm... can the file be split into parts that depend on KVM and parts that
>>>> don't? Does anybody ever use vfio-pci on a non-kvm s390 system?
>>>>
>>>
>>> It is possible to split out most of the prior CLP/ vfio capability work
>>> (but it would not be a totally clean split, zpci_group_cap for example
>>> would need to have an inline ifdef since it references a KVM structure)
>>> -- I suspect we'll see more of that in the future.
>>> I'm not totally sure if there's value in the information being provided
>>> today -- this CLP information was all added specifically with
>>> userspace->guest delivery in mind. And to answer your other question,
>>> I'm not directly aware of non-kvm vfio-pci usage on s390 today; but that
>>> doesn't mean there isn't any or won't be in the future of course. With
>>> this series, you could CONFIG_KVM=n + CONFIG_VFIO_PCI=y|m and you'll get
>>> the standard vfio-pci support but never any vfio-pci-zdev extension.
>>
>> Yes. Remind me again: if you do standard vfio-pci without the extensions
>> grabbing some card-specific information and making them available to the
>> guest, you get a working setup, it just always looks like a specific
>> card, right?
>>
>
> That's how QEMU treats it anyway, yes. Standard PCI aspects (e.g.
> config space) are fine, but for the s390-specific bits we end up making
> generalizations / using hard-coded values that are subsequently shared
> with the guest when it issues a CLP -- these bits are used to identify
> various s390-specific capabilities of the device (an example: based upon
> the function type, the guest could derive what format of the function
> measurement block can be used. The hard-coded value is otherwise
> effectively 'generic device' so use the basic format for this block).
>
> Basically, we are using vfio to transmit information owned by the host
> s390 PCI layer to, ultimately, the guest s390 PCI layer (modified to
> reflect what kvm+QEMU supports), so that the guest can treat the device
> the same way that the host does. Anything else in between isn't going
> to be interested in that information unless it wants to do something
> very s390-specific.

Thanks, now I remember the details here :)

>
>>>
>>> If we wanted to provide everything we can where KVM isn't strictly
>>> required, then let's look at what a split would look like:
>>>
>>> With or without KVM:
>>> zcpi_base_cap
>>> zpci_group_cap (with an inline ifdef for KVM [1])
>>> zpci_util_cap
>>> zpci_pfip_cap
>>> vfio_pci_info_zdev_add_caps
>>> vfio_pci_zdev_open (ifdef, just return when !KVM [1])
>>> vfio_pci_zdev_release (ifdef, just return when !KVM [1])
>>>
>>> KVM only:
>>> vfio_pci_zdev_feat_interp
>>> vfio_pci_zdev_feat_aif
>>> vfio_pci_zdev_feat_ioat
>>> vfio_pci_zdev_group_notifier
>>>
>>> I suppose such a split has the benefit of flexibility /
>>> future-proofing... should a non-kvm use case arrive in the future for
>>> s390 and we find we need some s390-specific handling, we're still
>>> building vfio-pci-zdev into vfio-pci by default and can just extend that.
>>>
>>> [1] In this case I would propose renaming CONFIG_VFIO_PCI_ZDEV as we
>>> would once again always be building some part of vfio-pci-zdev with
>>> vfio-pci on s390 -- maybe something like CONFIG_VFIO_PCI_ZDEV_KVM (wow
>>> that's a mouthful) and then use this setting to check "KVM" in my above
>>> split. Since this setting will imply PCI, VFIO_PCI and KVM, we can then
>>> s/CONFIG_VFIO_PCI_ZDEV/CONFIG_VFIO_PCI_ZDEV_KVM/ for the rest of the
>>> series (to continue covering cases like we build KVM but not pci, or not
>>> vfio-pci)
>>>
>>> How does that sound?
>>
>> Complex :)
>>
>> I'm not really sure whether it's worth the hassle on an odd chance that
>> we may want it for a !KVM usecase in the future (that goes beyond the
>> "base" vfio-pci support.) OTOH, it would be cleaner. I'm a bit torn on
>> this one.
>>
>
> Well, another option would be to move ahead with this patch as-is,
> except to rename s/CONFIG_VFIO_PCI_ZDEV/CONFIG_VFIO_PCI_ZDEV_KVM/ or
> something like that (and naturally tweak the title and commit message a
> bit). Basically, don't have the name imply a 1:1 relationship with all
> of vfio-pci-zdev, even if it will have that effect in practice for now.
>
> Net result with this series would be we stop building vfio-pci-zdev
> without KVM, which means we remove the zdev CLP capabilities when !KVM.
> And then if we have a !KVM usecase in the future that needs something
> non-standard for s390 (either this CLP info or more likely some other
> s390-specific tweak) we can then perform the split, perhaps just as I
> describe above. In this way we punt the need for complexity until a
> point when (if) we need it, without backing ourselves into a weird case
> where we must rename the config parameter (or live with the fact that we
> always build some part of vfio-pci-zdev even when CONFIG_VFIO_PCI_ZDEV=n)

Ok, I think I like this option the best.