Re: [PATCH v2 1/2] rust: pci: skip probing VFs if driver doesn't support VFs
From: Danilo Krummrich
Date: Thu Oct 02 2025 - 13:40:25 EST
On Thu Oct 2, 2025 at 7:37 PM CEST, Danilo Krummrich wrote:
> On Thu Oct 2, 2025 at 7:05 PM CEST, Jason Gunthorpe wrote:
>> On Thu, Oct 02, 2025 at 06:05:28PM +0200, Danilo Krummrich wrote:
>>> On Thu Oct 2, 2025 at 5:23 PM CEST, Jason Gunthorpe wrote:
>>> > This is not what I've been told, the VF driver has significant
>>> > programming model differences in the NVIDIA model, and supports
>>> > different commands.
>>>
>>> Ok, that means there are some more fundamental differences between the host PF
>>> and the "VM PF" code that we have to deal with.
>>
>> That was my understanding.
>>
>>> But that doesn't necessarily require that the VF parts of the host have to be in
>>> nova-core as well, i.e. with the information we have we can differentiate
>>> between PF, VF and PF in the VM (indicated by a device register).
>>
>> I'm not entirely sure what you mean by this..
>>
>> The driver to operate the function in "vGPU" mode as indicated by the
>> register has to be in nova-core, since there is only one device ID.
>
> Yes, the PF driver on the host and the PF (from VM perspective) driver in the VM
> have to be the same. But the VF driver on the host can still be a separate
> one.
>
>>> > If you look at the VFIO driver RFC it basically does no mediation, it
>>> > isn't intercepting MMIO - the guest sees the BARs directly. Most of
>>> > the code is "profiling" from what I can tell. Some config space
>>> > meddling.
>>>
>>> Sure, there is no mediation in that sense, but it needs quite some setup
>>> regardless, no?
>>>
>>> I thought there is a significant amount of semantics that is different between
>>> booting the PF and the VF on the host.
>>
>> I think it would be good to have Zhi clarify more of this, but from
>> what I understand there are at least three activities commingled:
>>
>> 1) Boot the PF in "vGPU" mode so it can enable SRIOV
>
> Ok, this might be where the confusion above comes from. When I talk about
> nova-core in vGPU mode I mean nova-core running in the VM on the (from VM
> perspective) PF.
>
> But you seem to mean nova-core running on the host PF with vGPU on top? That of
> course has to be in nova-core.
>
>> 2) Enable SRIOV and profile VFs to allocate HW resources to them
>
> I think that's partially in nova-core and partially in vGPU; nova-core providing
> the abstraction of the corresponding firmware / hardware interfaces and vGPU
> controlling the semantics of the resource handling?
>
> This is what I thought vGPU has a secondary part for, one that binds to
> nova-core through the auxiliary bus, i.e. vGPU actually consisting of two
> drivers: the VFIO parts and a "per VF resource controller".

Forgot to add: But I think Zhi explained that this is not necessary and can be
controlled by the VFIO driver, i.e. the PCI driver that binds to the VF itself.

>> 3) VFIO variant driver to convert the VF into a "VM PF" with whatever
>> mediation and enhancement needed
>
> That should be vGPU only land.
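
To make the PF / VF / "PF in the VM" distinction from above a bit more concrete,
here is a rough sketch in plain Rust. It is not nova-core code: FunctionRole,
classify() and nova_core_should_bind() are hypothetical names, and the two
booleans are assumptions standing in for "the PCI core reports a virtual
function" and "the device register indicates we run inside a VM".

```rust
/// Hypothetical roles a function with the (single) device ID can have.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FunctionRole {
    /// Physical function on the host.
    HostPf,
    /// Virtual function as seen on the host.
    HostVf,
    /// VF passed through to a guest, where it shows up as a PF.
    GuestPf,
}

/// Classify a function from two assumed inputs:
/// - `is_virtfn`: the PCI core reports this function as a VF (host only),
/// - `in_vm_reg`: stand-in for the device register mentioned in the thread.
fn classify(is_virtfn: bool, in_vm_reg: bool) -> FunctionRole {
    if is_virtfn {
        FunctionRole::HostVf
    } else if in_vm_reg {
        FunctionRole::GuestPf
    } else {
        FunctionRole::HostPf
    }
}

/// Under the split discussed here, nova-core would drive the host PF and the
/// guest "PF", while the host VF is left to a separate (VFIO) driver.
fn nova_core_should_bind(role: FunctionRole) -> bool {
    !matches!(role, FunctionRole::HostVf)
}
```

With this split, skipping VFs at probe time boils down to the HostVf case
returning false, which is what the subject line of this patch is about.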