Re: [RFC v1 1/5] PCI: hv: Create and export hv_build_logical_dev_id()
From: Easwar Hariharan
Date: Tue Apr 14 2026 - 15:44:16 EST
On 4/14/2026 11:06 AM, Michael Kelley wrote:
> From: Easwar Hariharan <easwar.hariharan@xxxxxxxxxxxxxxxxxxx> Sent: Tuesday, April 14, 2026 10:42 AM
>>
>
> [snip]
>
>>>> Thanks for that explanation, that makes sense. I didn't see any serialization
>>>> that would ensure that the VMBus path to communicate the child devices on the bus
>>>> would complete before pci_scan_device() finds and finalizes the pci_dev. I think it's
>>>
>>> FWIW, hv_pci_query_relations() should be ensuring that the communication
>>> has completed before it returns. It does a wait_for_reponse(), which ensures
>>> that the Hyper-V host has sent the PCI_BUS_RELATIONS[2] response. However,
>>> that message spins off work to the hbus->wq workqueue, so
>>> hv_pci_query_relations() has a flush_workqueue() so ensure everything that
>>> was queued has completed.
>>
>> Hm, I read the comment for the flush_workqueue() as addressing the "PCI_BUS_RELATIONS[2]
>> message arrived before we sent the QUERY_BUS_RELATIONS message" race case, not as an
>> "all child devices have definitely been received and processed in response to our
>> QUERY_BUS_RELATIONS message". Also, knowing very little about the VMBus contract, I
>> discounted the 100 ms timeout in wait_for_response() as a serialization guarantee.
>
> Yeah, that timeout is so that the code can wake up every 100 ms to check
> if the device has been rescinded (i.e., removed). If the device isn't
> rescinded, wait_for_response() waits forever until a response comes in.
I don't know how I missed that. :(
>>
>> Chalk it up to previous experience dealing with hardware that's *supposed* to be
>> spec-compliant and complete initialization within specified timings. :)
>>
>> I see now that the flush is sufficient though.
>>
>>>
>>> Thinking more about the "hv_pcibus_installed" case, if that path is ever
>>> triggered, I don't think anything needs to be done with the logical device ID.
>>> The vPCI device has already been fully initialized on the Linux side, and it's
>>> logical device ID would not change.
>>>
>>> So I think you could construct the full logical device ID once
>>> hv_pci_query_relations() returns to hv_pci_probe().
>>
>> Let me think about this more and decide between the logical ID and full bus GUID
>> options.
>>
>>>
>>>> safest to take the approach to communicate the GUID, and find the function number from
>>>> the pci_dev. This does mean that there will be an essentially identical copy of
>>>> hv_build_logical_dev_id() in the IOMMU code, but a comment can explain that.
>>>
>>> With this alternative approach, is there a need to communicate the full
>>> GUID to the pvIOMMU drvier? Couldn't you just communicate bytes 4 thru
>>> 7, which would be logical device ID minus the function number?
>>
>> Yes, we could just communicate bytes 4 through 7 but the pvIOMMU version of the build logical
>> ID function would diverge from the pci-hyperv version. I figured if we say (in a comment)
>> that this is the same ID as generated in pci-hyperv, it's better for future readers to see it
>> to be clearly identical at first glance.
>>
>> It's also possible to change the pci-hyperv function to only take bytes 4 through 7 instead of the
>> full GUID, but I rather think we don't need that impedance mismatch of bytes 4 through 7 of the
>> GUID becoming bytes 0 through 3 of a u32.
>>
>>>
<snip>
>>
>>>>>>>
>>>>>>> I don't think the pci-hyperv driver even needs to tell the IOMMU driver to
>>>>>>> remove the information if a PCI pass-thru device is unbound or removed, as
>>>>>>> the logical device ID will be the same if the device ever comes back. At worst,
>>>>>>> the IOMMU driver can simply replace an existing logical device ID if a new one
>>>>>>> is provided for the same PCI domain ID.
>>>>>>
>>>>>> As above, replacing a unique GUID when a result is found for a non-unique
>>>>>> key value may be prone to failure if it happens that the device that came "back"
>>>>>> is not in fact the same device (or class of device) that went away and just happens
>>>>>> to, either due to bytes 4 and 5 being identical, or due to collision in the
>>>>>> pci_domain_nr_dynamic_ida, have the same domain number.
>>>>
>>>> Given the vPCI team's statements (above), I think we will need to handle unbind or
>>>> removal and ensure the pvIOMMU drivers data structure is invalidated when either
>>>> happens.
>>>
>>> The generic PCI code should handle detaching from the pvIOMMU. So I'm assuming
>>> your statement is specifically about the mapping from domain ID to logical device ID.
>>
>> Yes, apologies for the vagueness (again).
>>
>>> I still think removing it may be unnecessary since adding a mapping for a new vPCI
>>> device with the same domain ID but different logical device ID could just overwrite
>>> any existing mapping. And leaving a dead mapping in the pvIOMMU data structures
>>> doesn’t actually hurt anything. On the other hand, removing/invalidating it is
>>> certainly more tidy and might prevent some confusion down the road.
>>>
>>
>> Yes, if the data structure maps domain -> logical ID, we can do the overwrite as you say.
>> With my approach of informing the pvIOMMU driver of the entire (bus) GUID, we would want
>> to be careful that we don't assume the 1:1 bus<->device case and overwrite an existing
>> device entry with a new device that's on the same bus.
>
> Yes, that's a valid point. I was assuming that the pvIOMMU would use the
> domain ID at the lookup key, since the domain ID is directly available from the
> struct pci_dev that is an input parameter to the IOMMU functions. But in the
> not 1:1 case, that domain ID might refer to a bus with multiple functions. The
> logical device IDs for those devices will be the same except for the low order
> 3 bits that encode with the function number. So maybe the domain ID maps
> to a partial logical device ID, and the pvIOMMU driver must always add in the
> function number so the not 1:1 case works.
Agreed.
>
> Would the pvIOMMU driver do anything with the full GUID, except extract
> bytes 4 through 7? There's no way I see to use the full GUID as the lookup
> key.
>
No, the hypercalls only use the logical ID, so the rest of the GUID is unused.
Thanks,
Easwar (he/him)