Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function

From: Chen, Jiqian
Date: Sun Dec 03 2023 - 22:45:29 EST


Hi Stewart,

On 2023/11/30 23:03, Stewart Hildebrand wrote:
> On 11/30/23 02:03, Chen, Jiqian wrote:
>>
>> On 2023/11/30 11:46, Stefano Stabellini wrote:
>>> On Fri, 24 Nov 2023, Jiqian Chen wrote:
>>>> When a device has been reset on the dom0 side, the vpci on the Xen
>>>> side won't get a notification, so the cached state in vpci is
>>>> out of date with the real device state.
>>>> To solve that problem, this patch adds a function to clear all
>>>> vpci device state when a device is reset on the dom0 side.
>>>>
>>>> Call that function in pcistub_init_device, because when
>>>> we use "pci-assignable-add" to assign a passthrough device in
>>>> Xen, it resets the passthrough device, the vpci state becomes
>>>> out of date, and the device then fails to restore its BAR state.
>>>>
>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@xxxxxxx>
>>>> Signed-off-by: Huang Rui <ray.huang@xxxxxxx>
>>>> ---
>>>> drivers/xen/pci.c | 12 ++++++++++++
>>>> drivers/xen/xen-pciback/pci_stub.c | 3 +++
>>>> include/xen/interface/physdev.h | 2 ++
>>>> include/xen/pci.h | 6 ++++++
>>>> 4 files changed, 23 insertions(+)
>>>>
>>>> diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c
>>>> index 72d4e3f193af..e9b30bc09139 100644
>>>> --- a/drivers/xen/pci.c
>>>> +++ b/drivers/xen/pci.c
>>>> @@ -177,6 +177,18 @@ static int xen_remove_device(struct device *dev)
>>>>  	return r;
>>>>  }
>>>>
>>>> +int xen_reset_device_state(const struct pci_dev *dev)
>>>> +{
>>>> +	struct physdev_pci_device device = {
>>>> +		.seg = pci_domain_nr(dev->bus),
>>>> +		.bus = dev->bus->number,
>>>> +		.devfn = dev->devfn
>>>> +	};
>>>> +
>>>> +	return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_state_reset, &device);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(xen_reset_device_state);
>>>> +
>>>>  static int xen_pci_notifier(struct notifier_block *nb,
>>>>  			    unsigned long action, void *data)
>>>>  {
>>>> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
>>>> index e34b623e4b41..5a96b6c66c07 100644
>>>> --- a/drivers/xen/xen-pciback/pci_stub.c
>>>> +++ b/drivers/xen/xen-pciback/pci_stub.c
>>>> @@ -421,6 +421,9 @@ static int pcistub_init_device(struct pci_dev *dev)
>>>>  	else {
>>>>  		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
>>>>  		__pci_reset_function_locked(dev);
>>>> +		err = xen_reset_device_state(dev);
>>>> +		if (err)
>>>> +			goto config_release;
>>>
>>> Older versions of Xen won't have the hypercall
>>> PHYSDEVOP_pci_device_state_reset implemented. I think we should do
>>> something like:
>>>
>>>     if (err && xen_pvh_domain())
>>>         goto config_release;
>>>
>>>
>>> Or even:
>>>
>>>     if (xen_pvh_domain()) {
>>>         err = xen_reset_device_state(dev);
>>>         if (err)
>>>             goto config_release;
>>>     }
>>>
>>> depending on whether we want to call xen_reset_device_state also for PV
>>> guests or not. I am assuming we don't want to error out on failure such
>>> as -ENOENT for PV guests.
>> Yes, only for PVH dom0. I will add the condition in the next version. Thank you!
>
> We will want to call xen_reset_device_state() for Arm dom0, too, so checking xen_pvh_domain() alone is not sufficient. I suggest checking !xen_pv_domain() instead.
I am not using Arm, but isn't an Arm dom0 a PVH-type dom0?
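
To make the difference between the two checks concrete, here is a rough
user-space sketch of the gating logic being discussed. It is not kernel
code: the enum and the is_pv()/is_pvh() helpers are hypothetical stand-ins
for xen_pv_domain()/xen_pvh_domain(), fake_reset_device_state() is a stub
for the hypercall wrapper, and it assumes (as Stewart's comment implies)
that the PVH check is false on an Arm dom0.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the dom0 flavours discussed in this thread. */
enum dom_kind { DOM_X86_PV, DOM_X86_PVH, DOM_ARM };

/* Stand-ins for xen_pv_domain() and xen_pvh_domain(); assumption: the
 * PVH check does not cover an Arm dom0. */
static bool is_pv(enum dom_kind k)  { return k == DOM_X86_PV; }
static bool is_pvh(enum dom_kind k) { return k == DOM_X86_PVH; }

/* The two candidate gates for calling the reset hypercall. */
static bool gate_pvh_only(enum dom_kind k) { return is_pvh(k); }
static bool gate_not_pv(enum dom_kind k)   { return !is_pv(k); }

/* Stub for xen_reset_device_state(): counts calls and returns the
 * value the caller asks it to simulate. */
static int reset_calls;
static int fake_reset_device_state(int simulated_ret)
{
	reset_calls++;
	return simulated_ret;
}

/* Mirrors the shape of the pcistub_init_device() hunk: a non-zero
 * return here corresponds to taking the "goto config_release" path. */
static int init_device_reset(enum dom_kind k, int simulated_ret)
{
	if (gate_not_pv(k)) {
		int err = fake_reset_device_state(simulated_ret);

		if (err)
			return err;
	}
	return 0;
}
```

With gate_not_pv(), a PV dom0 never issues the call, so a missing
hypercall (for example -ENOENT on an older Xen) cannot fail device init
there, while both the PVH and the Arm case still clear the vpci state.
With gate_pvh_only(), the Arm case would be skipped under the assumption
above.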

>
>>
>>>
>>>
>>>>  		pci_restore_state(dev);
>>>>  	}
>>>>  	/* Now disable the device (this also ensures some private device
>>>> diff --git a/include/xen/interface/physdev.h b/include/xen/interface/physdev.h
>>>> index a237af867873..231526f80f6c 100644
>>>> --- a/include/xen/interface/physdev.h
>>>> +++ b/include/xen/interface/physdev.h
>>>> @@ -263,6 +263,8 @@ struct physdev_pci_device {
>>>>  	uint8_t devfn;
>>>>  };
>>>>
>>>> +#define PHYSDEVOP_pci_device_state_reset 32
>>>> +
>>>> #define PHYSDEVOP_DBGP_RESET_PREPARE 1
>>>> #define PHYSDEVOP_DBGP_RESET_DONE 2
>>>>
>>>> diff --git a/include/xen/pci.h b/include/xen/pci.h
>>>> index b8337cf85fd1..b2e2e856efd6 100644
>>>> --- a/include/xen/pci.h
>>>> +++ b/include/xen/pci.h
>>>> @@ -4,10 +4,16 @@
>>>> #define __XEN_PCI_H__
>>>>
>>>> #if defined(CONFIG_XEN_DOM0)
>>>> +int xen_reset_device_state(const struct pci_dev *dev);
>>>> int xen_find_device_domain_owner(struct pci_dev *dev);
>>>> int xen_register_device_domain_owner(struct pci_dev *dev, uint16_t domain);
>>>> int xen_unregister_device_domain_owner(struct pci_dev *dev);
>>>> #else
>>>> +static inline int xen_reset_device_state(const struct pci_dev *dev)
>>>> +{
>>>> +	return -1;
>>>> +}
>>>> +
>>>>  static inline int xen_find_device_domain_owner(struct pci_dev *dev)
>>>>  {
>>>>  	return -1;
>>>> --
>>>> 2.34.1
>>>>
>>

--
Best regards,
Jiqian Chen.