RE: [PATCH v4] PCI: hv: Fix a timing issue which causes kdump to fail occasionally

From: Wei Hu
Date: Mon Jul 27 2020 - 08:10:51 EST




> -----Original Message-----
> From: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>
> Sent: Monday, July 27, 2020 7:19 PM
> To: Wei Hu <weh@xxxxxxxxxxxxx>
> Cc: KY Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang
> <haiyangz@xxxxxxxxxxxxx>; Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>;
> wei.liu@xxxxxxxxxx; robh@xxxxxxxxxx; bhelgaas@xxxxxxxxxx; linux-
> hyperv@xxxxxxxxxxxxxxx; linux-pci@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; Dexuan Cui <decui@xxxxxxxxxxxxx>; Michael Kelley
> <mikelley@xxxxxxxxxxxxx>
> Subject: Re: [PATCH v4] PCI: hv: Fix a timing issue which causes kdump to fail
> occasionally
>
> On Mon, Jul 27, 2020 at 03:17:31PM +0800, Wei Hu wrote:
> > Kdump can occasionally fail on a Hyper-V guest over an Accelerated
> > Network interface. This is because the retry in hv_pci_enter_d0()
> > releases child device structures in hv_pci_bus_exit(). Although the
> > host sends a second, asynchronous device relations message, the retry
> > fails if this message reaches the guest after
> > hv_send_resource_allocated() is called.
> >
> > Fix the problem by moving the retry to hv_pci_probe() and restarting
> > it from the hv_pci_query_relations() call. This causes a device
> > relations message to arrive at the guest synchronously, so the guest
> > can rebuild the child device structures before calling
> > hv_send_resource_allocated().
> >
> > This problem only happens on Accelerated Network or SRIOV devices
> > because only such devices are currently attached under the vmbus PCI
> > bridge.
> >
> > Fixes: c81992e7f4aa ("PCI: hv: Retry PCI bus D0 entry on invalid
> > device state")
> > Signed-off-by: Wei Hu <weh@xxxxxxxxxxxxx>
> > Reviewed-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
> > ---
> > v2: Add Fixes tag according to Michael Kelley's review comment.
> > v3: Fix a couple of typos and reword the commit message to make it
> > clearer. Thanks to Bjorn Helgaas for the comments.
> > v4: Add more problem descriptions to the commit message
> > and code upon request from Lorenzo Pieralisi.
> >
> > drivers/pci/controller/pci-hyperv.c | 71
> > +++++++++++++++--------------
> > 1 file changed, 37 insertions(+), 34 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pci-hyperv.c
> > b/drivers/pci/controller/pci-hyperv.c
>
> I edited the commit log and a comment in the code to fix a typo and
> pushed it out to pci/hv.
>
> Thanks,
> Lorenzo
>
Thanks, Lorenzo. I appreciate your help!

Wei