Re: [RFC PATCH] PCI: avoid SBR for NVIDIA T4

From: Wu Zongyong
Date: Wed Mar 29 2023 - 22:10:24 EST


On Wed, Mar 29, 2023 at 12:05:15PM -0500, Bjorn Helgaas wrote:
> On Wed, Mar 29, 2023 at 07:58:45PM +0800, Wu Zongyong wrote:
> > Secondary bus reset will fail if an NVIDIA T4 card is directly attached
> > to a root port.
>
> Blank line between paragraphs. Rewrap to fill 75 columns if it's a
> single paragraph.
Will be fixed.
>
> Is this only a problem when direct attached to a Root Port? Why would
> that be? If it's *not* related to being directly under a Root Port,
> don't mention that at all.
Yes, this problem occurs only when the T4 card is directly attached to a
Root Port.
I have tested it with a T4 card attached to a PCIe Switch or a PCI Bridge,
and it works well.

>
> > So avoid doing a bus reset; pci_parent_bus_reset() works normally.
> >
> > Maybe the NVIDIA guys can give a detailed explanation of the SBR
> > behaviour of GPUs.
>
> This is a follow-on to 4c207e7121fa ("PCI: Mark some NVIDIA GPUs to
> avoid bus reset"), so probably should have a Fixes: tag so it goes
> wherever that commit goes.
>
> Also copy the subject line from 4c207e7121fa, e.g.,
>
> PCI: Mark NVIDIA T4 GPUs to avoid bus reset
Will be fixed too.
>
> Are there any problem reports or bugzilla issues you can include a URL
> to?
No, I just found the problem in our test environment and didn't find a
similar report.
>
> > Signed-off-by: Wu Zongyong <wuzongyong@xxxxxxxxxxxxxxxxx>
> > ---
> > drivers/pci/quirks.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index 44cab813bf95..be86b93b9e38 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -3618,7 +3618,7 @@ static void quirk_no_bus_reset(struct pci_dev *dev)
> >   */
> >  static void quirk_nvidia_no_bus_reset(struct pci_dev *dev)
> >  {
> > -	if ((dev->device & 0xffc0) == 0x2340)
> > +	if ((dev->device & 0xffc0) == 0x2340 || dev->device == 0x1eb8)
> >  		quirk_no_bus_reset(dev);
> >  }
> > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
> > --
> > 2.34.3
> >
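
For reference, the match condition in the hunk above can be exercised
outside the kernel. This is only a minimal userspace sketch, assuming the
device ID is a plain 16-bit value; nvidia_avoid_bus_reset() is a
hypothetical helper name and is not part of drivers/pci/quirks.c.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the condition tested in
 * quirk_nvidia_no_bus_reset(); it is not kernel code.
 */
static bool nvidia_avoid_bus_reset(uint16_t device)
{
	/* 0xffc0 clears the low six bits, so this matches 0x2340..0x237f. */
	if ((device & 0xffc0) == 0x2340)
		return true;

	/* Exact-match device ID that the patch adds for the T4. */
	return device == 0x1eb8;
}
```

With this check, 0x1eb8 and every ID from 0x2340 through 0x237f are
flagged, while neighbouring IDs such as 0x1eb9 or 0x2380 are not.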