Re: [PATCH 4/8] PCI: Add quirk to disable MSI support for Amazon's Annapurna Labs host bridge

From: Chocron, Jonathan
Date: Sun Jul 14 2019 - 11:09:39 EST


On Fri, 2019-07-12 at 08:04 -0500, Bjorn Helgaas wrote:
> On Thu, Jul 11, 2019 at 05:56:25PM +0300, Jonathan Chocron wrote:
> > On some platforms, the host bridge exposes an MSI-X capability but
> > doesn't actually support it.
> > This causes a crash during initialization by the pcieport driver, since
> > it tries to configure the MSI-X capability.
>
> Nit: The formatting above is jarring to read because I can't tell
> whether it's one paragraph or two.
>
> Either rewrap it into a single paragraph or add a blank line to make
> two paragraphs. I noticed this elsewhere, too, in a comment, I
> think.
>
Ack.

> s/host bridge/Root Port/, if I understand correctly.
>
Ack.

BTW, what is the main difference between the two terms? They seem to be
(mistakenly?) used interchangeably.

> I don't understand the "on some platforms..." part. Do you mean that
> on *every* platform, this particular host bridge (identified by
> [1c36:0031]) advertises an MSI-X capability that doesn't work?
>
> Or are there some platforms that configure the bridge so it doesn't
> advertise MSI-X at all, while other platforms configure it so it
> *does* advertise MSI-X?
>
The MSI-X capability isn't supported for this specific host bridge
([1c36:0031]). On some platforms, it is configured to not advertise the
capability at all, while on others it (mistakenly) does advertise it.

I've updated the commit message to be more explicit.

> If there's a line or two of diagnostics from the crash you could
> include here, that would help people who encounter the crash find
> the solution.
>
Sure, I'll add a partial stacktrace (a bit more than a couple of lines,
but I feel it will be too ambiguous otherwise).

> > Signed-off-by: Jonathan Chocron <jonnyc@xxxxxxxxxx>
> > ---
> > drivers/pci/quirks.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index 11850b030637..0fb70d755977 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -2925,6 +2925,14 @@
> > DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0x10a1,
> > quirk_msi_intx_disable_qca_bug);
> > DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0xe091,
> > quirk_msi_intx_disable_qca_bug);
> > +
> > +static void quirk_al_msi_disable(struct pci_dev *dev)
> > +{
> > + dev->no_msi = 1;
> > + dev_warn(&dev->dev, "Annapurna Labs pcie quirk - disabling MSI\n");
>
> s/pcie/PCIe/ in English text, comments, printk strings, etc.
>
Ack.

> Actually, I think the whole "Annapurna Labs pcie quirk" part is
> probably unnecessary, since we can identify the device via the
> dev_printk() info.
>
Ack.

> Speaking of which, you can use "pci_warn(dev)" here to be consistent
> with the rest of the file.
>
Ack.

> > +}
> > +DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_AMAZON_ANNAPURNA_LABS, 0x0031,
> > + PCI_CLASS_BRIDGE_PCI, 8, quirk_al_msi_disable);
>
> Why do you use the class fixup here instead of the simpler
> DECLARE_PCI_FIXUP_FINAL()? Requiring the class to match
> PCI_CLASS_BRIDGE_PCI suggests that there may be other [1c36:0031]
> devices that are not Root Ports. If that's the case, please mention
> it so it's clear why we need DECLARE_PCI_FIXUP_CLASS_FINAL(). If not,
> just use DECLARE_PCI_FIXUP_FINAL().
>
This is indeed the case. What do you say about adding the following
comment before the function's definition:
/*
 * Amazon's Annapurna Labs 1c36:0031 Root Ports don't support MSI-X, so it
 * should be disabled on platforms where the device (mistakenly) advertises
 * it.
 *
 * The 0031 device ID is reused for other non-Root Port device types,
 * therefore the quirk is registered for the PCI_CLASS_BRIDGE_PCI class only.
 */
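
To illustrate why the class restriction matters, here is a small
user-space model of how, as I understand it, the fixup entries are
matched against a device. It is only a sketch, not kernel code: the
model_* struct and function names and the two example devices are made
up for illustration, and the 0x020000 class used for the non-bridge
function is an arbitrary choice.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirrored from the kernel headers and the IDs quoted in this thread. */
#define PCI_ANY_ID            (~0u)
#define PCI_CLASS_BRIDGE_PCI  0x0604u  /* base class 0x06, subclass 0x04 */
#define AL_VENDOR_ID          0x1c36u  /* from [1c36:0031] */
#define AL_DEVICE_ID          0x0031u

/* Hypothetical stand-ins for struct pci_dev and struct pci_fixup. */
struct model_dev {
	uint16_t vendor;
	uint16_t device;
	uint32_t class;            /* 24 bits: base class, subclass, prog-if */
};

struct model_fixup {
	uint16_t vendor;
	uint16_t device;
	uint32_t class;
	unsigned int class_shift;  /* bits dropped from dev->class before comparing */
};

/* A fixup applies only if class, vendor and device all match (or are wildcards). */
static bool model_fixup_matches(const struct model_fixup *f,
				const struct model_dev *dev)
{
	bool class_ok = f->class == (dev->class >> f->class_shift) ||
			f->class == (uint32_t)PCI_ANY_ID;
	bool vendor_ok = f->vendor == dev->vendor ||
			 f->vendor == (uint16_t)PCI_ANY_ID;
	bool device_ok = f->device == dev->device ||
			 f->device == (uint16_t)PCI_ANY_ID;

	return class_ok && vendor_ok && device_ok;
}

/* Roughly what the class-restricted declaration in the patch amounts to. */
static const struct model_fixup al_class_fixup = {
	AL_VENDOR_ID, AL_DEVICE_ID, PCI_CLASS_BRIDGE_PCI, 8
};

/* Roughly what a plain DECLARE_PCI_FIXUP_FINAL() would amount to. */
static const struct model_fixup al_plain_fixup = {
	AL_VENDOR_ID, AL_DEVICE_ID, PCI_ANY_ID, 0
};

/* Hypothetical devices sharing 1c36:0031: a Root Port and a non-bridge function. */
static const struct model_dev example_root_port = {
	AL_VENDOR_ID, AL_DEVICE_ID, 0x060400
};
static const struct model_dev example_other_dev = {
	AL_VENDOR_ID, AL_DEVICE_ID, 0x020000
};
```

With the class fixup (shift of 8, so the prog-if byte is ignored), only
example_root_port matches. With the plain fixup, example_other_dev
matches as well and would also get no_msi set, which is what we want to
avoid.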

> #endif /* CONFIG_PCI_MSI */

/*
--
2.17.1