Re: [tip:x86/urgent] x86/quirks: Add early quirk to reset Apple AirPort card

From: Bjorn Helgaas
Date: Fri Jun 10 2016 - 16:08:12 EST


[+cc Matthew, Doug]

On Fri, Jun 10, 2016 at 03:59:04PM +0200, Lukas Wunner wrote:
> On Fri, Jun 10, 2016 at 02:59:57PM +0200, Ingo Molnar wrote:
> > * Lukas Wunner <lukas@xxxxxxxxx> wrote:
> > > On Fri, Jun 10, 2016 at 01:58:45PM +0200, Ingo Molnar wrote:
> > > > * Yinghai Lu <yinghai@xxxxxxxxxx> wrote:
> > > > > On 6/9/16, Lukas Wunner <lukas@xxxxxxxxx> wrote:
> > > > > > Well, the PCI core would also scan such a bus twice AFAICS.
> > > > > > And the performance penalty of scanning it twice seems negligible.
> > > > > > Early quirks can prevent double execution by setting QFLAG_APPLY_ONCE.
> > > > > > (Three quirks have set that flag already.)
> > > > > >
> > > > > > So I think this shouldn't be a concern.
> > > > >
> > > > > I don't know. I would like see sth like following, and that is simple
> > > > > enough.
> > > > >
> > > > > --- linux-2.6.orig/arch/x86/kernel/early-quirks.c
> > > > > +++ linux-2.6/arch/x86/kernel/early-quirks.c
> > > > > @@ -755,10 +755,16 @@ static int __init check_dev_quirk(int nu
> > > > > return 0;
> > > > > }
> > > > >
> > > > > +static unsigned char __initdata scanned[256];
> > > > > static void __init early_pci_scan_bus(int bus)
> > > > > {
> > > > > int slot, func;
> > > > >
> > > > > + if (scanned[bus])
> > > > > + return;
> > > > > +
> > > > > + scanned[bus] = 1;
> > > > > +
> > > > > /* Poor man's PCI discovery */
> > > > > for (slot = 0; slot < 32; slot++)
> > > > > for (func = 0; func < 8; func++) {
> > > >
> > > > Ok, I removed the fix from tip:x86/urgent from the time being - could you
> > > > guys please send a full version once a final approach is agreed upon?
> > >
> > > IMHO the above patch to prevent double scanning isn't needed
> > > and less code is usually better. So my suggestion would be the
> > > patch as originally sent plus the delta fix I sent yesterday,
> > > either squashed or applied separately.
> > >
> > > Since Yinghai Lu seems to disagree I guess you as the maintainer
> > > will have to make a decision. :-)
> >
> > So I'd lean towards lower complexity, but since this is essentially
> > PCI code I'd like to defer to Bjorn on that detail.

Since you asked my opinion, here it is, but it's probably more than
you wanted :)

This is a serious problem; thank you, Lukas, for chasing this all
down and for the detailed changelog.

- I would split the early_pci_scan_bus() and Nvidia changes to a
separate patch. It's a little bit more to track, but will make it
easier to review, bisect, and revert.

- The delta ("Validate secondary bus number") patch looks good and
should be folded into the early_pci_scan_bus() patch.

- I know you can't use dev_printk, but you could fake it by hand,
e.g., pr_err("pci 0000:%02x:%02x.%d: Cannot power up ...\n");

- The "Resetting" message could mention the reason, e.g., "to
workaround platform firmware issue".

> If I may add some additional information:
>
> drivers/pci/probe.c contains the following comment:
>
> /*
> * The bus might already exist for two reasons: Either we are
> * rescanning the bus or the bus is reachable through more than
> * one bridge. The second case can happen with the i450NX
> * chipset.

- In your case, I do not think it is necessary to keep track of which
buses have been scanned.

The i450NX comment about reaching a bus via multiple bridges was
added by Matthew Wilcox in
http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/?id=82987569b869

I also found this enlightening explanation from Doug Ledford at
https://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0/2.6.0-mm1/broken-out/i450nx-scanning-fix.patch:

On Sun, Nov 09, 2003 at 11:51:36AM -0500, Doug Ledford wrote:
> I can tell you what's going on here. This is a 450NX based
> motherboard. The 450NX chipset from Intel was the first chipset
> to have peer PCI busses. For backwards compatibility, some
> machine makers hacked their PCI BIOS to have a fake bridge
> device on PCI bus 0 that points to the same bus number as the
> peer bus. This way if the OS didn't know about the peer bus
> registers it would still find the devices by scanning behind the
> bridge.

So if we booted one of those i450NX machines today, we would see
something like this:

ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-7f])
pci 0000:00:1c.0: PCI bridge to [bus 80-ff]
ACPI: PCI Root Bridge [PCI1] (domain 0000 [bus 80-ff])

x86 asks the PCI core to scan the PCI0 hierarchy starting at bus 00
(including any subtree on bus 80), then to scan the PCI1 hierarchy
starting at bus 80, so it has to keep track of what it has seen.

In your case, you're scanning only the hierarchy starting at bus 00.
You could see a subtree at bus 80, but since you never initiate a
scan from bus 80, you should never see duplicates.

Bjorn