Re: [PATCH] x86/PCI: Claim the resources of firmware enabled IOAPIC before children bus

From: Bjorn Helgaas
Date: Fri Aug 10 2018 - 09:59:36 EST


On Fri, Aug 10, 2018 at 05:25:01PM +0800, joeyli wrote:
> On Wed, Aug 08, 2018 at 04:23:22PM -0500, Bjorn Helgaas wrote:
> ...

> The lspci log shows "Normal decode" on the bridge; I think that means
> positive decode.

Right.

> Hm... I have another question that may not relate to this issue. I
> was tracing the PCI hot-remove/hotplug code path. Based on the spec,
> it looks like RST# should be asserted on hot-remove, and the memory
> decode bit must be set to zero after RST# is asserted. But I didn't
> see any kernel PCI/ACPI code that asserts RST#. The only code that
> might do so is in the POWER architecture. Do you know who asserts
> RST# on hot-remove?

RST# is a conventional PCI signal (not a PCIe signal). In any case, I
would expect signals like that to be handled by hardware, not by
software. What section of the spec are you looking at? I wouldn't
expect any requirements for doing things to a device when the device
is being hot-removed, since the device may already be inaccessible,
e.g., physically unreachable.

On a hot-*add*, there would of course be requirements about how the
device powers up and comes out of reset. For native drivers like
pciehp/shpchp/etc., there are often ways for software to control power
to the slot, e.g., the "Power Controller Control" bit in the PCIe Slot
Control register.
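
As a rough illustration of the "Power Controller Control" bit, here is a
minimal sketch in plain C. The constant mirrors PCI_EXP_SLTCTL_PCC from
Linux's pci_regs.h; the helper names are hypothetical and this is not
how pciehp itself is written. It only manipulates a 16-bit Slot Control
value and does not touch config space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Power Controller Control bit of the PCIe Slot Control register.
 * The value mirrors PCI_EXP_SLTCTL_PCC in Linux's pci_regs.h.
 */
#define SLTCTL_PCC 0x0400u

/* Hypothetical helper: in the PCIe spec, PCC = 0 is "power on". */
static bool slot_power_requested_on(uint16_t sltctl)
{
	return (sltctl & SLTCTL_PCC) == 0;
}

/* Hypothetical helper: return sltctl with only the PCC bit changed. */
static uint16_t slot_set_power(uint16_t sltctl, bool on)
{
	if (on)
		return (uint16_t)(sltctl & ~SLTCTL_PCC);
	return (uint16_t)(sltctl | SLTCTL_PCC);
}
```

A real driver would read Slot Control from the PCIe capability, apply a
change like this, and write the result back, only if the slot reports a
power controller in the first place.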

For ACPI-mediated hotplug (as in your situation), the actual hardware
details are handled by the firmware. All the OS sees are things like
ACPI Notify events, and it uses methods like _STA and the other
objects described in ACPI v6.2, sec 6.3.
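
To make the _STA part concrete, here is a small sketch in plain C that
decodes the low bits of a _STA return value. The bit positions follow
the ACPI spec's _STA definition; the macro names and the "usable"
policy are assumptions for illustration only, not what the kernel's
ACPI code actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low bits of the _STA return value, per the ACPI spec. */
#define STA_PRESENT     0x01u /* device is present */
#define STA_ENABLED     0x02u /* enabled and decoding its resources */
#define STA_SHOW_IN_UI  0x04u /* should be shown in the UI */
#define STA_FUNCTIONING 0x08u /* functioning properly */

/*
 * Hypothetical policy: treat a device as usable only when it is
 * present, enabled, and functioning.
 */
static bool sta_device_usable(uint32_t sta)
{
	uint32_t need = STA_PRESENT | STA_ENABLED | STA_FUNCTIONING;

	return (sta & need) == need;
}
```

Note that the "enabled" bit is defined in terms of the device decoding
its resources, which is the same kind of state discussed earlier in
this thread for the memory decode bit.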

> > What are the chances of getting a firmware fix? Has this firmware
> > already shipped to customers?
>
> The good news is that the machine has not shipped yet. As far as I
> know, the manufacturer is also looking for the root cause of why the
> firmware enabled the memory decode bit and set the wrong addresses.

I don't think it's necessarily a problem that firmware enables the
IOAPIC. This is ACPI-mediated hotplug and it looks like it adds CPUs,
memory, and I/O. I wouldn't be surprised if the firmware has to make
the IOAPIC operational to make some parts of the hot-add work.

The address conflict is the real problem.
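
By "conflict" I mean two address ranges that overlap. A minimal sketch
of that check in plain C, using inclusive start/end like the kernel's
struct resource; the type and function names here are hypothetical and
this is not the kernel's resource code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical address range with inclusive bounds. */
struct mem_range {
	uint64_t start;
	uint64_t end;
};

/* Two inclusive ranges overlap if each starts before the other ends. */
static bool ranges_overlap(struct mem_range a, struct mem_range b)
{
	assert(a.start <= a.end);
	assert(b.start <= b.end);

	return a.start <= b.end && b.start <= a.end;
}
```

If the range firmware programmed into the IOAPIC's BAR overlaps a range
that something else needs to claim, a check like this returns true and
one of the two claims has to fail.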

Bjorn