Bjorn,
On a typical AMD system, there are two types of host bridges:
* PCI Root Complex Host bridge (e.g. RD890, SR56xx, etc.)
* CPU Host bridge
Here is an example from a two-socket system:
$ lspci
The host bridge 00:00.0 is basically the PCI root complex, which connects
to the actual PCI bus with PCI devices hanging off of it. The host bridges
00:[18,19].x, however, are the CPU host bridges, each of which represents
a CPU node within the system. In a system with a single root complex, the
root complex is normally connected to node 0 (i.e. 00:18.0) via a
non-coherent HT (I/O) link.
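Each of those CPU host bridges exposes its own node ID in config space.
Here is a minimal sketch of reading it for 00:18.0 (the helper name is
mine, and the "bits [2:0] == NodeId" layout of D18F0x60 is my reading of
the BKDG, matching the "val & 7" in the quirk below):

	#include <linux/errno.h>
	#include <linux/pci.h>

	/* Sketch: fetch the NodeId of the CPU host bridge at 00:18.0 */
	static int nb_node_id(void)
	{
		struct pci_dev *nb;
		u32 val;

		/* device 0x18, function 0 on bus 0 */
		nb = pci_get_domain_bus_and_slot(0, 0, PCI_DEVFN(0x18, 0));
		if (!nb)
			return -ENODEV;

		/* D18F0x60 is the Node ID register */
		pci_read_config_dword(nb, 0x60, &val);
		pci_dev_put(nb);

		return val & 7;		/* NodeId in bits [2:0] */
	}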
Even though the CPU host bridges 00:[18,19].x are on the same bus as the
PCI root complex, they should not be using the NUMA information from the
PCI root complex host bridge. Therefore, I don't think we should be using
pcibus_to_node(dev->bus) here.
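After all, pcibus_to_node() just hands back the node that was cached in
the bus's sysdata when the root bus was scanned, which for bus 0 is the
root complex's locality, not the node the northbridge device actually
represents. Roughly (approximate, from arch/x86/include/asm/pci.h):

	/* Approximate sketch of what backs pcibus_to_node() on x86 */
	static inline int __pcibus_to_node(const struct pci_bus *bus)
	{
		const struct pci_sysdata *sd = bus->sysdata;

		return sd->node;	/* the host bridge's node */
	}

So for 00:[18,19].x it reports where the root complex lives, not which
CPU node the host bridge stands for.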
Only the "val" from pci_read_config_dword(nb_ht, 0x60, &val), should be
used here.
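That is, the assignment should stay as it was before this patch. For
reference, a rough sketch of the pre-patch quirk (abridged from
arch/x86/kernel/quirks.c):

	static void quirk_amd_nb_node(struct pci_dev *dev)
	{
		struct pci_dev *nb_ht;
		unsigned int devfn;
		u32 node;
		u32 val;

		/* Function 0 of the same slot carries the HT config */
		devfn = PCI_DEVFN(PCI_SLOT(dev->devfn), 0);
		nb_ht = pci_get_slot(dev->bus, devfn);
		if (!nb_ht)
			return;

		pci_read_config_dword(nb_ht, 0x60, &val);
		node = val & 7;	/* NodeId from D18F0x60 alone */

		/*
		 * Some hardware may return an invalid node ID,
		 * so check it first:
		 */
		if (node_online(node))
			set_dev_node(&dev->dev, node);
		pci_dev_put(nb_ht);
	}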
On 3/20/2014 5:07 PM, Bjorn Helgaas wrote:
[+cc linux-pci, Myron, Suravee, Kim, Aravind]
On Thu, Mar 13, 2014 at 5:43 AM, Daniel J Blueman
<daniel@xxxxxxxxxxxxx> wrote:
For systems with multiple servers and routed fabric, all northbridges get
assigned to the first server. Fix this by also using the node reported
from the PCI bus. For single-fabric systems, the northbridges are on PCI
bus 0 by definition, which is on NUMA node 0 by definition, so this is
invariant on most systems.
Tested on fam10h and fam15h single- and multi-fabric systems; candidate
for stable.
I wish this had been cc'd to linux-pci. We're talking about a related
change by Suravee there. In fact, we were hoping this quirk could be
removed altogether.
I don't understand what this quirk is doing. Normally we discover the
NUMA node for a PCI host bridge via the ACPI _PXM method. The way
_PXM works is that every PCI device in the hierarchy below the bridge
inherits the same node number as the host bridge. I first thought
this might be a workaround for a system that lacks _PXM, but I don't
think that can be right, because you're only changing the node for a
few devices, not the whole hierarchy.
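The inheritance part works mechanically: the root bus's node comes from
_PXM at scan time and is stashed in sysdata, and child buses simply reuse
the parent's sysdata, e.g. in drivers/pci/probe.c (approximate sketch,
not the full function):

	/* pci_alloc_child_bus(): the child inherits the parent's sysdata,
	 * so pcibus_to_node() answers the same for the whole hierarchy. */
	child->sysdata = parent->sysdata;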
So I suspect the problem is more complicated, and maybe _PXM is
insufficient to describe the topology? Are there subtrees that should
have nodes different from the host bridge?
I know this patch is already in v3.14-rc7, but I'd still like to
understand it so we can do the right thing with Suravee's patch.
Bjorn
Signed-off-by: Daniel J Blueman <daniel@xxxxxxxxxxxxx>
Acked-by: Steffen Persvold <sp@xxxxxxxxxxxxx>
---
arch/x86/kernel/quirks.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
index 04ee1e2..52dbf1e 100644
--- a/arch/x86/kernel/quirks.c
+++ b/arch/x86/kernel/quirks.c
@@ -529,7 +529,7 @@ static void quirk_amd_nb_node(struct pci_dev *dev)
 		return;
 
 	pci_read_config_dword(nb_ht, 0x60, &val);
-	node = val & 7;
+	node = pcibus_to_node(dev->bus) | (val & 7);
 	/*
 	 * Some hardware may return an invalid node ID,
 	 * so check it first:
--
1.8.3.2