Re: [PATCH 0/3] amd/pci: Add AMD hostbridge supports for newer AMD systems

From: Suravee Suthikulpanit
Date: Thu Mar 06 2014 - 15:03:27 EST


On 3/6/2014 11:40 AM, Bjorn Helgaas wrote:
[+cc Yinghai, sorry I didn't think of it before]

On Wed, Mar 5, 2014 at 11:30 PM, Suravee Suthikulpanit
<suravee.suthikulpanit@xxxxxxx> wrote:
On 3/5/2014 8:13 PM, Suravee Suthikulanit wrote:

On 3/5/2014 3:24 PM, Bjorn Helgaas wrote:

[+cc linux-acpi]

On Wed, Mar 5, 2014 at 2:06 PM, <suravee.suthikulpanit@xxxxxxx> wrote:

From: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>

The current code only supports AMD hostbridges up to family11h.
This causes PCI numa_node information to be reported incorrectly
for newer families with multiple sockets.


Where is the incorrect reporting? In ACPI tables? Is this patch a
way to cover up firmware defects in the ACPI description? Or is this
for machines without ACPI (it seems unlikely that machines with new
AMD processors would not have ACPI)?


This is incorrectly reported in sysfs for each PCI device (e.g.
/devices/pci0000:50/0000:50:00.2/numa_node). Without the patch, these
files return -1.
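
For anyone who wants to check this on their own system, here is a minimal sketch of reading one of these numa_node attributes from user space. It assumes sysfs is mounted at /sys, so the example device above would be /sys/devices/pci0000:50/0000:50:00.2/numa_node. The helper name read_numa_node() is hypothetical and only for illustration; it is not part of the patch.

```c
#include <stdio.h>

/*
 * Read a sysfs numa_node attribute (hypothetical helper, not kernel code).
 * Returns 0 on success and stores the reported node in *node.
 * Returns -1 if the file cannot be opened or does not contain an integer.
 * A separate status is used because -1 is itself a valid reported value,
 * meaning "no node information".
 */
static int read_numa_node(const char *path, int *node)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%d", node) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

On an affected system, a successful call on the path above would be expected to store -1 in *node.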

In arch/x86/pci/acpi.c, the function pci_acpi_scan_root() queries the
node information as follows:

#ifdef CONFIG_ACPI_NUMA
	pxm = acpi_get_pxm(device->handle);
	if (pxm >= 0)
		node = pxm_to_node(pxm);
	if (node != -1)
		set_mp_bus_to_node(busnum, node);
	else
#endif
		node = get_mp_bus_to_node(busnum);

In this case, I see that acpi_get_pxm() returns -1. Therefore, the code
falls back to the node information in mp_bus_to_node[]. So, without
this patch, that lookup also returns -1.
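
The fallback described above can be modeled in user space with a small sketch. Everything here is a stand-in for illustration: bus_to_node[] plays the role of mp_bus_to_node[], sketch_pxm_to_node() is a hypothetical identity mapping rather than the real pxm_to_node(), and a negative pxm argument models acpi_get_pxm() finding no _PXM method.

```c
#include <assert.h>

#define BUS_NR  256
#define NO_NODE (-1)

/* Stand-in for the kernel's mp_bus_to_node[] table. */
static int bus_to_node[BUS_NR];

/* Mark every bus as having no known node, as when nothing fills the table. */
static void init_bus_to_node(void)
{
    for (int i = 0; i < BUS_NR; i++)
        bus_to_node[i] = NO_NODE;
}

/* Hypothetical identity mapping; the real pxm_to_node() uses a lookup table. */
static int sketch_pxm_to_node(int pxm)
{
    return pxm;
}

/*
 * Prefer the firmware-provided proximity domain. If there is none,
 * fall back to whatever the per-bus table already holds.
 */
static int resolve_node(int busnum, int pxm)
{
    int node = NO_NODE;

    assert(busnum >= 0 && busnum < BUS_NR);
    if (pxm >= 0)
        node = sketch_pxm_to_node(pxm);
    if (node != NO_NODE)
        bus_to_node[busnum] = node;   /* record the firmware answer */
    else
        node = bus_to_node[busnum];   /* fallback path */
    return node;
}
```

With no _PXM and an unpopulated table, resolve_node(0x50, -1) yields -1, which matches what sysfs shows. Once the table entry for the bus is filled in, the same call returns that entry instead.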

Also, the spec says that _PXM is optional, so I am not sure whether
this is a firmware bug.

I am not very familiar with this part of ACPI. However, after looking
at the code (in drivers/acpi/pci_root.c: acpi_pci_root_add()), I
believe it tries to locate a _PXM method in the DSDT table, and I
don't see any _PXM methods there.

This sure looks like a firmware bug. True, _PXM is optional, but if
the firmware doesn't provide it, nobody should be surprised that the
OS thinks everything is in the same proximity domain.

I would not endorse extending amd_bus.c for new CPUs. That just
covers up firmware problems like this, and if you ever run a different
OS on the box, you'll trip over them again. And I don't think a patch
like this will even be a possibility for Windows.

Bjorn


I understand, and I am trying to verify this with the BIOS engineers. However, this currently affects family15h servers in the field. We can try to fix ACPI for newer generations of machines, but it won't be practical to push this BIOS fix to all the BIOS vendors and system vendors for older platforms, as they tend to.

What if I limit the extension to only the changes needed to read node information from the hostbridge on family15h, which is what our main server products mostly use? Would that be acceptable?

Suravee
