On Mon, Sep 28, 2020 at 10:49:57PM +0800, Baolin Wang wrote:
On Mon, Sep 28, 2020 at 03:00:55PM +0100, Will Deacon wrote:
[+ Lorenzo]
On Tue, Sep 22, 2020 at 06:33:24PM +0800, Baolin Wang wrote:
If the BIOS disabled the NUMA configuration but did not change the
proximity domain description in the SRAT table, the PCI root bus
device may get an incorrect node id from acpi_get_node().
How "incorrect" are we talking here? What actually goes wrong? At some
point, we have to trust what the firmware is telling us.
What I mean is, if we disable NUMA from the BIOS
Please define what this means, i.e. are you removing the SRAT from the
ACPI static tables?
but we did not change the PXM for the PCI devices,
If a _PXM maps to a proximity domain that is not described in the SRAT,
your firmware is buggy.
so the PCI devices can still get a NUMA node id from acpi_get_node().
For example, we can still get NUMA node id = 1 in this case from
acpi_get_node(), but numa_nodes_parsed is empty, which means that
node id 1 is invalid. We should add a validation for the node id when
setting the root bus node id.
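
For illustration only, the kind of check being proposed here could be
sketched in plain userspace C as below. This is not the actual patch and
not kernel code: nodes_parsed_mask is a mock standing in for
numa_nodes_parsed, MAX_NUMNODES is an arbitrary bound, and
node_is_parsed() and validate_root_bus_node() are hypothetical helpers.
Only NUMA_NO_NODE (-1) mirrors a real kernel definition.

```c
#include <stdbool.h>

#define MAX_NUMNODES 8      /* arbitrary bound for this sketch */
#define NUMA_NO_NODE (-1)   /* same value the kernel uses for "no node" */

/*
 * Mock of the set of nodes parsed from the SRAT. It stays empty (0) when
 * NUMA is disabled and no proximity domains were parsed.
 */
static unsigned long nodes_parsed_mask;

/* Hypothetical helper: true if 'node' is in the parsed set. */
static bool node_is_parsed(int node)
{
	return node >= 0 && node < MAX_NUMNODES &&
	       (nodes_parsed_mask & (1UL << node));
}

/*
 * Hypothetical helper: take the node id that firmware reported for the
 * root bus (what acpi_get_node() would derive from _PXM) and fall back
 * to NUMA_NO_NODE if that node was never parsed.
 */
static int validate_root_bus_node(int node)
{
	if (node != NUMA_NO_NODE && !node_is_parsed(node))
		return NUMA_NO_NODE;
	return node;
}
```

With nodes_parsed_mask left at 0 (the NUMA-disabled case described
above), validate_root_bus_node(1) yields NUMA_NO_NODE; once bit 1 is set
in the mask, the same call returns 1 unchanged.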
The kernel is not a firmware validation test suite, so fix the firmware
please.
Having said that, please provide a trace log of the issue this is
causing, if any.