Re: [PATCH v4 24/26] arch_numa: switch over to numa_memblks

From: Marc Zyngier
Date: Wed Nov 27 2024 - 14:32:28 EST


Hi Mike,

Sorry for reviving a rather old thread.

On Wed, 07 Aug 2024 07:41:08 +0100,
Mike Rapoport <rppt@xxxxxxxxxx> wrote:
>
> From: "Mike Rapoport (Microsoft)" <rppt@xxxxxxxxxx>
>
> Until now arch_numa was directly translating firmware NUMA information
> to memblock.
>
> Using numa_memblks as an intermediate step has a few advantages:
> * alignment with more battle tested x86 implementation
> * availability of NUMA emulation
> * maintaining node information for not yet populated memory
>
> Adjust a few places in numa_memblks to compile with 32-bit phys_addr_t
> and replace current functionality related to numa_add_memblk() and
> __node_distance() in arch_numa with the implementation based on
> numa_memblks and add functions required by numa_emulation.
>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
> Tested-by: Zi Yan <ziy@xxxxxxxxxx> # for x86_64 and arm64
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Tested-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> [arm64 + CXL via QEMU]
> Acked-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> Acked-by: David Hildenbrand <david@xxxxxxxxxx>
> ---
> drivers/base/Kconfig | 1 +
> drivers/base/arch_numa.c | 201 +++++++++++--------------------------
> include/asm-generic/numa.h | 6 +-
> mm/numa_memblks.c | 17 ++--
> 4 files changed, 75 insertions(+), 150 deletions(-)
>

[...]

>  static int __init numa_register_nodes(void)
>  {
>  	int nid;
> -	struct memblock_region *mblk;
> -
> -	/* Check that valid nid is set to memblks */
> -	for_each_mem_region(mblk) {
> -		int mblk_nid = memblock_get_region_node(mblk);
> -		phys_addr_t start = mblk->base;
> -		phys_addr_t end = mblk->base + mblk->size - 1;
> -
> -		if (mblk_nid == NUMA_NO_NODE || mblk_nid >= MAX_NUMNODES) {
> -			pr_warn("Warning: invalid memblk node %d [mem %pap-%pap]\n",
> -				mblk_nid, &start, &end);
> -			return -EINVAL;
> -		}
> -	}
>

This hunk has the unfortunate side effect of killing my ThunderX
extremely early at boot time, as this sorry excuse for a machine
really relies on the kernel recognising that whatever NUMA information
the FW offers is BS.

Reverting this hunk restores happiness (sort of).

FWIW, I've posted a patch with such a revert at [1].
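
For anyone following along, here is a rough userspace sketch of what the
removed check does: walk every memory region and refuse the firmware's
NUMA description if any region has no node or an out-of-range node.
This is not the kernel code. struct region, check_region_nids() and the
values chosen for NUMA_NO_NODE and MAX_NUMNODES are stand-ins assumed
for illustration only.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for the kernel definitions; the values are assumptions. */
#define NUMA_NO_NODE	(-1)
#define MAX_NUMNODES	4

/* Hypothetical, simplified model of a memblock region. */
struct region {
	unsigned long long base;
	unsigned long long size;
	int nid;
};

/*
 * Return 0 if every region carries a usable node id, -EINVAL as soon
 * as one region has no node or a node id beyond MAX_NUMNODES.
 */
static int check_region_nids(const struct region *r, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (r[i].nid == NUMA_NO_NODE || r[i].nid >= MAX_NUMNODES) {
			fprintf(stderr,
				"invalid memblk node %d [mem %#llx-%#llx]\n",
				r[i].nid, r[i].base,
				r[i].base + r[i].size - 1);
			return -EINVAL;
		}
	}
	return 0;
}
```

The point of the early -EINVAL is that the caller gets a chance to
notice that the firmware description is unusable before anything is
built on top of it, which is what this machine depends on.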

Thanks,

M.

[1] https://lore.kernel.org/r/20241127193000.3702637-1-maz@xxxxxxxxxx

--
Without deviation from the norm, progress is not possible.