Re: [PATCH] LoongArch: Fix cpu hotplug issue
From: Huacai Chen
Date: Mon Oct 14 2024 - 04:27:09 EST
On Mon, Oct 14, 2024 at 4:01 PM maobibo <maobibo@xxxxxxxxxxx> wrote:
>
> Huacai,
>
> On 2024/10/14 下午3:39, Huacai Chen wrote:
> > On Mon, Oct 14, 2024 at 3:21 PM maobibo <maobibo@xxxxxxxxxxx> wrote:
> >>
> >> Huacai,
> >>
> >> On 2024/10/14 下午3:05, Huacai Chen wrote:
> >>> Hi, Bibo,
> >>>
> >>> I'm a little confused, so please correct me if I'm wrong.
> >>>
> >>> On Mon, Oct 14, 2024 at 2:33 PM Bibo Mao <maobibo@xxxxxxxxxxx> wrote:
> >>>>
> >>>> On LoongArch system, there are two places to set cpu numa node. One
> >>>> is in arch specified function smp_prepare_boot_cpu(), the other is
> >>>> in generic function early_numa_node_init(). The latter will overwrite
> >>>> the numa node information.
> >>>>
> >>>> However for hot-added cpu, cpu_logical_map() fails to its physical
> >>>> cpuid at beginning since it is not enabled in ACPI MADT table. So
> >>>> function early_cpu_to_node() also fails to get its numa node for
> >>>> hot-added cpu, and generic function early_numa_node_init() will
> >>>> overwrite incorrect numa node.
> >>> For hot-added cpus, we will call acpi_map_cpu() -->
> >>> acpi_map_cpu2node() --> set_cpuid_to_node(), and set_cpuid_to_node()
> >>> operates on __cpuid_to_node[]. So I think early_cpu_to_node() should
> >>> be correct?
> >>
> >> __cpuid_to_node[] is correct which is physical cpuid to numa node,
> >> however cpu_logical_map(cpu) is not set. It fails to get physical cpuid
> >> from logic cpu.
> >>
> >> int early_cpu_to_node(int cpu)
> >> {
> >> int physid = cpu_logical_map(cpu);
> >>
> >> <<<<<<<<<<< Here physid is -1.
> > early_cpu_to_node() is not supposed to be called after boot, and if it
> Which calls early_cpu_to_node() after boot?
>
> > is really needed, I think a better solution is:
> >
> > diff --git a/arch/loongarch/kernel/acpi.c b/arch/loongarch/kernel/acpi.c
> > index f1a74b80f22c..998cf45fd3b7 100644
> > --- a/arch/loongarch/kernel/acpi.c
> > +++ b/arch/loongarch/kernel/acpi.c
> > @@ -311,6 +311,8 @@ static int __ref acpi_map_cpu2node(acpi_handle
> > handle, int cpu, int physid)
> >
> > nid = acpi_get_node(handle);
> > if (nid != NUMA_NO_NODE) {
> > + __cpu_number_map[physid] = cpu;
> > + __cpu_logical_map[cpu] = physid;
> This does not solve the problem. The above has been done in function
> cpu = set_processor_mask(physid, ACPI_MADT_ENABLED);
>
> static int set_processor_mask(u32 id, u32 flags)
> {
> ...
> if (flags & ACPI_MADT_ENABLED) {
> num_processors++;
> set_cpu_present(cpu, true);
> __cpu_number_map[cpuid] = cpu;
> __cpu_logical_map[cpu] = cpuid;
> }
>
> The problem is that
> smp_prepare_boot_cpu(); /* arch-specific boot-cpu hooks */
> <<<<<<<<<<<<<<<<
> set_cpu_numa_node() is called in function smp_prepare_boot_cpu()
>
> early_numa_node_init();
>
> static void __init early_numa_node_init(void)
> {
> #ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
> #ifndef cpu_to_node
> int cpu;
>
> /* The early_cpu_to_node() should be ready here. */
> for_each_possible_cpu(cpu)
> set_cpu_numa_node(cpu, early_cpu_to_node(cpu));
> <<<<<<<<<<<<<<<<
> * however here early_cpu_to_node is -1, so that cpu_to_node(cpu) will
> always return -1 in late. *, which causes cpu hotadd problem.
Still confused. For ACPI_MADT_ENABLED cpus, everything is right after
early_numa_node_init(). For !ACPI_MADT_ENABLED cpus, cpu_to_node()
returns -1 after early_numa_node_init() and before hot-add, but if
acpi_map_cpu() do things right, cpu_to_node() should still work well
after hot-add.
Huacai
>
> Regards
> Bibo Mao
>
>
> > set_cpuid_to_node(physid, nid);
> > node_set(nid, numa_nodes_parsed);
> > set_cpu_numa_node(cpu, nid);
> >
> > Huacai
> >
> >>
> >> if (physid < 0)
> >> return NUMA_NO_NODE;
> >>
> >> return __cpuid_to_node[physid];
> >> }
> >>
> >> Regards
> >> Bibo Mao
> >>>
> >>> Huacai
> >>>
> >>>>
> >>>> Here static array __cpu_to_node and api set_early_cpu_to_node()
> >>>> is added, so that early_cpu_to_node is consistent with function
> >>>> cpu_to_node() for hot-added cpu.
> >>>>
> >>>> Signed-off-by: Bibo Mao <maobibo@xxxxxxxxxxx>
> >>>> ---
> >>>> arch/loongarch/include/asm/numa.h | 2 ++
> >>>> arch/loongarch/kernel/numa.c | 10 +++++++++-
> >>>> arch/loongarch/kernel/smp.c | 1 +
> >>>> 3 files changed, 12 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/arch/loongarch/include/asm/numa.h b/arch/loongarch/include/asm/numa.h
> >>>> index b5f9de9f102e..e8e6fcfb006a 100644
> >>>> --- a/arch/loongarch/include/asm/numa.h
> >>>> +++ b/arch/loongarch/include/asm/numa.h
> >>>> @@ -50,6 +50,7 @@ static inline void set_cpuid_to_node(int cpuid, s16 node)
> >>>> }
> >>>>
> >>>> extern int early_cpu_to_node(int cpu);
> >>>> +extern void set_early_cpu_to_node(int cpu, s16 node);
> >>>>
> >>>> #else
> >>>>
> >>>> @@ -57,6 +58,7 @@ static inline void early_numa_add_cpu(int cpuid, s16 node) { }
> >>>> static inline void numa_add_cpu(unsigned int cpu) { }
> >>>> static inline void numa_remove_cpu(unsigned int cpu) { }
> >>>> static inline void set_cpuid_to_node(int cpuid, s16 node) { }
> >>>> +static inline void set_early_cpu_to_node(int cpu, s16 node) { }
> >>>>
> >>>> static inline int early_cpu_to_node(int cpu)
> >>>> {
> >>>> diff --git a/arch/loongarch/kernel/numa.c b/arch/loongarch/kernel/numa.c
> >>>> index 84fe7f854820..62508aace644 100644
> >>>> --- a/arch/loongarch/kernel/numa.c
> >>>> +++ b/arch/loongarch/kernel/numa.c
> >>>> @@ -34,6 +34,9 @@ static struct numa_meminfo numa_meminfo;
> >>>> cpumask_t cpus_on_node[MAX_NUMNODES];
> >>>> cpumask_t phys_cpus_on_node[MAX_NUMNODES];
> >>>> EXPORT_SYMBOL(cpus_on_node);
> >>>> +static s16 __cpu_to_node[NR_CPUS] = {
> >>>> + [0 ... CONFIG_NR_CPUS - 1] = NUMA_NO_NODE
> >>>> +};
> >>>>
> >>>> /*
> >>>> * apicid, cpu, node mappings
> >>>> @@ -117,11 +120,16 @@ int early_cpu_to_node(int cpu)
> >>>> int physid = cpu_logical_map(cpu);
> >>>>
> >>>> if (physid < 0)
> >>>> - return NUMA_NO_NODE;
> >>>> + return __cpu_to_node[cpu];
> >>>>
> >>>> return __cpuid_to_node[physid];
> >>>> }
> >>>>
> >>>> +void set_early_cpu_to_node(int cpu, s16 node)
> >>>> +{
> >>>> + __cpu_to_node[cpu] = node;
> >>>> +}
> >>>> +
> >>>> void __init early_numa_add_cpu(int cpuid, s16 node)
> >>>> {
> >>>> int cpu = __cpu_number_map[cpuid];
> >>>> diff --git a/arch/loongarch/kernel/smp.c b/arch/loongarch/kernel/smp.c
> >>>> index 9afc2d8b3414..998668be858c 100644
> >>>> --- a/arch/loongarch/kernel/smp.c
> >>>> +++ b/arch/loongarch/kernel/smp.c
> >>>> @@ -512,6 +512,7 @@ void __init smp_prepare_boot_cpu(void)
> >>>> set_cpu_numa_node(cpu, node);
> >>>> else {
> >>>> set_cpu_numa_node(cpu, rr_node);
> >>>> + set_early_cpu_to_node(cpu, rr_node);
> >>>> rr_node = next_node_in(rr_node, node_online_map);
> >>>> }
> >>>> }
> >>>>
> >>>> base-commit: 6485cf5ea253d40d507cd71253c9568c5470cd27
> >>>> --
> >>>> 2.39.3
> >>>>
> >>>>
> >>
> >>
>
>