Re: [PATCH v9 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
From: Yicong Yang
Date: Tue Nov 19 2024 - 07:27:51 EST
On 2024/11/18 23:04, Dietmar Eggemann wrote:
> On 18/11/2024 11:50, Yicong Yang wrote:
>> On 2024/11/15 17:42, Pierre Gondois wrote:
>>> Hello Yicong,
>>>
>>>
>>> On 11/14/24 15:11, Yicong Yang wrote:
>>>> From: Yicong Yang <yangyicong@xxxxxxxxxxxxx>
>
> [...]
>
>>>> diff --git a/include/linux/topology.h b/include/linux/topology.h
>>>> index 52f5850730b3..b8e860276518 100644
>>>> --- a/include/linux/topology.h
>>>> +++ b/include/linux/topology.h
>>>> @@ -240,6 +240,26 @@ static inline const struct cpumask *cpu_smt_mask(int cpu)
>>>> }
>>>> #endif
>>>> +#ifndef topology_is_primary_thread
>>>> +
>>>> +#define topology_is_primary_thread topology_is_primary_thread
>>>> +
>>>> +static inline bool topology_is_primary_thread(unsigned int cpu)
>>>> +{
>>>> + /*
>>>> + * On SMT hotplug the primary thread of the SMT won't be disabled.
>>>> + * Architectures do have a special primary thread (e.g. x86) need
>>>> + * to override this function. Otherwise just make the first thread
>>>> + * in the SMT as the primary thread.
>>>> + *
>>>> + * The sibling cpumask of an offline CPU contains always the CPU
>>>> + * itself.
>>>
>>> As Thomas suggested, would it be possible to check it for other
>>> architectures ?
>>> For instance for loongarch at arch/loongarch/kernel/smp.c,
>>> clear_cpu_sibling_map() seems to completely clear &cpu_sibling_map[cpu]
>>> when a CPU is put offline. This would make topology_sibling_cpumask(cpu)
>>> to be empty and cpu_bootable() return false if the CPU never booted before.
>>>
>>
>> cpu_bootable() only affects architectures select HOTPLUG_SMT, otherwise it'll always
>> return true. Since x86 and powerpc have their own illustration of primary thread and
>> have an override version of this funciton, arm64 is the only user now by this patchset.
>> We have this guarantee for arm64 and also for other architectures using arch_topology.c
>> (see clear_cpu_topology()). So if loogarch has a different implementation, they
>> should implement a topology_is_primary_thread() variant to support HOTPLUG_SMT.
>
> I also stumbled over this sentence.
>
> drivers/base/arch_topology.c:
>
> void clear_cpu_topology(int cpu) (2)
>
> ...
> cpumask_set_cpu(cpu, &cpu_topo->thread_sibling) (4)
>
> void __init reset_cpu_topology(void) (1)
>
> for_each_possible_cpu(cpu)
>
> ...
> clear_cpu_topology(cpu) (2)
>
> #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV) (3)
> void __init init_cpu_topology(void)
>
> reset_cpu_topology() (1)
> ...
>
> Does this mean the default implementation relies on (4), i.e. is only
> valid for arm64 and riscv? (3)
> Do all the other archs then have to overwrite the default implementation
> (like x86 and powerpc) if they want to implement CONFIG_HOTPLUG_SMT?
>
I think yes if they have problems with the default implementation. That's
what used to be to support HOTPLUG_SMT before this, each arthitecture
provides their own variant of topology_is_primary_thread.
The current approach may also work since cpu_bootable() will make all the
CPUs boot at least once:
static inline bool cpu_bootable(unsigned int cpu) {
[...]
if (topology_is_primary_thread(cpu))
return true;
return !cpumask_test_cpu(cpu, &cpus_booted_once_mask); // True if not booted
}
The boot process will be like below. cpu_bootable() is checked twice:
-> cpu_up()
cpu_bootable() (1)
[...]
cpuhp_bringup_ap()
[ archs usually update the sibling_mask in start_secondary() here ]
bringup_wait_for_ap_online()
if (!cpu_bootable(cpu)) (2)
return -ECANCELED // roll back and offline this CPU
So an empty mask in (1) won't block the CPU going online. And the default
topology_is_primary_thread() should work if sibling mask updated before
the checking in (2). I hacked x86 to use the default topology_is_primary_thread
and tested on a qemu vm and it seems also work (just for test since the
primary thread should not always be the 1st thread in the core on x86,
not quite sure).
Thanks.