Re: [PATCH] mm/migrate: fix CPUHP state to update node demotion order

From: Huang, Ying
Date: Sat Sep 18 2021 - 02:57:06 EST


Mika Penttilä <mika.penttila@xxxxxxxxxxxx> writes:

> Hi!
>
> On 18.9.2021 5.58, Huang Ying wrote:
>> The node demotion order needs to be updated during CPU hotplug.
>> Because whether a NUMA node has CPU may influence the demotion order.
>> The update function should be called during CPU online/offline after
>> the node_states[N_CPU] has been updated. That is done in
>> CPUHP_AP_ONLINE_DYN during CPU online and in CPUHP_MM_VMSTAT_DEAD
>> during CPU offline. But in commit 884a6e5d1f93 ("mm/migrate: update
>> node demotion order on hotplug events"), the function to update node
>> demotion order is called in CPUHP_AP_ONLINE_DYN during CPU
>> online/offline. This doesn't satisfy the order requirement. So in
>> this patch, we added CPUHP_AP_MM_DEMOTION_ONLINE and
>> CPUHP_MM_DEMOTION_OFFLINE to be called after CPUHP_AP_ONLINE_DYN and
>> CPUHP_MM_VMSTAT_DEAD during CPU online/offline, and register the
>> update function on them.
>>
>> Fixes: 884a6e5d1f93 ("mm/migrate: update node demotion order on hotplug events")
>> Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
>> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
>> Cc: Yang Shi <shy828301@xxxxxxxxx>
>> Cc: Zi Yan <ziy@xxxxxxxxxx>
>> Cc: Michal Hocko <mhocko@xxxxxxxx>
>> Cc: Wei Xu <weixugc@xxxxxxxxxx>
>> Cc: Oscar Salvador <osalvador@xxxxxxx>
>> Cc: David Rientjes <rientjes@xxxxxxxxxx>
>> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
>> Cc: David Hildenbrand <david@xxxxxxxxxx>
>> Cc: Greg Thelen <gthelen@xxxxxxxxxx>
>> Cc: Keith Busch <kbusch@xxxxxxxxxx>
>> ---
>> include/linux/cpuhotplug.h | 2 ++
>> mm/migrate.c | 8 +++++---
>> 2 files changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
>> index 832d8a74fa59..5a92ea56f21b 100644
>> --- a/include/linux/cpuhotplug.h
>> +++ b/include/linux/cpuhotplug.h
>> @@ -72,6 +72,7 @@ enum cpuhp_state {
>> CPUHP_SLUB_DEAD,
>> CPUHP_DEBUG_OBJ_DEAD,
>> CPUHP_MM_WRITEBACK_DEAD,
>> + CPUHP_MM_DEMOTION_OFFLINE,
>> CPUHP_MM_VMSTAT_DEAD,
>> CPUHP_SOFTIRQ_DEAD,
>> CPUHP_NET_MVNETA_DEAD,
>> @@ -240,6 +241,7 @@ enum cpuhp_state {
>> CPUHP_AP_BASE_CACHEINFO_ONLINE,
>> CPUHP_AP_ONLINE_DYN,
>> CPUHP_AP_ONLINE_DYN_END = CPUHP_AP_ONLINE_DYN + 30,
>> + CPUHP_AP_MM_DEMOTION_ONLINE,
>> CPUHP_AP_X86_HPET_ONLINE,
>> CPUHP_AP_X86_KVM_CLK_ONLINE,
>> CPUHP_AP_DTPM_CPU_ONLINE,
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index a6a7743ee98f..77d107a4577f 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -3278,9 +3278,8 @@ static int __init migrate_on_reclaim_init(void)
>> {
>> int ret;
>> - ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "migrate on
>> reclaim",
>> - migration_online_cpu,
>> - migration_offline_cpu);
>> + ret = cpuhp_setup_state_nocalls(CPUHP_MM_DEMOTION_OFFLINE, "mm/demotion:offline",
>> + NULL, migration_offline_cpu);
>> /*
>> * In the unlikely case that this fails, the automatic
>> * migration targets may become suboptimal for nodes
>> @@ -3288,6 +3287,9 @@ static int __init migrate_on_reclaim_init(void)
>> * rare case, do not bother trying to do anything special.
>> */
>> WARN_ON(ret < 0);
>> + ret = cpuhp_setup_state_nocalls(CPUHP_AP_MM_DEMOTION_ONLINE, "mm/demotion:online",
>> + migration_online_cpu, NULL);
>>
>
> You changed to _nocalls variant, how does this handle initialization
> for cpus present at boot?

You are right! Thanks!

There are some discussion about CPUHUP in anther thread as follows,

https://lore.kernel.org/lkml/CAAPL-u_Tig1jK=mv_r=j-A-hR3Kpu7txiSFbPR3a8O1qhM1s-Q@xxxxxxxxxxxxxx/

I will wait for discussion in that thread too before the next step.

Best Regards,
Huang, Ying