Re: [PATCH 1/2] mm/migrate: optimize hotplug-time demotion order updates

From: Huang, Ying
Date: Tue Sep 21 2021 - 10:36:52 EST


Dave Hansen <dave.hansen@xxxxxxxxx> writes:

> On 9/17/21 5:55 PM, Huang, Ying wrote:
>>> @@ -3147,6 +3177,16 @@ static void __set_migration_target_nodes
>>> int node;
>>>
>>> /*
>>> + * The "migration path" array is heavily optimized
>>> + * for reads. This is the write side which incurs a
>>> + * very heavy synchronize_rcu(). Avoid this overhead
>>> + * when nothing of consequence has changed since the
>>> + * last write.
>>> + */
>>> + if (!node_demotion_topo_changed())
>>> + return;
>>> +
>>> + /*
>>> * Avoid any oddities like cycles that could occur
>>> * from changes in the topology. This will leave
>>> * a momentary gap when migration is disabled.
>> Now synchronize_rcu() is called in disable_all_migrate_targets(), which
>> is called for MEM_GOING_OFFLINE. Can we remove the synchronize_rcu()
>> from disable_all_migrate_targets() and call it in
>> __set_migration_target_nodes() before we update the node_demotion[]?
>
> I see what you are saying. This patch just targeted
> __set_migration_target_nodes() which is called for
> MEM_ONLINE/OFFLINE. But, it missed MEM_GOING_OFFLINE's call to
> disable_all_migrate_targets().
>
> I think I found something better than what I had in this patch, or the
> tweak you suggested: The 'memory_notify->status_change_nid' field is
> passed to all memory hotplug notifiers and tells us whether the node is
> going online/offline. Instead of trying to track the changes, I think
> we can simply rely on it to tell us when a node is going online/offline.
>
> This removes the need for the demotion code to track *any* state. I've
> attached a totally untested patch to do this.

Yes. This sounds good. I will try to test this patch on my side.
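
For illustration only, here is a minimal userspace sketch of the idea as
I read it: the notifier looks at 'status_change_nid' and skips all
demotion work when no node changes state. It is not the attached patch.
The struct, the enum, the callback name and the two counting stubs are
hypothetical stand-ins, and the "-1 means no node changes state"
convention is my assumption about how the field is reported.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the hotplug actions named in this thread (values arbitrary). */
enum mem_action_sketch {
	SKETCH_MEM_GOING_OFFLINE,
	SKETCH_MEM_OFFLINE,
	SKETCH_MEM_ONLINE,
	SKETCH_MEM_OTHER,
};

/* Stand-in for the notifier argument; only the field discussed here. */
struct memory_notify_sketch {
	int status_change_nid;	/* assumed: node id, or -1 if no node changes */
};

/* Counters standing in for the expensive write-side operations. */
static int nr_disable_calls;
static int nr_rebuild_calls;

/* Stub: the real function would clear targets and wait for readers. */
static void disable_all_migrate_targets_stub(void)
{
	nr_disable_calls++;
}

/* Stub: the real function would recompute the demotion order. */
static void set_migration_target_nodes_stub(void)
{
	nr_rebuild_calls++;
}

/* Hypothetical callback: stateless, driven only by status_change_nid. */
static int demotion_memory_callback_sketch(enum mem_action_sketch action,
					   const struct memory_notify_sketch *arg)
{
	assert(arg != NULL);

	/* No node is going online/offline: nothing of consequence changed. */
	if (arg->status_change_nid < 0)
		return 0;

	switch (action) {
	case SKETCH_MEM_GOING_OFFLINE:
		disable_all_migrate_targets_stub();
		break;
	case SKETCH_MEM_OFFLINE:
	case SKETCH_MEM_ONLINE:
		set_migration_target_nodes_stub();
		break;
	default:
		break;
	}
	return 0;
}
```

With this shape, onlining a memory block on a node that already has
memory takes the early return, so neither the disable path nor the
rebuild path runs, and the demotion code keeps no state of its own.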

From another point of view, we still need to update the demotion order
upon CPU hotplug too, because whether a node has CPUs may change there.
We need a solution for that as well.

Best Regards,
Huang, Ying