Re: [PATCH 1/2] mm/migrate: optimize hotplug-time demotion order updates

From: Dave Hansen
Date: Mon Sep 20 2021 - 17:41:44 EST


On 9/17/21 5:55 PM, Huang, Ying wrote:
>> @@ -3147,6 +3177,16 @@ static void __set_migration_target_nodes
>> int node;
>>
>> /*
>> + * The "migration path" array is heavily optimized
>> + * for reads. This is the write side which incurs a
>> + * very heavy synchronize_rcu(). Avoid this overhead
>> + * when nothing of consequence has changed since the
>> + * last write.
>> + */
>> + if (!node_demotion_topo_changed())
>> + return;
>> +
>> + /*
>> * Avoid any oddities like cycles that could occur
>> * from changes in the topology. This will leave
>> * a momentary gap when migration is disabled.
> Now synchronize_rcu() is called in disable_all_migrate_targets(), which
> is called for MEM_GOING_OFFLINE. Can we remove the synchronize_rcu()
> from disable_all_migrate_targets() and call it in
> __set_migration_target_nodes() before we update the node_demotion[]?

I see what you are saying. This patch just targeted
__set_migration_target_nodes(), which is called for
MEM_ONLINE/OFFLINE. But, it missed MEM_GOING_OFFLINE's call to
disable_all_migrate_targets().

I think I found something better than what I had in this patch, or the
tweak you suggested: The 'memory_notify->status_change_nid' field is
passed to all memory hotplug notifiers and tells us whether the node is
going online/offline. Instead of trying to track the changes, I think
we can simply rely on it to tell us when a node is going online/offline.

This removes the need for the demotion code to track *any* state. I've
attached a totally untested patch to do this.

diff -puN mm/migrate.c~faster-node-order mm/migrate.c
--- a/mm/migrate.c~faster-node-order 2021-09-17 14:44:58.697476940 -0700
+++ b/mm/migrate.c 2021-09-20 14:31:43.570477095 -0700
@@ -3239,8 +3239,18 @@ static int migration_offline_cpu(unsigne
* set_migration_target_nodes().
*/
static int __meminit migrate_on_reclaim_callback(struct notifier_block *self,
- unsigned long action, void *arg)
+ unsigned long action, void *_arg)
{
+ struct memory_notify *arg = _arg;
+
+ /*
+ * Only update the node migration order when a node is
+ * changing status, like online->offline. This avoids
+ * the overhead of synchronize_rcu() in most cases.
+ */
+ if (arg->status_change_nid < 0)
+ return notifier_from_errno(0);
+
switch (action) {
case MEM_GOING_OFFLINE:
/*
_
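
The early-return check in the patch above can be tried out in user space. What follows is only a minimal sketch, not kernel code: the enum values, the cut-down struct memory_notify, update_migration_targets() and run_demo() are all hypothetical stand-ins, and every hotplug action is lumped into one "expensive" path that represents the work ending in synchronize_rcu(). It assumes, as the patch does, that status_change_nid is negative when no node is changing state.

```c
/* Hypothetical user-space stand-ins for the kernel's hotplug actions. */
enum mem_action { MEM_GOING_OFFLINE, MEM_OFFLINE, MEM_ONLINE };

/* Cut-down stand-in: only the field the check looks at. */
struct memory_notify {
	int status_change_nid;	/* node changing state, negative if none */
};

/* Counts how often the costly path is taken. */
static int expensive_updates;

/* Stand-in for the update that ends in synchronize_rcu(). */
static void update_migration_targets(void)
{
	expensive_updates++;
}

static int migrate_on_reclaim_callback(unsigned long action, void *_arg)
{
	struct memory_notify *arg = _arg;

	/* No node is changing status: skip the costly update. */
	if (arg->status_change_nid < 0)
		return 0;

	switch (action) {
	case MEM_GOING_OFFLINE:
	case MEM_OFFLINE:
	case MEM_ONLINE:
		update_migration_targets();
		break;
	}
	return 0;
}

/* Fire three events; only the last one changes a node's status. */
static int run_demo(void)
{
	struct memory_notify no_change = { .status_change_nid = -1 };
	struct memory_notify node1 = { .status_change_nid = 1 };

	expensive_updates = 0;
	migrate_on_reclaim_callback(MEM_ONLINE, &no_change);
	migrate_on_reclaim_callback(MEM_OFFLINE, &no_change);
	migrate_on_reclaim_callback(MEM_ONLINE, &node1);
	return expensive_updates;
}
```

Calling run_demo() should return 1: the two events that add or remove memory without changing a node's status are filtered out before the switch, so the costly path runs once.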