Re: [PATCH 1/2] mm/migrate: optimize hotplug-time demotion order updates

From: David Hildenbrand
Date: Tue Sep 21 2021 - 03:23:27 EST


On 20.09.21 23:37, Dave Hansen wrote:
> On 9/17/21 5:55 PM, Huang, Ying wrote:
>> @@ -3147,6 +3177,16 @@ static void __set_migration_target_nodes
>>  	int node;
>>  
>>  	/*
>> +	 * The "migration path" array is heavily optimized
>> +	 * for reads. This is the write side which incurs a
>> +	 * very heavy synchronize_rcu(). Avoid this overhead
>> +	 * when nothing of consequence has changed since the
>> +	 * last write.
>> +	 */
>> +	if (!node_demotion_topo_changed())
>> +		return;
>> +
>> +	/*
>>  	 * Avoid any oddities like cycles that could occur
>>  	 * from changes in the topology. This will leave
>>  	 * a momentary gap when migration is disabled.
>> Now synchronize_rcu() is called in disable_all_migrate_targets(), which
>> is called for MEM_GOING_OFFLINE. Can we remove the synchronize_rcu()
>> from disable_all_migrate_targets() and call it in
>> __set_migration_target_nodes() before we update the node_demotion[]?

> I see what you are saying. This patch just targeted
> __set_migration_target_nodes(), which is called for
> MEM_ONLINE/MEM_OFFLINE. But it missed MEM_GOING_OFFLINE's call to
> disable_all_migrate_targets().
>
> I think I found something better than what I had in this patch, or the
> tweak you suggested: the 'memory_notify->status_change_nid' field is
> passed to all memory hotplug notifiers and tells us whether a node is
> going online/offline. Instead of trying to track the changes ourselves,
> I think we can simply rely on it to tell us when a node is going
> online/offline.
>
> This removes the need for the demotion code to track *any* state. I've
> attached a totally untested patch to do this.


Sounds sane to me (although I really detest that status_change_nid... interface).

I was just about to ask "but how does this interact with !CONFIG_NUMA" ... until I realized that having a single node go completely offline is rather unrealistic ;)

--
Thanks,

David / dhildenb