On Wed, 5 Feb 2014, Nathan Zimmer wrote:
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 62a0cd1..a3cbd14 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -985,12 +985,12 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
if (need_zonelists_rebuild)
zone_pcp_reset(zone);
mutex_unlock(&zonelists_mutex);
+ unlock_memory_hotplug();
printk(KERN_DEBUG "online_pages [mem %#010llx-%#010llx] failed\n",
(unsigned long long) pfn << PAGE_SHIFT,
(((unsigned long long) pfn + nr_pages)
<< PAGE_SHIFT) - 1);
memory_notify(MEM_CANCEL_ONLINE, &arg);
- unlock_memory_hotplug();
return ret;
}
@@ -1016,9 +1016,10 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
writeback_set_ratelimit();
+ unlock_memory_hotplug();
+
if (onlined_pages)
memory_notify(MEM_ONLINE, &arg);
- unlock_memory_hotplug();
return 0;
}

That looks a little problematic, what happens if a nid is being brought
online and a registered callback does something like allocate resources
for the arg->status_change_nid and the above two hunks of this patch end
up racing?
Before, a registered callback was guaranteed to see either a
MEM_CANCEL_ONLINE or a MEM_ONLINE after it had already handled
MEM_GOING_ONLINE.
With your patch, we could race and see one cpu doing MEM_GOING_ONLINE,
another cpu doing MEM_GOING_ONLINE, and then MEM_ONLINE and
MEM_CANCEL_ONLINE in either order.
So I think this patch will break most registered callbacks that actually
depend on lock_memory_hotplug(); it's a coarse lock for that reason.
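
To make the concern concrete, here is a rough userspace model of the
interleaving described above. It is not kernel code: the enum, the struct
and both functions are hypothetical names made up for illustration, and
the callback is assumed to allocate on "going online" and free on
"cancel", which is only one plausible shape for a registered callback.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the three notifications discussed above. */
enum mem_event { GOING_ONLINE, ONLINE, CANCEL_ONLINE };

struct node_state {
	bool resources_allocated;	/* set up while the node is coming online */
	bool online;			/* node has been reported online */
};

/* Hypothetical callback that assumes one online attempt per node at a time. */
static void callback(struct node_state *n, enum mem_event ev)
{
	switch (ev) {
	case GOING_ONLINE:
		n->resources_allocated = true;
		break;
	case ONLINE:
		n->online = true;
		break;
	case CANCEL_ONLINE:
		/* Undo what GOING_ONLINE did. */
		n->resources_allocated = false;
		break;
	}
}

/* Replay a notification sequence; report a node left online without resources. */
static bool ends_inconsistent(const enum mem_event *seq, size_t len)
{
	struct node_state n = { false, false };
	size_t i;

	assert(seq != NULL);
	for (i = 0; i < len; i++)
		callback(&n, seq[i]);
	return n.online && !n.resources_allocated;
}
```

With serialized sequences (a failed attempt followed by a successful one)
the model stays consistent. With two overlapping attempts, the cancel from
the failed attempt tears down the resources of the one that succeeded, in
whichever order the last two notifications arrive.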