Re: [PATCH v1] mm/memory_hotplug: Don't take the cpu_hotplug_lock

From: Qian Cai
Date: Thu Sep 26 2019 - 07:19:32 EST

> On Sep 26, 2019, at 3:26 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> OK, this is using for_each_online_cpu but why is this a problem? Have
> you checked what the code actually does? Let's say that online_pages is
> racing with cpu hotplug. A new CPU appears/disappears from the online
> mask while we are iterating it, right? Let's start with cpu offlining
> case. We have two choices, either the cpu is still visible and we update
> its local node configuration even though it will disappear shortly which
> is ok because we are not touching any data that disappears (it's all
> per-cpu). Case when the cpu is no longer there is not really
> interesting. For the online case we might miss a cpu but that should be
> tolerable because that is not any different from triggering the
> online independently of the memory hotplug. So there has to be a hook
> from that code path as well. If there is none then this is buggy
> irrespective of the locking.
>
> Makes sense?

This sounds to me like it requires a lot of auditing and testing. Also, someone
who is more familiar with CPU hotplug should review this patch. Personally, I am
no fan of operating on an incorrect CPU mask to begin with; things could go wrong
really quickly...
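
For reference, the two races described in the quoted text can be sketched in
plain user-space C. This is only an illustration under stated assumptions, not
the kernel code: online_mask, percpu_cfg, set_online(), update_cfg_unlocked()
and cpu_online_hook() are all hypothetical names, and the "concurrent" hotplug
event is injected at a fixed point in the loop so the result is deterministic.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical stand-ins, not the kernel's data structures. */
static unsigned int online_mask;   /* bit n set => CPU n is online */
static int percpu_cfg[NR_CPUS];    /* per-CPU slots; storage survives offlining */

static void set_online(int cpu, bool on)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (on)
        online_mask |= 1u << cpu;
    else
        online_mask &= ~(1u << cpu);
}

/*
 * Walk the online mask without holding any hotplug lock.
 * The simulated hotplug event (hot_cpu going online or offline)
 * fires just before the loop visits CPU number 'at'.
 */
static void update_cfg_unlocked(int val, int at, int hot_cpu, bool hot_on)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpu == at)
            set_online(hot_cpu, hot_on);
        if (online_mask & (1u << cpu))
            percpu_cfg[cpu] = val;   /* only touches per-CPU storage */
    }
}

/* Assumed hook on the CPU online path: the new CPU sets up its own slot. */
static void cpu_online_hook(int cpu, int current_val)
{
    set_online(cpu, true);
    percpu_cfg[cpu] = current_val;
}
```

In the offline case, a CPU that disappears right after being visited simply
has its slot written once more, which is harmless here because the slot
outlives the CPU. In the online case, a CPU that appears after the loop has
passed its index keeps its old value until something like cpu_online_hook()
runs for it. Whether the real code paths behave this way is exactly what would
need auditing.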