Re: [PATCH 7/9] mm, page_alloc: remove stop_machine from build_all_zonelists
From: Mel Gorman
Date: Fri Jul 14 2017 - 05:59:45 EST
On Fri, Jul 14, 2017 at 10:00:04AM +0200, Michal Hocko wrote:
> From: Michal Hocko <mhocko@xxxxxxxx>
>
> build_all_zonelists has been (ab)using stop_machine to make sure that
> zonelists do not change while somebody is looking at them. This is
> just a gross hack because a) it complicates the context from which
> we can call build_all_zonelists (see 3f906ba23689 ("mm/memory-hotplug:
> switch locking to a percpu rwsem")) and b) it is not really necessary
> especially after "mm, page_alloc: simplify zonelist initialization".
>
> Updates of the zonelists happen very seldom, basically only when a zone
> becomes populated during memory online or when it loses all the memory
> during offline. A racing iteration over zonelists could either miss a
> zone or try to work on one zone twice. Both of these are something we
> can live with occasionally because there will always be at least one
> zone visible so we are not likely to fail allocation too easily for
> example.
>
> Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
This patch is contingent on the last patch, which updates the zonelist
in place instead of zeroing the early part of it first, but that patch
still needs to fix the stack usage issues. I think it's also worth
pointing out in the changelog that stop_machine never gave the
guarantees it claimed: a process iterating through the zonelist can be
stopped, and when it resumes the zonelist has changed underneath it.
Doing the update online is roughly equivalent in terms of safety.
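
To illustrate why the update order matters, here is a toy userspace
model, not kernel code. The names (toy_zonelist, rebuild_zero_first,
rebuild_in_place, END) are made up for this sketch, the list is a plain
int array rather than the real zoneref array, and it is single-threaded,
so it ignores memory ordering entirely. It only records the smallest
number of zones a reader could see if it started walking the list at any
intermediate step of a rebuild.

```c
#include <assert.h>

#define MAX_ZONES 4
#define END (-1)	/* terminator, standing in for the NULL zone entry */

/* Hypothetical toy list: zone ids followed by END. */
struct toy_zonelist {
	int zones[MAX_ZONES + 1];
};

static void toy_init(struct toy_zonelist *zl, const int *zones, int count)
{
	assert(count >= 0 && count <= MAX_ZONES);
	for (int i = 0; i <= MAX_ZONES; i++)
		zl->zones[i] = (i < count) ? zones[i] : END;
}

/* Number of entries a reader would visit if it walked the list now. */
static int visible_zones(const struct toy_zonelist *zl)
{
	int n = 0;

	while (n < MAX_ZONES && zl->zones[n] != END)
		n++;
	return n;
}

/*
 * Clear the head first, then refill. Returns the minimum number of
 * zones visible at any intermediate step.
 */
static int rebuild_zero_first(struct toy_zonelist *zl, const int *zones,
			      int count)
{
	assert(count >= 0 && count <= MAX_ZONES);
	zl->zones[0] = END;
	int min_seen = visible_zones(zl);

	for (int i = 0; i < count; i++) {
		zl->zones[i + 1] = END;
		zl->zones[i] = zones[i];
		int seen = visible_zones(zl);
		if (seen < min_seen)
			min_seen = seen;
	}
	return min_seen;
}

/*
 * Overwrite entries in place and write the terminator last. Returns
 * the minimum number of zones visible at any intermediate step.
 */
static int rebuild_in_place(struct toy_zonelist *zl, const int *zones,
			    int count)
{
	assert(count >= 0 && count <= MAX_ZONES);
	int min_seen = visible_zones(zl);

	for (int i = 0; i < count; i++) {
		zl->zones[i] = zones[i];
		int seen = visible_zones(zl);
		if (seen < min_seen)
			min_seen = seen;
	}
	zl->zones[count] = END;
	int seen = visible_zones(zl);
	return seen < min_seen ? seen : min_seen;
}
```

In this model, going from a two-zone list to a three-zone list with
rebuild_zero_first passes through a state with no visible zones, while
rebuild_in_place never shows fewer than two. A reader can still see a
stale or duplicated entry in the in-place case, which matches the "miss
a zone or work on one zone twice" behaviour described in the changelog.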
--
Mel Gorman
SUSE Labs