Re: [PATCH 7/9] mm, page_alloc: remove stop_machine from build_all_zonelists

From: Mel Gorman
Date: Fri Jul 14 2017 - 08:47:51 EST


On Fri, Jul 14, 2017 at 01:00:25PM +0200, Michal Hocko wrote:
> On Fri 14-07-17 10:59:32, Mel Gorman wrote:
> > On Fri, Jul 14, 2017 at 10:00:04AM +0200, Michal Hocko wrote:
> > > From: Michal Hocko <mhocko@xxxxxxxx>
> > >
> > > build_all_zonelists has been (ab)using stop_machine to make sure that
> > > zonelists do not change while somebody is looking at them. This is
> > > just a gross hack because a) it complicates the context from which
> > > we can call build_all_zonelists (see 3f906ba23689 ("mm/memory-hotplug:
> > > switch locking to a percpu rwsem")) and b) it is not really necessary
> > > especially after "mm, page_alloc: simplify zonelist initialization".
> > >
> > > Updates of the zonelists happen very seldom, basically only when a zone
> > > becomes populated during memory online or when it loses all the memory
> > > during offline. A racing iteration over zonelists could either miss a
> > > zone or try to work on one zone twice. Both of these are something we
> > > can live with occasionally because there will always be at least one
> > > zone visible so we are not likely to fail allocation too easily for
> > > example.
> > >
> > > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> >
> > This patch is contingent on the last patch which updates in place
> > instead of zeroing the early part of the zonelist first but needs to fix
> > the stack usage issues. I think it's also worth pointing out in the
> > changelog that stop_machine never gave the guarantees it claimed as a
> > process iterating through the zonelist can be stopped so when it resumes
> > the zonelist has changed underneath it. Doing it online is roughly
> > equivalent in terms of safety.
>
> OK, what about the following addendum?
> "
> Please note that the original stop_machine approach doesn't really
> provide a better exclusion because the iteration might be interrupted
> half way (unless the whole iteration is preempt disabled which is not the
> case in most cases) so some zones could still be seen twice or a
> zone missed.
> "

Works for me.

--
Mel Gorman
SUSE Labs