Re: [PATCH 6/9] mm, page_alloc: simplify zonelist initialization
From: Vlastimil Babka
Date: Thu Jul 20 2017 - 02:55:49 EST
On 07/14/2017 10:00 AM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@xxxxxxxx>
>
> build_zonelists gradually builds zonelists from the nearest to the most
> distant node. As we do not know how many populated zones we will have in
> each node we rely on the _zoneref to terminate initialized part of the
> zonelist by a NULL zone. While this is functionally correct it is quite
> suboptimal because we cannot allow updaters to race with zonelists
> users because they could see an empty zonelist and fail the allocation
> or hit the OOM killer in the worst case.
>
> We can do much better, though. We can store the node ordering into an
> already existing node_order array and then give this array to
> build_zonelists_in_node_order and do the whole initialization at once.
> zonelists consumers still might see halfway initialized state but that
> should be much more tolerable because the list will not be empty and
> they would either see some zone twice or skip over some zone(s) in the
> worst case which shouldn't lead to immediate failures.
>
> This patch alone doesn't introduce any functional change yet, though, it
> is merely a preparatory work for later changes.
>
> Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
I've collected the fold-ups from this thread and looked at the result as
a single patch. Seems OK, just two things:
- please rename variable "i" in build_zonelists() to e.g. "nr_nodes"
- the !CONFIG_NUMA variant of build_zonelists() won't build, because it
doesn't declare the nr_zones variable
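
For readers following along, here is a rough userspace sketch of the idea in
the quoted changelog: the node ordering is computed first, then the whole
zonelist is filled in one pass and terminated once at the end. This is not the
kernel code. The toy_* types and functions, the MAX_* limits, and the
highest-zone-first ordering are all assumptions made for illustration; only the
names nr_nodes, nr_zones and node_order come from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limits for the toy model; not kernel constants. */
#define MAX_NODES 4
#define MAX_ZONES_PER_NODE 3
#define MAX_ZONEREFS (MAX_NODES * MAX_ZONES_PER_NODE + 1)

/* Toy stand-ins, not the kernel's struct zone / struct zoneref. */
struct toy_zone { int populated; };
struct toy_zoneref { struct toy_zone *zone; };
struct toy_node { struct toy_zone zones[MAX_ZONES_PER_NODE]; };

/*
 * Append the populated zones of one node to refs, highest zone index
 * first (an assumption of this sketch). Returns the number of entries
 * written so the caller can advance its position.
 */
static int toy_build_zonerefs_node(struct toy_node *node,
                                   struct toy_zoneref *refs)
{
    int nr_zones = 0;

    for (int z = MAX_ZONES_PER_NODE - 1; z >= 0; z--) {
        if (node->zones[z].populated)
            refs[nr_zones++].zone = &node->zones[z];
    }
    return nr_zones;
}

/*
 * Fill the whole zonelist in one pass from a precomputed node order.
 * The NULL terminator is written once, after every node has been
 * appended. Returns the total number of zone entries.
 */
static int toy_build_zonelists_in_node_order(struct toy_node *nodes,
                                             const int *node_order,
                                             int nr_nodes,
                                             struct toy_zoneref *refs)
{
    int nr_zones = 0;

    assert(nr_nodes <= MAX_NODES);
    for (int i = 0; i < nr_nodes; i++)
        nr_zones += toy_build_zonerefs_node(&nodes[node_order[i]],
                                            refs + nr_zones);
    refs[nr_zones].zone = NULL;
    return nr_zones;
}
```

In this sketch the caller counts the collected nodes in nr_nodes and the
builder tracks the written entries in nr_zones, which mirrors the naming
asked for above.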