Re: getting oom/stalls for ltp test cpuset01 with latest/4.9 kernel

From: Michal Hocko
Date: Fri Jan 13 2017 - 10:51:56 EST


On Fri 13-01-17 10:06:14, Vlastimil Babka wrote:
[...]
> From 9f041839401681f2678edf5040c851d11963c5fe Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka <vbabka@xxxxxxx>
> Date: Fri, 13 Jan 2017 10:01:26 +0100
> Subject: [PATCH] mm, page_alloc: fix race with cpuset update or removal
>
> Changelog and S-O-B TBD.
> ---
> mm/page_alloc.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6de9440e3ae2..c397f146843a 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3775,9 +3775,17 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
>  	/*
>  	 * Restore the original nodemask if it was potentially replaced with
>  	 * &cpuset_current_mems_allowed to optimize the fast-path attempt.
> +	 * Also recalculate the starting point for the zonelist iterator or
> +	 * we could end up iterating over non-eligible zones endlessly.
>  	 */
> -	if (cpusets_enabled())
> +	if (unlikely(ac.nodemask != nodemask)) {
>  		ac.nodemask = nodemask;
> +		ac.preferred_zoneref = first_zones_zonelist(ac.zonelist,
> +						ac.high_zoneidx, ac.nodemask);
> +		if (!ac.preferred_zoneref)
> +			goto no_zone;
> +	}
> +
>  	page = __alloc_pages_slowpath(alloc_mask, order, &ac);

I think you nailed it. It is indeed possible for preferred_zoneref to
point outside of cpuset_current_mems_allowed, and if we are unlucky
there won't be any other zones on the zonelist...
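
To make the failure mode concrete, here is a userspace toy model, not
kernel code. Every name in it (toy_zoneref, toy_nodemask, toy_next_zone,
toy_first_zone) is made up for illustration, and a zonelist is reduced to
an array of node ids. It only assumes what the patch comment says: the
iterator walks forward from a saved starting point and skips zones whose
node is not in the current mask.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a zonelist entry: only the owning node matters here. */
struct toy_zoneref {
	int node;
};

/* Toy nodemask: bit i set means node i is allowed. */
typedef unsigned int toy_nodemask;

/*
 * Return the index of the first entry at or after 'start' whose node is
 * set in 'mask', or 'n' if no such entry exists. The walk only moves
 * forward, so entries before 'start' are never considered.
 */
static size_t toy_next_zone(const struct toy_zoneref *zl, size_t n,
			    size_t start, toy_nodemask mask)
{
	assert(start <= n);
	for (size_t i = start; i < n; i++)
		if (mask & (1u << zl[i].node))
			return i;
	return n;
}

/* Recalculate the starting point: scan from the head of the list. */
static size_t toy_first_zone(const struct toy_zoneref *zl, size_t n,
			     toy_nodemask mask)
{
	return toy_next_zone(zl, n, 0, mask);
}
```

Take a list ordered { node 0, node 0, node 1, node 1 }. A starting point
computed while only node 1 was allowed is index 2. If the allowed mask
then changes to node 0 only, walking forward from index 2 finds nothing
eligible, while toy_first_zone() with the new mask returns index 0. That
is the toy equivalent of recalculating preferred_zoneref after the
nodemask is restored.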

--
Michal Hocko
SUSE Labs