Re: MPOL_BIND on memory only nodes

From: Anshuman Khandual
Date: Thu Oct 13 2016 - 08:28:23 EST

On 10/13/2016 03:37 PM, Michal Hocko wrote:
> On Thu 13-10-16 15:24:54, Anshuman Khandual wrote:
> [...]
>> Which makes the function look like this. Even with these changes, MPOL_BIND is
>> still going to pick up the local node's zonelist instead of the first node in
>> policy->v.nodes nodemask. It completely ignores policy->v.nodes which it should
>> not.
> Not really. I have tried to explain earlier. We do not ignore policy
> nodemask. This one comes from policy_nodemask. We start with the local
> node but fallback to some of the nodes from the nodemask defined by the
> policy.

Yeah, I saw your response but did not quite get it. We don't ignore the
policy nodemask during memory allocation, correct. But my point was that
we are ignoring the policy nodemask while selecting the zonelist which
will be used during page allocation. Though the zone contents of both
zonelists are likely to be the same, wouldn't it be better to get the
zonelist from the nodemask as well? Or am I still missing something
here? The following change is what I am trying to propose.

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index ad1c96a..f60ab80 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1685,14 +1685,7 @@ static struct zonelist *policy_zonelist(gfp_t gfp, struct mempolicy *policy,
 			nd = policy->v.preferred_node;
-		/*
-		 * Normally, MPOL_BIND allocations are node-local within the
-		 * allowed nodemask. However, if __GFP_THISNODE is set and the
-		 * current node isn't part of the mask, we use the zonelist for
-		 * the first node in the mask instead.
-		 */
-		if (unlikely(gfp & __GFP_THISNODE) &&
-				unlikely(!node_isset(nd, policy->v.nodes)))
+		if (unlikely(!node_isset(nd, policy->v.nodes)))
 			nd = first_node(policy->v.nodes);
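
To make the difference concrete, here is a small user-space sketch of the
node selection before and after the change above. It is only a model, not
kernel code: the nodemask is assumed to be a plain 64-bit bitmap, and the
names sketch_nodemask_t, sketch_node_isset(), sketch_first_node(),
SKETCH_GFP_THISNODE, bind_node_current() and bind_node_proposed() are made
up for illustration. It only models which node's zonelist gets picked, not
the later filtering by the policy nodemask during allocation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for __GFP_THISNODE; the value is arbitrary. */
#define SKETCH_GFP_THISNODE 0x1u

/* Assumption: bit n set means node n is in the policy nodemask. */
typedef uint64_t sketch_nodemask_t;

/* Simplified model of node_isset(); nd must be in the range 0..63. */
static bool sketch_node_isset(int nd, sketch_nodemask_t mask)
{
	return (mask >> nd) & 1u;
}

/* Simplified model of first_node(): lowest set bit, or -1 if mask is empty. */
static int sketch_first_node(sketch_nodemask_t mask)
{
	for (int n = 0; n < 64; n++)
		if (sketch_node_isset(n, mask))
			return n;
	return -1;
}

/*
 * Current behaviour: the local node nd is replaced by the first node of
 * the mask only when __GFP_THISNODE is set and nd is outside the mask.
 */
static int bind_node_current(unsigned int gfp, int nd, sketch_nodemask_t mask)
{
	if ((gfp & SKETCH_GFP_THISNODE) && !sketch_node_isset(nd, mask))
		nd = sketch_first_node(mask);
	return nd;
}

/*
 * Proposed behaviour: whenever nd is outside the mask, use the first
 * node of the mask, regardless of the gfp flags.
 */
static int bind_node_proposed(int nd, sketch_nodemask_t mask)
{
	if (!sketch_node_isset(nd, mask))
		nd = sketch_first_node(mask);
	return nd;
}
```

With the task running on node 0 and a policy mask of nodes 2 and 3 (0xC in
this model), bind_node_current(0, 0, 0xC) keeps node 0, so the local
node's zonelist is used, while bind_node_proposed(0, 0xC) picks node 2.
When the local node is already in the mask, both return it unchanged.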