Re: MPOL_BIND on memory only nodes

From: Mel Gorman
Date: Thu Oct 13 2016 - 06:44:40 EST


On Wed, Oct 12, 2016 at 03:16:27PM +0200, Michal Hocko wrote:
> On Wed 12-10-16 11:43:37, Michal Hocko wrote:
> > On Wed 12-10-16 14:55:24, Anshuman Khandual wrote:
> [...]
> > > Why we insist on __GFP_THISNODE ?
> >
> > AFAIU __GFP_THISNODE just overrides the given node to the policy
> > nodemask in case the current node is not part of that node mask. In
> > other words we are ignoring the given node and use what the policy says.
> > I can see how this can be confusing especially when confronting the
> > documentation:
> >
> > * __GFP_THISNODE forces the allocation to be satisified from the requested
> > * node with no fallbacks or placement policy enforcements.
>
> You made me think and look into this deeper. I came to the conclusion
> that this is actually a relict from the past. policy_zonelist is called
> only from 3 places:
> - huge_zonelist - never should do __GFP_THISNODE when going this path
> - alloc_pages_vma - which shouldn't depend on __GFP_THISNODE either
> - alloc_pages_current - which uses default_policy if __GFP_THISNODE is
> used
>
> So AFAICS this is essentially dead code, or I am missing something. Mel,
> do you remember why we needed it in the past?

I don't recall a specific reason. It was likely due to confusion on my
part at the time about the exact use of __GFP_THISNODE. The expectation
is that the flag is not used in fault paths or with policies. It's meant
to enforce node-locality for kernel-internal decisions, such as the
locality of slab pages and ensuring that a THP collapse from khugepaged
stays on the same node.
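
For reference, the MPOL_BIND special case being discussed can be modelled
in user space roughly as below. This is only a sketch of the behaviour
Michal describes, not the kernel code: the model_* names, the 64-bit
nodemask and the flag value are all made up for illustration, and picking
the lowest-numbered node in the mask as the fallback is my assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; the kernel uses gfp_t and nodemask_t instead. */
#define MODEL_GFP_THISNODE 0x1u
#define MODEL_MAX_NODES    64

typedef uint64_t model_nodemask_t;	/* bit n set => node n is allowed */

/* True if node nid is part of the mask. */
static bool model_node_isset(int nid, model_nodemask_t mask)
{
	return nid >= 0 && nid < MODEL_MAX_NODES && ((mask >> nid) & 1u);
}

/* Lowest-numbered node in the mask, or -1 if the mask is empty. */
static int model_first_node(model_nodemask_t mask)
{
	for (int nid = 0; nid < MODEL_MAX_NODES; nid++)
		if ((mask >> nid) & 1u)
			return nid;
	return -1;
}

/*
 * Node whose zonelist a bind-style policy would start from. The requested
 * node nd is only overridden when the THISNODE flag is set AND nd lies
 * outside the allowed mask (assumed fallback: first node in the mask).
 */
static int model_bind_start_node(unsigned int gfp, int nd,
				 model_nodemask_t allowed)
{
	if ((gfp & MODEL_GFP_THISNODE) && !model_node_isset(nd, allowed))
		return model_first_node(allowed);
	return nd;
}
```

With nodes 1 and 2 in the mask, a request for node 0 with the flag set is
redirected to node 1, while the same request without the flag keeps node 0
as its starting point. So in this model the flag does not pin the
allocation to the requested node at all, which is the mismatch with the
documented meaning pointed out above.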

--
Mel Gorman
SUSE Labs