Re: [RFC 4/4] mm: Ignore cpuset enforcement when allocation flag has __GFP_THISNODE

From: Dave Hansen
Date: Tue Nov 29 2016 - 11:52:24 EST


On 11/28/2016 10:51 PM, Anshuman Khandual wrote:
> On 11/29/2016 02:42 AM, Dave Hansen wrote:
>> On 11/22/2016 06:19 AM, Anshuman Khandual wrote:
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -3715,7 +3715,7 @@ struct page *
>>>  		.migratetype = gfpflags_to_migratetype(gfp_mask),
>>>  	};
>>>
>>> -	if (cpusets_enabled()) {
>>> +	if (cpusets_enabled() && !(alloc_mask & __GFP_THISNODE)) {
>>>  		alloc_mask |= __GFP_HARDWALL;
>>>  		alloc_flags |= ALLOC_CPUSET;
>>>  		if (!ac.nodemask)
>>
>> This means now that any __GFP_THISNODE allocation can "escape" the
>> cpuset. That seems like a pretty major change to how cpusets work. Do
>> we know that *ALL* __GFP_THISNODE allocations are truly lacking in a
>> cpuset context that can be enforced?
> Right, I know it's a very blunt change. The cpuset based isolation
> of the coherent device node for user space tasks has a side effect:
> a driver or even the kernel cannot allocate memory from the coherent
...

Well, we have __GFP_HARDWALL:

* __GFP_HARDWALL enforces the cpuset memory allocation policy.

which you can clear in the places where you want to do an allocation but
want to ignore cpusets. But, __cpuset_node_allowed() looks like it gets
a little funky if you do that since it would probably be falling back to
the root cpuset that also would not have the new node in mems_allowed.
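
To make the fallback concrete, here is a rough userspace model of how I
read that check. It is only a sketch, not the kernel code: every name
below (cpuset_model, node_allowed_model(), MODEL_GFP_HARDWALL, ...) is a
made-up stand-in, and the in_interrupt(), TIF_MEMDIE and PF_EXITING
shortcuts are left out on purpose.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for __GFP_HARDWALL; not the kernel's value. */
#define MODEL_GFP_HARDWALL (1u << 0)

/* Toy nodemask: bit n set means node n is allowed (nodes 0..63 only). */
typedef uint64_t nodemask_model_t;

/* Toy cpuset: just the fields this sketch needs. */
struct cpuset_model {
	nodemask_model_t mems_allowed;
	bool mem_hardwall;                  /* mem_exclusive or mem_hardwall set */
	const struct cpuset_model *parent;  /* NULL for the root cpuset */
};

static bool node_in(int node, nodemask_model_t mask)
{
	return (mask >> node) & 1u;
}

/* Walk up until a hardwalled cpuset or the root is reached. */
static const struct cpuset_model *
nearest_hardwall_ancestor_model(const struct cpuset_model *cs)
{
	while (!cs->mem_hardwall && cs->parent)
		cs = cs->parent;
	return cs;
}

/*
 * Assumed shape of the decision:
 *  1. node in the task's own mems_allowed      -> allowed
 *  2. hardwall request                         -> refused, no fallback
 *  3. otherwise ask the nearest hardwall ancestor (often the root)
 */
static bool node_allowed_model(const struct cpuset_model *cs, int node,
			       unsigned int gfp_mask)
{
	if (node_in(node, cs->mems_allowed))
		return true;
	if (gfp_mask & MODEL_GFP_HARDWALL)
		return false;
	return node_in(node, nearest_hardwall_ancestor_model(cs)->mems_allowed);
}
```

With a root cpuset covering nodes 0-1 and a child covering only node 0,
clearing the hardwall bit lets the child reach node 1 through the root.
A node 2 that is missing from the root's mems_allowed (my guess for the
new node) is still refused either way, which is the funky part.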

What exactly are the kernel-internal places that need to allocate from
the coherent device node? When would this be done out of the context of
an application *asking* for memory in the new node?