Re: [PATCH] page_alloc: skip cpuset enforcement for lower zone allocations

From: Christoph Lameter
Date: Tue May 27 2014 - 11:32:09 EST


On Tue, 27 May 2014, Marcelo Tosatti wrote:

> >
> > Memory policies are only applied to a specific zone so this is not
> > unprecedented. However, if a user wants to limit allocation to a specific
> > node and there is no DMA memory there, then maybe that is an operator
> > error? After all, the application will be using memory from a node that
> > the operator explicitly wanted not to be used.
>
> Ok here is the use-case:
>
> - The machine contains a driver which requires zone-specific memory (such
> as KVM, which requires the root page table at paddr < 4GB).

GFP_KERNEL is used for page tables.

>
> * The second pass through get_page_from_freelist() doesn't even call
> * here for GFP_ATOMIC calls. For those calls, the __alloc_pages()
> * variable 'wait' is not set, and the bit ALLOC_CPUSET is not set
> * in alloc_flags. That logic and the checks below have the combined
> * effect that:
> *    in_interrupt - any node ok (current task context irrelevant)
> *    GFP_ATOMIC   - any node ok
> *    TIF_MEMDIE   - any node ok
> *    GFP_KERNEL   - any node in enclosing hardwalled cpuset ok

Page table allocations are GFP_KERNEL allocations. So the above use case
is ok if you switch off the hardwall flag in the cpuset: the allocation
may then fall back to any node allowed by the nearest enclosing hardwalled
cpuset.
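
To make the rule in the quoted comment concrete, here is a minimal
userspace sketch of the decision table. It is a model only, not the kernel
code: the struct names, field names and helper functions are all
hypothetical, and the single "hardwall" field stands in for a cpuset being
marked mem_exclusive or mem_hardwall.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a cpuset; not the kernel's struct cpuset. */
struct cpuset_model {
	const struct cpuset_model *parent; /* NULL for the root cpuset */
	unsigned long mems_allowed;        /* bitmask of allowed memory nodes */
	bool hardwall;                     /* assumed: mem_exclusive or mem_hardwall set */
};

/* Hypothetical description of one allocation request. */
struct alloc_ctx {
	bool in_interrupt;             /* no usable task context */
	bool atomic;                   /* models GFP_ATOMIC: caller cannot wait */
	bool memdie;                   /* models TIF_MEMDIE */
	const struct cpuset_model *cs; /* cpuset of the current task */
};

static bool node_in(const struct cpuset_model *cs, int node)
{
	return (cs->mems_allowed >> node) & 1UL;
}

/* Walk up until a hardwalled cpuset or the root is reached. */
static const struct cpuset_model *nearest_hardwall(const struct cpuset_model *cs)
{
	assert(cs != NULL);
	while (!cs->hardwall && cs->parent)
		cs = cs->parent;
	return cs;
}

/* Model of the table: may this request take memory from 'node'? */
static bool node_allowed(const struct alloc_ctx *ctx, int node)
{
	/* in_interrupt, GFP_ATOMIC, TIF_MEMDIE: any node ok */
	if (ctx->in_interrupt || ctx->atomic || ctx->memdie)
		return true;
	/* Node is in the task's own cpuset. */
	if (node_in(ctx->cs, node))
		return true;
	/* GFP_KERNEL: any node in the enclosing hardwalled cpuset ok */
	return node_in(nearest_hardwall(ctx->cs), node);
}
```

In this model, a task in a child cpuset restricted to node 1, under a root
that allows nodes 0 and 1, can still get a GFP_KERNEL-style allocation
from node 0 as long as the child's hardwall field is false. Setting it to
true confines the same request to node 1.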
