Re: [PATCH] page_alloc: skip cpuset enforcement for lower zone allocations (v4)

From: Andrew Morton
Date: Thu May 29 2014 - 18:40:39 EST


On Thu, 29 May 2014 15:43:03 -0300 Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:

>
> Zone specific allocations, such as GFP_DMA32, should not be restricted
> to cpusets allowed node list: the zones which such allocations demand
> might be contained in particular nodes outside the cpuset node list.
>
> Necessary for the following usecase:
> - driver which requires zone specific memory (such as KVM, which
> requires root pagetable at paddr < 4GB).
> - user wants to limit allocations of application to nodeX, and nodeX has
> no memory < 4GB.
>
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -2392,6 +2393,10 @@ int __cpuset_node_allowed_softwall(int node, gfp_t gfp_mask)
>
> if (in_interrupt() || (gfp_mask & __GFP_THISNODE))
> return 1;
> +#ifdef CONFIG_NUMA
> + if (gfp_zone(gfp_mask) < policy_zone)
> + return 1;
> +#endif

It's not very obvious why this code is doing what it does, so I'm
thinking a comment is needed. And that changelog text looks good, so

--- a/kernel/cpuset.c~page_alloc-skip-cpuset-enforcement-for-lower-zone-allocations-v4-fix
+++ a/kernel/cpuset.c
@@ -2388,6 +2388,11 @@ int __cpuset_node_allowed_softwall(int n
if (in_interrupt() || (gfp_mask & __GFP_THISNODE))
return 1;
#ifdef CONFIG_NUMA
+ /*
+ * Zone specific allocations such as GFP_DMA32 should not be restricted
+ * to cpusets allowed node list: the zones which such allocations
+ * demand might be contained in particular nodes outside the cpuset node list
+ */
if (gfp_zone(gfp_mask) < policy_zone)
return 1;
#endif
--- a/mm/page_alloc.c~page_alloc-skip-cpuset-enforcement-for-lower-zone-allocations-v4-fix
+++ a/mm/page_alloc.c
@@ -2742,6 +2742,11 @@ retry_cpuset:
cpuset_mems_cookie = read_mems_allowed_begin();

#ifdef CONFIG_NUMA
+ /*
+ * Zone specific allocations such as GFP_DMA32 should not be restricted
+ * to cpusets allowed node list: the zones which such allocations
+ * demand might be contained in particular nodes outside the cpuset node list
+ */
if (gfp_zone(gfp_mask) < policy_zone)
nodemask = &node_states[N_ONLINE];
#endif



However perhaps it would be nicer to do



#ifdef CONFIG_NUMA
/*
 * Zone specific allocations such as GFP_DMA32 should not be restricted to
 * cpusets allowed node list: the zones which such allocations demand might be
 * contained in particular nodes outside the cpuset node list
 */
static inline bool i_cant_think_of_a_name(gfp_t gfp_mask)
{
	return gfp_zone(gfp_mask) < policy_zone;
}
#else
static inline bool i_cant_think_of_a_name(gfp_t gfp_mask)
{
	return false;
}
#endif

This encapsulates it all in a single place and zaps those ifdefs?
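
For illustration only, here is a rough userspace sketch of how the two call
sites might read once such a helper exists. Everything in it is a stand-in:
the gfp_t typedef, the flag values, the simplified gfp_zone(), the helper
name gfp_zone_below_policy() and both *_sketch() functions are assumptions
made up for this example, not the kernel's definitions, and the real
__cpuset_node_allowed_softwall() does considerably more than this.

```c
#include <stdbool.h>

/* Userspace stand-ins for kernel types and flags (assumptions). */
typedef unsigned int gfp_t;
enum zone_type { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL };

#define __GFP_DMA	0x01u
#define __GFP_DMA32	0x04u

/* Assumed here: policy applies from ZONE_NORMAL upwards. */
static enum zone_type policy_zone = ZONE_NORMAL;

/* Heavily simplified stand-in for the kernel's gfp_zone(). */
static enum zone_type gfp_zone(gfp_t gfp_mask)
{
	if (gfp_mask & __GFP_DMA)
		return ZONE_DMA;
	if (gfp_mask & __GFP_DMA32)
		return ZONE_DMA32;
	return ZONE_NORMAL;
}

/* Hypothetical name for the helper; the mail leaves it unnamed. */
static inline bool gfp_zone_below_policy(gfp_t gfp_mask)
{
	return gfp_zone(gfp_mask) < policy_zone;
}

/* Shaped like the cpuset.c hunk: no #ifdef at the call site. */
static int node_allowed_sketch(bool in_irq, gfp_t gfp_mask,
			       bool node_in_cpuset)
{
	if (in_irq)
		return 1;
	if (gfp_zone_below_policy(gfp_mask))
		return 1;
	return node_in_cpuset ? 1 : 0;
}

/* Shaped like the page_alloc.c hunk: widen the nodemask for low zones. */
static const unsigned long *
pick_nodemask_sketch(gfp_t gfp_mask, const unsigned long *caller_mask,
		     const unsigned long *online_mask)
{
	return gfp_zone_below_policy(gfp_mask) ? online_mask : caller_mask;
}
```

With !CONFIG_NUMA the helper would simply return false, so both call sites
fall through to their existing behaviour without any ifdefs around them.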