Re: [BUGFIX][PATCH] oom-kill: fix NUMA constraint check with nodemask v3

From: David Rientjes
Date: Tue Nov 10 2009 - 22:14:40 EST


On Wed, 11 Nov 2009, KOSAKI Motohiro wrote:

> > > {
> > > -#ifdef CONFIG_NUMA
> > > struct zone *zone;
> > > struct zoneref *z;
> > > enum zone_type high_zoneidx = gfp_zone(gfp_mask);
> > > - nodemask_t nodes = node_states[N_HIGH_MEMORY];
> > > + int ret = CONSTRAINT_NONE;
> > >
> > > - for_each_zone_zonelist(zone, z, zonelist, high_zoneidx)
> > > - if (cpuset_zone_allowed_softwall(zone, gfp_mask))
> > > - node_clear(zone_to_nid(zone), nodes);
> > > - else
> > > + /*
> > > + * The nodemask here is a nodemask passed to alloc_pages(). Now,
> > > + * cpuset doesn't use this nodemask for its hardwall/softwall/hierarchy
> > > + * feature. mempolicy is an only user of nodemask here.
> > > + */
> > > + if (nodemask) {
> > > + nodemask_t mask;
> > > + /* check mempolicy's nodemask contains all N_HIGH_MEMORY */
> > > + nodes_and(mask, *nodemask, node_states[N_HIGH_MEMORY]);
> > > + if (!nodes_equal(mask, node_states[N_HIGH_MEMORY]))
> > > + return CONSTRAINT_MEMORY_POLICY;
> > > + }
> >
> > Although a nodemask_t was previously allocated on the stack, we should
> > probably change this to use NODEMASK_ALLOC() for kernels with higher
> > CONFIG_NODES_SHIFT since allocations can happen very deep into the stack.
>
> No. NODEMASK_ALLOC() is crap. we should remove it.

I've booted 1K-node systems and have found it helpful for ensuring that
the stack does not overflow, especially in areas where we are normally
deep already, such as the page allocator.
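
To make the size concern concrete, here is a minimal userspace sketch, not
kernel code. It assumes a node shift of 10 (1024 possible nodes), which
makes each mask 128 bytes. Every sketch_* name is hypothetical and exists
only for this illustration. malloc() stands in for whatever allocation the
kernel would use, and the check loosely mirrors the nodes_and() and
nodes_equal() pair in the quoted patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Assumption for illustration: 2^10 = 1024 possible nodes. */
#define SKETCH_NODES_SHIFT   10
#define SKETCH_MAX_NODES     (1UL << SKETCH_NODES_SHIFT)
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_LONGS \
	((SKETCH_MAX_NODES + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Hypothetical stand-in for a node bitmap: 128 bytes at 1024 nodes. */
struct sketch_nodemask {
	unsigned long bits[SKETCH_LONGS];
};

/* Mark one node as present in the mask. */
static void sketch_node_set(struct sketch_nodemask *m, unsigned int node)
{
	assert(node < SKETCH_MAX_NODES);
	m->bits[node / SKETCH_BITS_PER_LONG] |=
		1UL << (node % SKETCH_BITS_PER_LONG);
}

/*
 * Return 1 if every node in 'required' is also set in 'allowed',
 * 0 if at least one is missing, and -1 if the scratch mask could
 * not be allocated. The scratch mask lives on the heap so that a
 * deep call chain does not pay 128 bytes of stack for it.
 */
static int sketch_covers_all(const struct sketch_nodemask *allowed,
			     const struct sketch_nodemask *required)
{
	struct sketch_nodemask *tmp = malloc(sizeof(*tmp));
	size_t i;
	int ret;

	if (!tmp)
		return -1;

	/* tmp = allowed & required */
	for (i = 0; i < SKETCH_LONGS; i++)
		tmp->bits[i] = allowed->bits[i] & required->bits[i];

	/* Covered only if the intersection equals 'required'. */
	ret = memcmp(tmp->bits, required->bits, sizeof(tmp->bits)) == 0;

	free(tmp);
	return ret;
}
```

The trade-off is that the heap variant can fail and costs an allocation on
every call, while a stack mask is free but grows with the node shift.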

> btw, CPUMASK_ALLOC was already removed.

I don't remember CPUMASK_ALLOC() actually being merged. I know the
comment exists in nodemask.h, but I don't recall any CPUMASK_ALLOC() users
in the tree.