Re: [patch 2/2] cpusets: add interleave_over_allowed option

From: Lee Schermerhorn
Date: Fri Oct 26 2007 - 17:14:28 EST


On Fri, 2007-10-26 at 13:45 -0700, David Rientjes wrote:
> On Fri, 26 Oct 2007, Paul Jackson wrote:
>
> > Without at least this sort of change to MPOL_INTERLEAVE nodemasks,
> > allowing either empty nodemasks (Lee's proposal) or extending them
> > outside the current cpuset (what I'm cooking up now), there is no way
> > for a task that is currently confined to a single node cpuset to say
> > anything about how it wants to be interleaved in the event that it is
> > subsequently moved to a larger cpuset. Currently, such a task is only
> > allowed to pass exactly one particular nodemask to set_mempolicy
> > MPOL_INTERLEAVE calls, with exactly the one bit corresponding to its
> > current node. No useful information can be passed via an API that only
> > allows a single legal value.
> >
>
> Well, passing a single node to set_mempolicy() for MPOL_INTERLEAVE doesn't
> make a whole lot of sense in the first place. I prefer your solution of
> allowing set_mempolicy(MPOL_INTERLEAVE, NODE_MASK_ALL) to mean "interleave
> me over everything I'm allowed to access." NODE_MASK_ALL would be stored
> in the struct mempolicy and used later on mpol_rebind_policy().

You don't need to save the entire mask--just note that NODE_MASK_ALL was
passed, as my internal MPOL_CONTEXT flag does. This would involve
special-casing NODE_MASK_ALL in the error checking, since set_mempolicy()
currently rejects any nodemask that contains non-allowed nodes--see
contextualize_policy(). [mbind(), on the other hand, appears to allow
any nodemask, even one outside the cpuset; I guess we catch this during
allocation.] This is pretty much the spirit of my patch without the API
change/extension [/improvement :)]
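
A minimal userspace sketch of the "note it, don't store it" idea, under
stated assumptions: the nodemask is modelled as one 64-bit word, and
struct interleave_policy, policy_set() and policy_rebind() are
hypothetical names for illustration, not the kernel's struct mempolicy,
set_mempolicy() or mpol_rebind_policy(). The rebind for an explicit mask
is simplified to an intersection.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: one bit per node, at most 64 nodes. */
typedef uint64_t nodemask_t;
#define NODE_MASK_ALL (~(nodemask_t)0)

/* Hypothetical stand-in for the interleave part of a memory policy. */
struct interleave_policy {
	bool all_allowed;	/* caller passed NODE_MASK_ALL */
	nodemask_t nodes;	/* effective interleave set */
};

/* Model of setting an interleave policy inside a cpuset 'allowed'. */
int policy_set(struct interleave_policy *p, nodemask_t mask,
	       nodemask_t allowed)
{
	if (mask == NODE_MASK_ALL) {
		/* Special case: remember the request, not the mask. */
		p->all_allowed = true;
		p->nodes = allowed;
		return 0;
	}
	/* Otherwise keep the existing rule: no empty or non-allowed masks. */
	if (mask == 0 || (mask & ~allowed))
		return -EINVAL;
	p->all_allowed = false;
	p->nodes = mask;
	return 0;
}

/* Model of rebinding after the task moves to a different cpuset. */
void policy_rebind(struct interleave_policy *p, nodemask_t new_allowed)
{
	if (p->all_allowed)
		p->nodes = new_allowed;		/* follow the cpuset */
	else
		p->nodes &= new_allowed;	/* simplified; not a remap */
}
```

With this model, a task confined to node 0 that passes NODE_MASK_ALL
ends up interleaved over all four nodes once it is moved to a four-node
cpuset, while a task that passed an explicit mask does not grow.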

For some systems [not mine], the nodemasks can get quite large. I have
a patch, tested atop Mel Gorman's "onezonelist" patches, that replaces
the nodemasks embedded in struct mempolicy with pointers to dynamically
allocated ones. However, it's probably not much of a win, memory-wise,
if most of the uses are for interleave and bind policies--both of which
would always need the nodemasks in addition to the pointers.
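
A rough sketch of that trade-off, counting only the nodemask storage per
policy. The helper names are hypothetical and the figures ignore
allocator overhead and the rest of the policy structure.

```c
#include <stddef.h>

/* Bytes for a bitmap of max_nodes bits, rounded up to whole longs. */
size_t nodemask_bytes(size_t max_nodes)
{
	size_t bits_per_long = 8 * sizeof(unsigned long);

	return (max_nodes + bits_per_long - 1) / bits_per_long
		* sizeof(unsigned long);
}

/* Embedded nodemask: every policy pays for the full mask. */
size_t embedded_cost(size_t max_nodes)
{
	return nodemask_bytes(max_nodes);
}

/*
 * Pointer to a dynamically allocated nodemask: every policy pays for
 * the pointer, and policies that need a mask (interleave, bind) pay
 * for the mask as well.
 */
size_t pointer_cost(size_t max_nodes, int needs_mask)
{
	return sizeof(void *) + (needs_mask ? nodemask_bytes(max_nodes) : 0);
}
```

For a 1024-node configuration the mask is 128 bytes, so the pointer
variant only saves memory for policies that carry no mask at all; an
interleave or bind policy costs the mask plus the pointer.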

Now, if we could replace the 'cpuset_mems_allowed' nodemask with a
pointer to something stable, it might be a win.

Lee

