Re: [patch 2/2] cpusets: add interleave_over_allowed option

From: David Rientjes
Date: Sat Oct 27 2007 - 13:50:59 EST


On Fri, 26 Oct 2007, Paul Jackson wrote:

> > Yes. We should default to Choice B. Add an option MPOL_MF_RELATIVE to
> > enable that functionality? A new version of numactl can then enable
> > that by default for newer applications.
>
> I'm confused. If B is the default, then we don't need a flag to
> enable it, rather we need a flag to go back to the old choice A.
>

I think there's a mixup in the flag name there, but I actually would
recommend against any flag to effect Choice A. It's simply going to be
too complex to describe and is going to be a headache to code and support.

The MPOL_PREFERRED behavior when constrained by cpusets was previously, to
my knowledge, undocumented; you're in the position to make the behavior do
what you want it to do and then release documentation so we'll finally
have a complete and unambiguous API for it. Right now it should be
considered undefined and thus you are free to implement it as you choose.
Then all callers of set_mempolicy(MPOL_PREFERRED) will standardize on that
and not have to worry about the machine's
mpol_preferred_relative_to_cpuset setting.

Then, any task that is attached to a cpuset and expecting the fourth node
in their set_mempolicy(MPOL_PREFERRED) call to mean system node 3 if
it's in the cpuset's mems_allowed will be broken. If you want that,
you'll need your task to be attached to a cpuset with at least mems 0-3;
programmers will pick that up quickly enough if it's clearly documented.
I think Choice B is correct and makes more sense in terms of the semantics
and at least allows mempolicies and cpusets to play nicely together
without a bidirectional dependency on one another.

David
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/