Re: [RFC] cpuset: remove sched domain hooks from cpusets
From: Paul Jackson
Date: Sat Oct 21 2006 - 01:38:22 EST
> And whenever a child cpuset sets this use_cpus_exclusive flag, remove
> those set of child cpuset cpus from parent cpuset and also from the
> tasks which were running in the parent cpuset. We can probably allow this
> to happen as long as parent cpuset has atleast one cpu.
Why are you seeking out obfuscated means and gratuitous entanglements
with cpuset semantics, in order to accomplish a straight forward end -
defining the sched domain partitioning?
If we are going to add or modify the meaning of per-cpuset flags in
order to determine sched domain partitioning, then we should do so in
the most straight forward way possible, which by all accounts seems to
be adding a 'sched_domain' flag to each cpuset, indicating whether it
delineates a sched domain partition. The kernel would enforce a rule
that the CPUs in the cpusets so marked could not overlap. The kernel
in return would promise not to split the CPUs in any cpuset so marked
into more than one sched domain partition, with the consequence that
the kernel would be able to load balance across all the CPUs contained
within any such partition.
Why do something less straightforward than that?
Meanwhile ...
If the existing cpuset semantics implied a real and useful partitioning
of the systems CPUs, as Nick had been figuring it did, then yes it
might make good sense to implicitly and automatically leverage this
cpuset partitioning when partitioning the sched domains.
But cpuset semantics, quite deliberately on my part, don't imply any
such system wide partitioning of CPUs.
So one of:
1) we don't need sched domain partitioning after all, or
2) this sched domain partitioning takes on a hierarchical nested shape
that fits better with cpusets, or
3) we provide a "transparent, simple and obvious" API to user space,
so it can define the sched domain partitioning.
And in any case, we should first take a look at the rest of this sched
domain code, as I have been led to believe it provides some nice
opportunities for refinement, before we go fussing over the details of
any kernel-user API's it might need.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@xxxxxxx> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/