Paul M wrote:
they will now have to unset it in the 'boot' set as well.
That can break existing userspace, so I presume PaulJ isn't in favour
of this change.

You're right - I don't favor it.

Hmm, I think we're mixing two different threads here.

Using the 'cpus' in one or more cpusets to determine both:
1) which CPUs can receive an irq, and
2) resolving conflicts in such irq placement,
excessively overloads the cpuset hierarchy, breaking existing
userspace, as Paul M notes.

I'm not sure #2 is a concern. With the latest cpuset irq handling patches, conflict resolution is very simple: "irq can belong to a single cpuset at a time".

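The rule "irq can belong to a single cpuset at a time" can be pictured with a toy model. This is only a sketch: the class and method names are hypothetical and nothing here corresponds to the actual patches or to any kernel interface. It just shows that attaching an irq to a new cpuset implicitly detaches it from the old one, so two cpusets never claim the same irq.

```python
class IrqPlacement:
    """Toy model (hypothetical): each irq is attached to at most one cpuset."""

    def __init__(self):
        self._cpuset_of = {}  # irq number -> cpuset path

    def attach(self, irq, cpuset):
        """Move irq into cpuset; return the cpuset it left, or None."""
        previous = self._cpuset_of.get(irq)
        self._cpuset_of[irq] = cpuset
        return previous

    def irqs_in(self, cpuset):
        """List the irqs currently attached to cpuset, in ascending order."""
        return sorted(i for i, c in self._cpuset_of.items() if c == cpuset)


placement = IrqPlacement()
placement.attach(17, "/dev/cpuset/A")
moved_from = placement.attach(17, "/dev/cpuset/B")  # leaves A, joins B
```

Under this model there is nothing to resolve, but an irq also cannot be directed to the CPUs of both A and B without a cpuset that covers both.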
If you don't have any other cpuset hierarchy you need to use, and
so don't really otherwise care what your cpuset hierarchy is, then
I suppose this works just fine.

But if you also need to use the cpuset hierarchy to define nested
subsets of CPUs and Memory Nodes, for the purposes of controlling
which tasks can run where (the original and still primary motivation
for cpusets) then one can only conveniently specify those trivial
irq configurations that happen to exactly conform with that hierarchy
(that exactly want to make use of some of the same sets of CPUs, and
that don't depend on the hierarchy to resolve conflicts in overlapping
irq directives).

I do not think we need overlapping irq directives.

Almost any non-trivial use of cpusets for both irq directivity and CPU
and Memory placement would complicate both hierarchies, forcing
unending confusion and breakage on the existing cpuset users.

I'm not sure what breakage you're talking about. But let's talk examples, I guess. See below.

Some examples:

Let's say I have three cpusets defining the CPU and Memory Node
sets in which I want to place my tasks:
    /dev/cpuset/A
    /dev/cpuset/B
    /dev/cpuset/C
and I want a particular set of irqs to be directed to the CPUs in A
and B, but not C. Well -- guess I can duplicate the irqs settings.
But don't tell me to use a 'boot' cpuset, as in:
    /dev/cpuset/boot/A
    /dev/cpuset/boot/B
    /dev/cpuset/C
to accomplish this, as that intrudes in the hierarchy, breaking
user code.

If my irq isolation needs don't exactly partition along the
'cpus' settings in A, B and C, then not even duplication helps.

If the 'irqs' in /dev/cpuset/A/Z (where Z's cpus are a proper
subset of A's) don't match the 'irqs' in /dev/cpuset/A, then I
have further confusions resulting from conflicting irq directives.

How is that any different from tasks? Exact same example right back at you.
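
The first example above, where a 'boot' parent is inserted over A and B, can be made concrete with a small sketch. It only manipulates path strings and touches no real cpuset filesystem; the helper name `cpuset_paths` and the nested-dict layout are assumptions made for illustration.

```python
from posixpath import join

MOUNT = "/dev/cpuset"  # mount point used in the examples above


def cpuset_paths(layout):
    """Flatten a nested dict of cpuset names into sorted absolute paths."""
    paths = []

    def walk(prefix, node):
        for name, children in node.items():
            path = join(prefix, name)
            paths.append(path)
            walk(path, children)

    walk(MOUNT, layout)
    return sorted(paths)


original = {"A": {}, "B": {}, "C": {}}
with_boot = {"boot": {"A": {}, "B": {}}, "C": {}}

# Paths that existing user code may have hard-coded but that no longer
# exist once A and B are moved under 'boot'.
broken = sorted(set(cpuset_paths(original)) - set(cpuset_paths(with_boot)))
print(broken)  # ['/dev/cpuset/A', '/dev/cpuset/B']
```

Any script that opens /dev/cpuset/A directly would fail after the change, which is the sense in which the extra level intrudes on the hierarchy.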

(If your proposal handles all the above, without forcing changes
on the cpuset hierarchy, then I misread it - in that case, sorry.)

It does not force any changes. irqs are handled just like tasks, and if people have complex partitioning requirements they may have to use more complicated hierarchies.

Paul M has already proposed pulling apart the binding of CPUs and
Memory Nodes, in the underlying cgroups, as he apparently has cases in
which the legacy connection of those two into a single cpuset hierarchy
is an undesired constraint on (complication of) the hierarchy.

That's more likely the direction in which we should be proceeding --
making these hierarchies independent, not entwining them.

This additional overloading of the current cpuset hierarchy might
handle the simple case you need. But that's only because you don't
have conflicting needs for the cpuset hierarchy.

Hopefully, Paul M will be able to view with some sense of humor that I
am complaining that this proposal of yours (and Peter Z's earlier
patches) isn't general enough, even as I have complained of some
other recent cgroup proposals of Paul M that their increased
generality isn't sufficient to justify their subtle incompatibilities.

At a minimum, as in my proposal (http://lkml.org/lkml/2008/3/6/512) of
last week, one needs some mechanism independent of the cpuset hierarchy
to resolve conflicts in these irq directives. As you may recall,
that proposal named each set of irqs, let each cpuset specify which
named set of irqs applied to its CPUs, and encoded the precedence N
of each named list of irqs in the filename '/dev/cpuset/irqs.N.name'
of the file listing the irqs in that named set. Then one can specify
irqs for each cpuset, and have some way to specify the precedence of
these irq specifications, without overloading the cpuset hierarchy.

Even this minimum proposal might be insufficient, if one has needs
to specify irq directives for sets of CPUs that are not otherwise
present in the cpuset hierarchy. Observe that this proposal does
not handle the next to the last example case above. I am not yet
convinced that this deficiency is a show stopper. It might be.

That (i.e. additional sets of irqs) seems like a major overkill to me.
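
As a rough illustration of the 'irqs.N.name' scheme described above, the sketch below parses such filenames and picks one named set per irq by precedence. Several things here are assumptions, not part of the proposal as quoted: that a larger N wins, that ties go to the first set seen, and the example set names, irq numbers and CPU numbers. The function names are hypothetical too.

```python
import re

IRQ_FILE = re.compile(r"^irqs\.(\d+)\.(.+)$")  # matches 'irqs.N.name'


def parse_irq_sets(files):
    """Map set name -> (precedence N, set of irqs) from filename -> irq list."""
    sets = {}
    for fname, irqs in files.items():
        m = IRQ_FILE.match(fname)
        if m:  # other cpuset files are ignored
            sets[m.group(2)] = (int(m.group(1)), set(irqs))
    return sets


def resolve(sets):
    """Pick one owning set per irq. ASSUMPTION: larger N takes precedence."""
    owner = {}
    for name, (prec, irqs) in sets.items():
        for irq in irqs:
            if irq not in owner or prec > sets[owner[irq]][0]:
                owner[irq] = name
    return owner


def irq_affinity(owner, cpuset_irq_set, cpuset_cpus):
    """Map irq -> CPUs, given which named set each cpuset selected."""
    affinity = {}
    for cpuset, set_name in cpuset_irq_set.items():
        for irq, name in owner.items():
            if name == set_name:
                affinity.setdefault(irq, set()).update(cpuset_cpus[cpuset])
    return affinity


# Made-up example data.
files = {"irqs.1.default": [16, 17, 18], "irqs.5.net": [17, 18], "cpus": []}
owner = resolve(parse_irq_sets(files))
affinity = irq_affinity(owner, {"A": "net", "C": "default"},
                        {"A": {0, 1}, "C": {2, 3}})
```

The point of the sketch is that the conflict over irqs 17 and 18 is settled by N alone, without consulting where A and C sit in the cpuset hierarchy.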

The other direction considered, making this its own cgroup, -seemed-
to fail as well, as someone, I forget whom, noted. Cgroups attach
tasks to sets of things. We aren't trying to attach tasks to anything.
We're trying to attach irqs to CPUs. We are trying now to treat irqs
as 'pseudo-tasks', but that forces the irq hierarchy to be a subset
of the CPU hierarchy, due to overloading the 'cpus' set. This is the
problem noted above.

Paul M -- could we take a different tack here -- extend cgroups to map
-either- tasks or irqs to the managed resources? Then irqs would be
managed by a cgroup hierarchy that mapped irqs to a subsystem specific
attribute of 'cpus' (resembling the cpuset 'cpus'). If the hierarchy
one needed for irqs was a nice subset of one's cpuset hierarchy, one
might even mount both cgroup subsystems on the same mount, so long
as we could work out what it means for two cgroup subsystems to share
the same subsystem specific attribute, 'cpus' in this case.

Hold on. How does this help if at the end of the day 'cpus' are still shared between the irq and task groups? We'd still have the exact same constraints.