On Wed, 2004-10-06 at 19:13, Nick Piggin wrote:
> Matthew Dobson wrote:
>> This should allow us to support hotplug more easily, simply removing
>> the domain belonging to the going-away CPU, rather than throwing away
>> the whole domain tree and rebuilding from scratch.
>
> Although what we have in -mm now should support CPU hotplug just fine.
> The hotplug guys really seem not to care how disruptive a hotplug
> operation is.

I wasn't trying to imply that CPU hotplug isn't supported right now. But
it is currently a very disruptive operation, throwing away the entire
sched_domains & sched_groups tree and then rebuilding it from scratch
just to remove a single CPU! I also understand that CPU hotplug is
supposed to be a rare event, but that doesn't mean it *has* to be a
slow, disruptive event. :)
>> This should also allow us to support multiple, independent (ie: no
>> shared root) domain trees, which will facilitate isolated CPU groups
>> and exclusive domains.
>
> Hmm, what was my word for them... yeah, disjoint. We can do that now,
> see isolcpus= for a subset of the functionality you want (doing larger
> exclusive sets would probably just require we run the setup code once
> for each exclusive set we want to build).

The current code doesn't, to my knowledge, support multiple isolated
domains. You can set up a single 'isolated' group with boot-time
options, but you can't set up *multiple* isolated groups, nor is there
any way to do partitioning/isolation at runtime. This was more of the
motivation for my code than the hotplug simplification; that was more of
a side benefit.
>> I also hope this will allow us to leverage the existing topology
>> infrastructure to build domains that closely resemble the physical
>> structure of the machine automagically, thus making it easier to
>> support interesting NUMA and SMT machines.
>>
>> This patch is just a snapshot in the middle of development, so there
>> are certainly some uglies & bugs that will get fixed. That said, any
>> comments about the general design are strongly encouraged. Heck, any
>> feedback at all is welcome! :)
>>
>> Patch against 2.6.9-rc3-mm2.
>
> This is what I did in my first (that nobody ever saw) implementation
> of sched domains. Ie. no sched_groups, just use sched_domains as the
> balancing object... I'm not sure this works too well.
> For example, your bottom-level domain is going to basically be a
> redundant, single-CPU domain on most topologies, isn't it?
>
> Also, how will you do the overlapping domains that SGI want to do (see
> arch/ia64/kernel/domain.c in -mm kernels)?
>
> node2 wants to balance between node0, node1, itself, node3, node4.
> node4 wants to balance between node2, node3, itself, node5, node6.
> etc.
>
> I think your lists will get tangled, no?
Yes. I have to put my thinking cap on snug, but I don't think my version
would support this kind of setup. It sounds, from Jesse's follow-up to
your mail, like this is not a requirement, though. I'll take a closer
look at the IA64 code and see whether it would be supported, or whether
I could make some small changes to support it.
Thanks for the feedback!!