Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement

From: Matthew Dobson
Date: Mon Oct 11 2004 - 18:07:06 EST


On Fri, 2004-10-08 at 17:18, Nick Piggin wrote:
> Matthew Dobson wrote:
> > I think this example is easily achievable with the sched_domains
> > modifications I am proposing. You can still create your 128 CPU
> > exclusive domain, called big_domain (due to my lack of naming
> > creativity), and further divide big_domain into smaller, non-exclusive
> > sched_domains. We do this all the time, albeit statically at boot time,
> > with the current sched_domains code. When we create a 4-node domain on
> > IA64, and underneath it we create 4 1-node domains, we've
> > partitioned the system into 4 sched_domains, each containing 4 cpus.
> > Balancing between these 4 node-level sched_domains is allowed, but can
> > be disallowed by not setting the SD_LOAD_BALANCE flag. Your example
> > does show that it can be more than just a convenient way to group tasks,
> > but your example can be done with what I'm proposing.
>
> You wouldn't be able to do this just with sched domains, because
> it doesn't know anything about individual tasks. As soon as you
> have some overlap, all your tasks can escape out of your domain.
>
> I don't think there is a really nice way to do overlapping sets.
> Those that want them need to just use cpu affinity for now.

Well, the tasks can escape out of the domain iff you have the
SD_LOAD_BALANCE flag set on that domain. If SD_LOAD_BALANCE isn't set,
then when the scheduler tick goes off, and the code looks at the domain,
it will see the lack of the flag and will not attempt to balance the
domain, correct? This is what we currently do with the 'isolated'
domains, right?
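
To make the question concrete, here is a minimal userspace sketch of the
check I am describing. It is a toy model and not the kernel's code: the
struct, the helper names and the flag value are all made up for
illustration, and only the SD_LOAD_BALANCE name comes from the real
sched_domains code. It assumes a balancer that walks from a CPU's
smallest domain up through its parents and skips any level that lacks
the flag.

```c
#include <assert.h>
#include <stddef.h>

/* Flag name from this thread; the numeric value is an assumption. */
#define SD_LOAD_BALANCE 0x1u

/* Hypothetical, heavily simplified stand-in for a sched_domain. */
struct toy_domain {
    const char *name;
    unsigned int flags;
    struct toy_domain *parent;  /* next wider domain, or NULL at the top */
};

/* Nonzero if balancing is permitted across this level. */
static int toy_can_balance(const struct toy_domain *sd)
{
    assert(sd != NULL);
    return (sd->flags & SD_LOAD_BALANCE) != 0;
}

/*
 * Walk from the smallest domain to the widest, the way a tick-time
 * balancer might, and return the widest level that still has
 * SD_LOAD_BALANCE set. NULL means no level allows balancing, so in
 * this model tasks never leave the CPU they are on.
 */
static const struct toy_domain *
toy_widest_balanced(const struct toy_domain *sd)
{
    const struct toy_domain *widest = NULL;

    for (; sd != NULL; sd = sd->parent) {
        if (toy_can_balance(sd))
            widest = sd;    /* flag present: tasks may move this far */
    }
    return widest;
}
```

With a 1-node domain that has the flag set, under a big_domain that has
it cleared, toy_widest_balanced() returns the 1-node domain, so in this
model nothing is pulled across the big_domain boundary.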

You're right that you can get some of the more obscure semantics of the
various flavors of cpusets by leveraging sched_domains AND
cpus_allowed. I don't have any desire to remove that ability; I just
want to keep it as the exception rather than the rule.
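
As a rough illustration of the sched_domains AND cpus_allowed
combination, the sketch below models both as plain 64-bit masks. The
type and function names are hypothetical, and the 64-CPU limit is only a
simplification. The assumption is that a task may be placed on a CPU
only if that CPU is both inside the span of the domain being balanced
and in the task's cpus_allowed mask.

```c
#include <assert.h>
#include <stdint.h>

/* Toy cpumask: bit n set means CPU n is a member (64 CPUs at most). */
typedef uint64_t toy_cpumask;

/*
 * Nonzero if a task with the given cpus_allowed mask may be placed on
 * 'cpu' while balancing a domain that spans 'domain_span'.
 */
static int toy_may_run_on(toy_cpumask domain_span,
                          toy_cpumask cpus_allowed, int cpu)
{
    assert(cpu >= 0 && cpu < 64);
    toy_cpumask bit = (toy_cpumask)1 << cpu;

    /* The CPU must be in the domain AND allowed for this task. */
    return (domain_span & cpus_allowed & bit) != 0;
}
```

Take two overlapping domains, A spanning CPUs 0-3 (mask 0x0F) and B
spanning CPUs 2-5 (mask 0x3C). A task sitting on CPU 2 with
cpus_allowed = 0x0F can be moved within B to CPU 3, but
toy_may_run_on(0x3C, 0x0F, 4) is 0, so it cannot follow B out to CPU 4.
Without the affinity mask, the overlap would let it leave A.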

-Matt
