Re: [ckrm-tech] Re: [Lse-tech] [RFC PATCH] scheduler: Dynamic sched_domains

From: Hubertus Franke
Date: Fri Oct 08 2004 - 11:01:11 EST




Nick Piggin wrote:

Erich Focht wrote:

more flexibility in building the sched_domains is badly needed, so
your effort towards providing this is the right step. I'm not sure
yet whether your big change is really (and already) a simplification,
but what you described sounded for me like getting the chance to
configure the sched_domains at runtime, dynamically, from user
space. I didn't notice any user interface in your patch, or overlooked
it. Could you please describe the API you had in mind for that?


OK, what we have in -mm is already close to what we need to do
dynamic building. But let's explore the other topic. User interface.

First of all, I think it may be easiest to allow the user to specify
which cpus belong to which exclusive domains, and have them otherwise
built in the shape of the underlying topology. So for example if your
domains look like this (excuse the crappy ascii art):

0 1 2 3 4 5 6 7
--- --- --- --- <- domain 0
| | | |
------ ------ <- domain 1
| |
---------- <- domain 2 (global)

And so you want to make a partition with CPUs {0,1,2,4,5}, and {3,6,7}
for some crazy reason, the new domains would look like this:

0 1 2 4 5 3 6 7
--- - --- - --- <- 0
| | | | |
----- - - - <- 1
| | | |
------- ----- <- 2 (global, partitioned)

Agreed? You don't need to get fancier than that, do you?

Then how to input the partitions... you could have a sysfs entry that
takes the complete partition info in the form:

0,1,2,3 4,5,6 7,8 ...

Pretty dumb and simple.


Agreed, what we are thinking is that the CKRM API can be used for that.
Each domain is a class build of resources (cpus,mem).
You use the config interface of CKRM to specify which cpu/mem belongs
to the class. The underlying controller verifies it.

For a first approximation, classes that have config constraints specified this way will not be allowed to set shares. In sched_domain
terms it would mean that if the sched_domain is not balancable with its
siblings then it forms an exclusive domain. Under the exclusive
class one can continue with the hierarchy that will allow share settings.

So from an API issue this certainly looks feasible, maybe even clean.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/