Re: [ckrm-tech] [RFC] Resource Management - Infrastructure choices

From: Paul Menage
Date: Tue Nov 07 2006 - 14:08:32 EST


On 11/7/06, Paul Jackson <pj@xxxxxxx> wrote:

> I see the following usage patterns -- I wonder if we can see a way to
> provide for all these. I will speak in terms of just cpusets and
> resource groups, as exemplars of the variety of controllers that might
> make good use of Paul M's containers:
>
> Could we (Paul M?) find a way to build a single kernel that supports:
>
> 1) Someone just using cpusets wants to do:
>         mount -t cpuset cpuset /dev/cpuset
>    and then see the existing cpuset API. Perhaps other files show
>    up in the cpuset directories, but at least all the existing
>    ones provided by the current cpuset API, with their existing
>    behaviours, are all there.

This will happen if you configure CONFIG_CPUSETS_LEGACY_API
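
For reference, the existing cpuset API that has to keep working there is
roughly the following (the group name is arbitrary, the file names are
the current cpuset ones):

    mount -t cpuset cpuset /dev/cpuset
    mkdir /dev/cpuset/set1
    echo 2-3 > /dev/cpuset/set1/cpus    # give it CPUs 2 and 3
    echo 0 > /dev/cpuset/set1/mems      # and memory node 0
    echo $$ > /dev/cpuset/set1/tasks    # move this shell into it

Whatever container glue ends up underneath, those names and semantics
need to stay visible at the same paths.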


> 2) Someone wanting a good CKRM/ResourceGroup interface, doing
>    whatever those fine folks are wont to do, binding some other
>    resource group controller to a container hierarchy.

This works now.


> 3) Someone, in the future, wanting to "bind" cpusets and resource
>    groups together, with a single container based name hierarchy
>    of sets of tasks, providing both the cpuset and resource group
>    control mechanisms. Code written for (1) or (2) should work,
>    though there is a little wiggle room for API 'refinements' if
>    need be.

That works now.
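
(Concretely, with the current patches that's a single container mount;
roughly, and with the resource-group file name below invented purely
for illustration:

    mount -t container container /dev/container
    mkdir /dev/container/mygroup
    ls /dev/container/mygroup
    cpus  mems  tasks  rg_cpu.shares  ...

i.e. one directory tree, where each bound controller contributes its
own control files to every node.)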


> 4) Someone doing (1) and (2) separately and independently on the
>    same system at the same time, with separate and independent
>    partitions (aka container hierarchies) of that system's tasks.

Right now you can't have multiple independent hierarchies - each
subsystem either has the same hierarchy as all the other subsystems,
or has just a single node and doesn't participate in the hierarchy.


> The initial customer needs are for (1), which preserves an existing
> kernel API, and, on separate systems, for (2). Providing for both on
> the same system, as in (3) with a single container hierarchy or even
> (4) with multiple independent hierarchies, is an enhancement.
>
> I foresee a day when user-level software, such as batch schedulers, is
> written to take advantage of (3), once the kernel supports binding
> multiple controllers to a common task container hierarchy. Initially,
> some systems will need cpusets, some will need resource groups, and the
> set of systems requiring both, whether bound as in (3) or independent
> as in (4), will be pretty much empty.

I don't know about group (4), but we certainly have a big need for (3).


> In general then, we will have several controllers (need a good way
> for user space to list what controllers, such as cpusets and resource
> groups,

I think it's better to treat resource groups as a common framework for
resource controllers, rather than as a resource controller in itself.
Otherwise we'll have the same issues of wanting to treat separate
resources in separate hierarchies - by treating each RG controller as
a separate entity sharing a common resource metaphor and user API, you
get the multiple hierarchy support for free.


> Perhaps the interface to binding multiple controllers to a single container
> hierarchy is via multiple mount commands, each of type 'container', with
> different options specifying which controller(s) to bind. Then the
> command 'mount -t cpuset cpuset /dev/cpuset' gets remapped to the command
> 'mount -t container -o controller=cpuset /dev/cpuset'.

Yes, that's the approach that I'm thinking of currently. It should
require fairly robotic changes to the existing code.
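
Very roughly, and with the option syntax not at all final and the 'rg'
controller name invented for illustration, case (3) would be one
hierarchy with several controllers bound to it:

    mount -t container -o controller=cpuset,controller=rg container /mnt/box

while case (4), and the "one hierarchy per RG controller" usage above,
would just be separate mounts:

    mount -t container -o controller=cpuset container /dev/cpuset
    mount -t container -o controller=rg container /mnt/rgroups

with each mount instance getting its own independent tree of task
groupings.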

One of the issues that crops up with it is what do you put in
/proc/<pid>/container if there are multiple hierarchies?
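
One possibility, purely a sketch at this point, would be one line per
hierarchy, keyed by the controllers bound to it, e.g.

    $ cat /proc/self/container
    cpuset:/batch/set1
    rg:/dbserver

though naming hierarchies by controller only works if every hierarchy
has at least one controller bound to it.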

Paul