Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement

From: Hubertus Franke
Date: Mon Aug 09 2004 - 11:09:02 EST


Please add ckrm-tech@xxxxxxxxxxxxxxxxxxxxx if CKRM issues are requested.

See further comments to this thread below.

-- Hubertus

Shailabh Nagar wrote:

Erich Focht wrote:

On Saturday 07 August 2004 08:10, Paul Jackson wrote:

Cpusets are a complex resource which needs to be managed. You already
provided an interface for management but on the horizon there is this
CKRM thing... I really don't care too much about the interface as long
as it is comfortable (advocating for your bitset manipulation routines
here ;-). CKRM will some day come in and maybe try to unify the
resource control through a generalized interface. In my understanding,
CKRM "classes" are (for the cpusets resource) your "sets". I was
trying to anticipate that CKRM might want to present the single entry
point for managing resources, including cpusets.


That is the intended utility of the CKRM core+interface, at least for any resource for which it is useful to impose controls on a group of objects at once, as opposed to individually.


If I understand correctly, CKRM is fine for simple resources like
amount of memory or cputime and designed to control flexible sharing
of these resources and ensure some degree of fairness. Cpusets is a
complex NUMA specific compound resource which actually only allows for
a rather static distribution across processes (especially with the
exclusive bits set). Including cpusets control into CKRM will be
trivial, because you already provide all that's needed.


If we move to the new model where each controller has an independent hierarchy, this becomes a real possibility. We'd still need to negotiate on the interface. Implementationally it's pretty simple... the main question is: should there be some uniformity in the interfaces at the /rcfs/<?> level for each controller or not? If there isn't, the only thing that CKRM brings to the table (for cpusets) is the filesystem.


What I proposed was to include cpusets ASAP. As we learned from
Hubertus, CKRM is undergoing some redesign (after the kernel summit),
so let's now get used to cpusets and forget about the generic resource
controller until that is mature to enter the kernel.


Let's look where the restructuring is conceptually heading.
As indicated by Shailabh above (and requested at the kernel summit),
the resource controllers are becoming external entities in that they
will be addressed directly through /rcfs/<rc>/<class-hierarchy>,
rather than indirectly through their association with the classtypes
right now.

In essence, the /rcfs interface can be used if a strict hierarchy can be
generated in the class hierarchy for a given resource.
Furthermore, each resource controller manipulates a set of attributes and constraints. Today we are talking about shares (min, max, guarantee).
There is no reason why these attributes/constraints cannot be resource controller specific. For instance, for cpusets, the attribute would be "cpus_allowed" and the controller would verify its own constraints,
such as cpus_allowed having to be a subset of its parent's cpus.
Whether at this point "shares" is still the right filename is debatable.
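
To make the constraint concrete, here is a minimal sketch of the check a cpusets controller might run when cpus_allowed is written. It is not code from the cpusets or CKRM patches: cpu_mask_t, struct cpuset_class, mask_is_subset() and cpuset_class_set_cpus() are hypothetical names, and a single 64-bit word stands in for a real kernel cpumask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpu_mask_t;

/* Hypothetical class node in a per-controller class hierarchy. */
struct cpuset_class {
    const struct cpuset_class *parent;  /* NULL for the root class */
    cpu_mask_t cpus_allowed;
};

/* True when every CPU set in 'child' is also set in 'parent'. */
static bool mask_is_subset(cpu_mask_t child, cpu_mask_t parent)
{
    return (child & ~parent) == 0;
}

/*
 * Controller-side check: accept 'requested' only if it is a subset of
 * the parent's cpus_allowed. The root class has no parent and accepts
 * any mask. Returns 0 on success, -1 if the constraint is violated.
 * Re-validating existing children after a parent shrinks is left out
 * of this sketch.
 */
static int cpuset_class_set_cpus(struct cpuset_class *cls, cpu_mask_t requested)
{
    assert(cls != NULL);

    if (cls->parent != NULL &&
        !mask_is_subset(requested, cls->parent->cpus_allowed))
        return -1;

    cls->cpus_allowed = requested;
    return 0;
}
```

With a root class allowing CPUs 0-7, a child asking for CPUs 0-1 would be accepted, while a child asking for CPU 8 would be rejected and keep its previous mask.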


Might? :-) We think it's a home run :-)

and the
cpusets user interface will be yet another filesystem for controlling
some hierarchical structures... The complaints about the huge size of
the patch should therefore have in mind that we might well get rid of
the user interface part of it. The core infrastructure of cpusets will
be needed anyway and the amount of code is the absolutely required
minimum, IMHO.



The other reason that this suggestion worries me is a bit more
philosophical. I'm sure that for all the other, well known,
resources that CKRM manages, no one is proposing replacing whatever
existing names and mechanisms exist for those resources, such as
bandwidth, compute cycles, memory, ... Rather I presume that CKRM
provides an additional resource management layer on top of the
existing resources, which retain their classic names and apparatus.
[...]



I hope cpusets will be an "existing resource" when CKRM comes into
play. It's a compound resource built of cpus and memories (and the
name cpuset is a bit misleading) but it fully makes sense on a NUMA
machine to have these two elementary resources glued together. If CKRM
was to build a resource controller for cpu masks and memories, or two
separate resource controllers, the really acceptable end result would
look like the current cpusets infrastructure. So why waste time?

Later cpusets could borrow the user interface of CKRM or, if the
cpusets user interface is better suited, maybe we can just have a
/rcfs/cpusets/ directory tree with the current cpusets look and feel?
Question to CKRM people: would it make sense to have a class with
another way of control than the shares/targets/members files?

See above.. I think if we relax the fixed attributes that currently
exist for "shares" and "stats" into something where the attribute
names are verified and interpreted by the resource controller, then
that's effectively what you suggest here.



Need to mull this over in ckrm-tech, as mentioned earlier.
There are two issues:
- should controllers be allowed to create their own virtual files ?
- are all of the existing shares/targets/members files sufficiently useful to existing and future controllers to make them available by default (and offer the user some consistency) ?

I feel the answer to the second one is a yes though I'm not convinced that the attributes within the shares file need to be the same.

But saying yes to the first one will mean controllers have to implement some filesystem-related code (as is done by CKRM's Classification Engine modules, which also sit under /rcfs but have a completely different interface in terms of virtual files). We could work something out where controllers could use common code where available and then roll their own extras.

I don't think we need to worry about the file system here (yet).
rcfs takes care of the class object hierarchy and passes (as done today
in other cases) its attribute-setting strings down to the resource controllers. We won't, however, have to do the parsing at the /rcfs level.
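
As a rough illustration of that split, the sketch below has an rcfs-like layer hand the raw string to the controller, which alone decides which attribute names it accepts. Everything here is an assumption made for illustration: struct rc_controller, rcfs_write_attr(), cpuset_set_attr() and the "cpus_allowed=<hex mask>" syntax are hypothetical and are not taken from the CKRM or cpusets code.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-controller hook: receives the unparsed attribute text. */
struct rc_controller {
    const char *name;
    int (*set_attr)(void *class_state, const char *text);
};

/* Hypothetical per-class state kept by a cpusets controller. */
struct cpuset_state {
    uint64_t cpus_allowed;
};

/* Controller side: verifies the attribute name and interprets the value. */
static int cpuset_set_attr(void *class_state, const char *text)
{
    static const char key[] = "cpus_allowed=";
    struct cpuset_state *st = class_state;
    unsigned long long mask;
    char *end;

    if (strncmp(text, key, sizeof(key) - 1) != 0)
        return -EINVAL;                 /* attribute name not known here */
    text += sizeof(key) - 1;

    if (!isxdigit((unsigned char)*text))
        return -EINVAL;                 /* empty or non-hex value */

    errno = 0;
    mask = strtoull(text, &end, 16);
    if (errno != 0 || *end != '\0')
        return -EINVAL;                 /* overflow or trailing junk */

    st->cpus_allowed = mask;
    return 0;
}

static const struct rc_controller cpuset_controller = {
    "cpusets", cpuset_set_attr
};

/* rcfs side: no parsing, just pass the string down to the controller. */
static int rcfs_write_attr(const struct rc_controller *rc,
                           void *class_state, const char *text)
{
    assert(rc != NULL && rc->set_attr != NULL);
    return rc->set_attr(class_state, text);
}
```

In this arrangement a share-style string such as "guarantee=10" written to a cpusets class would simply be rejected by the controller, without rcfs needing to know either vocabulary.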


If there's interest in this idea from the cpusets team and if we can come up with a way in which cpu/mem/io etc. could continue to share common rcfs code (as they do today) CKRM could consider this option.

-- Shailabh



