Re: [ckrm-tech] Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement
From: Shailabh Nagar
Date: Wed Aug 11 2004 - 10:07:15 EST
Erich Focht wrote:
On Wednesday 11 August 2004 00:38, Shailabh Nagar wrote:
Metrics, transactions, tasks, and resource
decisions all have to be tracked or managed by Class.
These Classes form a fairly shallow hierarchy of usage levels or
service qualities, as perceived by the end users of the system.
I'd guess that the average lifetime of a Class is months or years,
as they can reflect the relative priority of relations with long
standing, external customers.
Cpusets and CKRM have profoundly different purposes, economics and
motivations.
I would say the methods differ, not the purpose. Both are trying to
performance-isolate groups of tasks - one uses the spatial dimension of
cpu bindings, the other uses the temporal dimension of cpu time.
So the purpose is different, too. With your words: spatial versus
temporal separation. They are orthogonal.
By purpose, I meant "performance isolation". Method used is spatial
vs. temporal. But I guess that's just quibbling over words. The
approaches are certainly orthogonal.
Also, cpusets have a purpose beyond isolation and that is
optimization. One might want to restrict tasks/apps to a NUMA node for
reducing avg mem latency - this is completely beyond CKRM's scope.
In physics terms: you need
both to describe the universe and you cannot transform the one into
the other. Both make sense, they can be combined to give more benefit
(aehm, control).
On machines with a fairly large number of cpus, this is true. cpusets
would partition a machine and CKRM would operate within each partition.
But it's less clear whether both CKRM and cpuset approaches can be
simultaneously used, profitably, on a smaller SMP if one is primarily
interested in isolation.
Partitioning the cpus with cpusets does offer harder guarantees,
replicable isolation etc. but also runs the risk of underutilization.
If the user primarily wants to give 20% to one App, 40% to another, he
does have to make that call: go with cpusets, which offer better
guarantees but could waste cpus, or create ckrm classes, which also
offer this functionality but run the risk of weaker control depending
on other applications' load?
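
To make that trade-off concrete, here is a rough toy model, not CKRM's
scheduler or the cpusets code. Everything in it is an assumption made
for illustration: a 10-cpu box, a hypothetical third app "C" taking the
remaining 40%, the demand figures, and the function names. It compares
a hard spatial partition with a work-conserving (soft) share scheme.

```python
def cpuset_usage(demand, cpus_owned):
    """Hard spatial partition: an app never runs outside its own cpus."""
    return {app: min(demand[app], cpus_owned[app]) for app in demand}


def soft_share_usage(demand, share, ncpus):
    """Work-conserving shares: entitlement an app does not use is
    redistributed to the still-hungry apps in proportion to their shares."""
    usage = {app: 0.0 for app in demand}
    active = set(demand)
    spare = float(ncpus)
    while active and spare > 1e-9:
        total = sum(share[a] for a in active)
        grant = {a: spare * share[a] / total for a in active}
        done = {a for a in active if demand[a] <= grant[a] + 1e-9}
        if not done:
            usage.update(grant)  # everyone is hungry: split what is left
            break
        for a in done:
            usage[a] = demand[a]
            spare -= demand[a]
        active -= done
    return usage


# Hypothetical load, in units of "cpus worth of work".
demand = {"A": 0.5, "B": 6.0, "C": 4.0}
hard = cpuset_usage(demand, {"A": 2, "B": 4, "C": 4})
soft = soft_share_usage(demand, {"A": 20, "B": 40, "C": 40}, ncpus=10)
print("cpusets:", hard, "idle cpus:", 10 - sum(hard.values()))
print("soft shares:", soft, "idle cpus:", 10 - sum(soft.values()))
```

With these made-up numbers the partitioned box leaves about 1.5 cpus
idle while B is starved, whereas the soft-share model hands A's unused
entitlement to B. The price is that B's extra time disappears as soon
as A gets busy, which is the weaker control mentioned above.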
To further complicate that choice, CKRM's design does provide for
implementation of hard vs. soft limits where hard limits would provide
the stronger guarantees that a user might want.
The CKRM CPU controller, in particular, is close (~ two weeks to
availability) to providing an implementation of hard limits which would
offer stronger guarantees along the temporal dimension.
The other point of difference is the one you'd brought up earlier - the
restrictions on the hierarchy creation. CKRM has none (effectively),
cpusets has many.
Don't know how it's exactly implemented, but the restrictions should
not be at hierarchy creation time (i.e. when creating the class
(cpusets) subdirectory). They should be imposed when setting/changing
the attributes.
True - I was lumping the "create cpuset + set its cpu ownership
values" into the hierarchy creation. But the point made still holds:
good, CKRM has no controller-defined restrictions on changing
attributes, cpusets does.
Writing illegal values to the virtual attribute files
must simply fail. And each resource controller knows best what it
allows for and what not, this shouldn't be a task of the
infrastructure (CKRM).
Yes, this makes sense.
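
A minimal sketch of that division of labour, with all names
hypothetical (CpusetController, write_attribute and the "cpus"
attribute are illustrative, not the CKRM or cpusets code). The only
cpuset rule modelled is that a child's cpus must be a subset of its
parent's. The infrastructure merely routes the write; the controller
decides what is legal and rejects anything else.

```python
class CpusetController:
    """Toy controller: it alone knows which values are legal."""

    def __init__(self, parent_cpus):
        self.parent_cpus = set(parent_cpus)
        self.cpus = set()

    def write_cpus(self, text):
        # int() raises ValueError on garbage, so malformed input fails too.
        requested = {int(tok) for tok in text.split(",") if tok.strip()}
        if not requested <= self.parent_cpus:
            raise ValueError("cpus must be a subset of the parent's cpus")
        self.cpus = requested


def write_attribute(controller, name, text):
    """Toy infrastructure: route the write, attach no semantics of its own."""
    handler = getattr(controller, "write_" + name, None)
    if handler is None:
        raise FileNotFoundError(name)
    handler(text)


ctl = CpusetController(parent_cpus=[0, 1, 2, 3])
write_attribute(ctl, "cpus", "0,1")      # accepted
try:
    write_attribute(ctl, "cpus", "2,7")  # cpu 7 is not in the parent
except ValueError as err:
    print("write failed:", err)
```

Note that nothing is checked when the controller object (the
"directory") is created; the restriction bites only when the attribute
is written, and a failed write leaves the old value in place.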
As CKRM's interface stands today, there are sufficient differences
between the interfaces to keep them separate.
However, if CKRM moves to a model where
- each controller is allowed to define its own virtual files and attributes
- each controller has its own hierarchy (and hence more control over
how it can be formed),
then the similarities will be too many to ignore merger possibilities
altogether.
The kicker is, we've not decided. The splitting of controllers into
their own hierarchy is something we're considering independently (as a
consequence of Linus' suggestion at KS04). But making the interface
completely per-controller is something we can do, without too much
effort, IF there is sufficient reason (we have other reasons for doing
that as well - see recent postings on ckrm-tech).
Having controller specifics less hidden is good because usage becomes
more intuitive and you don't have to RTFM (controller specific manuals
would have to be written, too). One file per attribute is also nicer
than several attributes hidden in a shares file. Adding an attribute
means adding a file, it doesn't break the old interface, so this is
easier to maintain. And, as you mentioned, some files in the current
CKRM interface just don't make sense for some resources. But a sane
ruleset provided by CKRM for external controllers should be
there. For example something like:
- Class members are added by writing to the virtual file "target".
- Class members are listed by reading the virtual file "target" and
the format is ...
- Each class attribute should be controlled by one file named
appropriately. Etc...
- Members of a class can register a callback which will be invoked
when following events occur:
- the class is destroyed
- ... ?
- etc ...
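
The ruleset above could be modelled roughly as follows. This is only a
sketch: the class name, the "cpu_share" attribute, the pid-per-line
listing format and the on_destroy() method are placeholders picked for
illustration, since the rules above leave the listing format and the
full set of events open.

```python
class ResourceClass:
    """Toy class directory: a 'target' file plus one file per attribute."""

    def __init__(self, name, attribute_names):
        self.name = name
        self.members = []
        self.attributes = {attr: "" for attr in attribute_names}
        self._destroy_callbacks = []

    def write(self, filename, text):
        if filename == "target":
            self.members.append(int(text))       # add one member (a pid)
        elif filename in self.attributes:
            self.attributes[filename] = text.strip()
        else:
            raise FileNotFoundError(filename)

    def read(self, filename):
        if filename == "target":
            # Placeholder format: one pid per line.
            return "\n".join(str(pid) for pid in self.members)
        return self.attributes[filename]

    def on_destroy(self, callback):
        self._destroy_callbacks.append(callback)

    def destroy(self):
        for callback in self._destroy_callbacks:
            callback(self)


gold = ResourceClass("gold", ["cpu_share"])
gold.write("target", "1234")
gold.write("cpu_share", "40")
gold.on_destroy(lambda cls: print("class destroyed:", cls.name))
print(gold.read("target"), gold.read("cpu_share"))
gold.destroy()
```

Adding an attribute here means adding one more name to the list, which
leaves existing files and their readers untouched.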
One file per attribute is an excellent idea and the slight additional
overhead won't matter since attribute changes are rarely in the
critical path. Will follow up on this on ckrm-tech (which is cc'ed).
We'll still need to keep statistics grouped as far as possible because
the overhead of reading several files vs. one will matter.
Interest/recommendations from the community that cpusets be part of
CKRM's hierarchy would certainly be a factor in that decision.
I'd prefer a single entry point for resource management with
consistent (not necessarily same) and easy to use user interfaces for
all resources.
Regards,
Erich
P.S. I've pruned some of the names on the cc: list who are obviously
subscribed to one or the other lists (mailman on sf keeps complaining
if the cc list is too long). I can be dropped from the cc: too if this
thread continues...