First, thank you, Hubertus, for comparing me to a puppy, rather
than a kitten. I am definitely a dog person, not a cat person,
and I appreciate your considerate choice of analog.
I gather from the tone of your post yesterday that there is
a disconnect between us - you speak with the frustration of
someone who has been shouting into the wind and not being
heard.
I suspect that the disconnect, if such be, is not where you
think it is:
Hubertus wrote:
The disconnect is that you do not want to recognize that CKRM does NOT
have to be systemwide. Once you open your mind to the fact that CKRM
can be deployed with in a subset of disconnected resources (cpu domains)
and manages shares independently within that domain, I truely don't see
what the problem is.
I have recognized for months that eventually we'd want to allow
for cpuset-relative CKRM domains, and I'm pretty sure I've
dropped comments to that effect one time or another here on lkml.
I suspect instead that "CKRM" is one layer more abstract than
I am normally comfortable with.
As best as I can tell, CKRM has evolved from its origins as a
fair share scheduler, into a framework (*) for things called by
such names as classes and controllers. As you may recall from
an inconclusive thread between us on the ckrm-tech email list two
months ago, I find those terms uncomfortably vague and abstract.
In general, frameworks are high-risk business. What they
gain in generality, covering a wider range of situations in
a uniform pattern, they lose in down-to-earth concreteness,
leaving their users less confident of what works, and less able
to rely on their intuitions. The risk of serious design flaws,
shrouded for a long time in the fog of abstraction, is higher.
The more successful frameworks, such as vfs for example,
typically have deep roots in prior art, and a sizable population
of journeyman and master practitioners.
CKRM is young, its roots more shallow, and the population of
its practitioners small.
(*) P.S. - It's more like CKRM is now the combination of
a virtual resource manager framework and a particular
instance of such (the fair share controllers that have
their conceptual origins in IBM's WLM, I suspect). If
numa placement controllers (aka cpusets) are going to
exist as well, then CKRM needs to split into (1) a
virtual resource manager framework (vrm), and (2) the
fair share stuff. The vrm framework should be free of
either fair share or numa placement bias.
===
Putting aside for a moment my personal frustrations (which
are after all my problem - and my dogs) I am simply unable to
make sense yet of how deep would be the hit on the capabilities
of cpusets, if so morphed, and I am painfully aware of the
undetermined schedule delays and increased risks to product
performance and even ultimate success that attend such a change.
From what my field engineers tell me, whom I've been polling
furiously on this matter the last few days, at least in the
markets that SGI frequents, there is very little overlap between
system configurations which benefit from fair share resource
management and those which benefit from numa placement resource
management. So, if that experience is generally applicable, we
are at risk of marrying a helicopter and a boat, just because
both have a motor and a hull, to the detriment of both.
But by self admission, you are driven by timing constraints as
Merging projects always has risks. The payoff for synergies
gained is not always greater than the cost of the inefficiencies
and compromises introduced, and the less immediate involvement
of the participants in the end result.
I cannot in good conscience recommend such a change.
Keep talking.