Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement

From: Andrew Morton
Date: Fri Oct 01 2004 - 18:41:12 EST



Paul, I'm having second thoughts regarding a cpusets merge. Having gone
back and re-read the cpusets-vs-CKRM thread from mid-August, I am quite
unconvinced that we should proceed with two orthogonal resource
management/partitioning schemes.

And CKRM is much more general than the cpu/memsets code, and hence it
should be possible to realize your end-users requirements using an
appropriately modified CKRM, and a suitable controller.

I'd view the difficulty of implementing this as a test of the wisdom of
CKRM's design, actually.

The clearest statement of the end-user cpu and memory partitioning
requirement is this, from Paul:

> Cpusets - Static Isolation:
>
> The essential purpose of cpusets is to support isolating large,
> long-running, multinode compute bound HPC (high performance
> computing) applications or relatively independent service jobs,
> on dedicated sets of processor and memory nodes.
>
> The (unobtainable) ideal of cpusets is to provide perfect
> isolation, for such jobs as:
>
> 1) Massive compute jobs that might run hours or days, on dozens
> or hundreds of processors, consuming gigabytes or terabytes
> of main memory. These jobs are often highly parallel, and
> carefully sized and placed to obtain maximum performance
> on NUMA hardware, where memory placement and bandwidth is
> critical.
>
> 2) Independent services for which dedicated compute resources
> have been purchased or allocated, in units of one or more
> CPUs and Memory Nodes, such as a web server and a DBMS
> sharing a large system, but staying out of each others way.
>
> The essential new construct of cpusets is the set of dedicated
> compute resources - some processors and memory. These sets have
> names, permissions, an exclusion property, and can be subdivided
> into subsets.
>
> The cpuset file system models a hierarchy of 'virtual computers',
> which hierarchy will be deeper on larger systems.
>
> The average lifespan of a cpuset used for (1) above is probably
> between hours and days, based on the job lifespan, though a couple
> of system cpusets will remain in place as long as the system is
> running. The cpusets in (2) above might have a longer lifespan;
> you'd have to ask Simon Derr of Bull about that.
>

Now, even that is not a very good end-user requirement because it does
prejudge the way in which the requirement's solution should be implemented.
Users don't require that their NUMA machines "model a hierarchy of
'virtual computers'". Users require that their NUMA machines implement
some particular behaviour for their work mix. What is that behaviour?

For example, I am unable to determine from the above whether the users
would be 90% satisfied with some close-enough ruleset which was implemented
with even the existing CKRM cpu and memory governors.

So anyway, I want to reopen this discussion, and throw a huge spanner in
your works, sorry.

I would ask the CKRM team to tell us whether there has been any progress in
this area, whether they feel that they have a good understanding of the end
user requirement, and to sketch out a design with which CKRM could satisfy
that requirement.

Thanks.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/