RE: [ckrm-tech] Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement

From: Marc E. Fiuczynski
Date: Sat Oct 02 2004 - 10:50:51 EST


Paul & Andrew,

For PlanetLab (www.planet-lab.org) we also care very much about isolation
between different users. Maybe not to the same degree as your users.
Nonetheless, penning in resource hogs is very important to us. We are
giving CKRM a shot. Over the past two weeks I have worked with Hubertus,
Chandra, and Shailabh to iron various bugs. The controllers appear to be
working at first approximation. From our perspective, it is not so much the
specific resource controllers but the CKRM framework that is of importance.
I.e., we certainly plan to test and implement other resource controllers for
CPU, disk I/o and memory isolation.

For cpu isolation, would it suffice to use a HTB-based cpu scheduler. This
is essentially what the XEN folks are using to ensure strong isolation
between separate Xen domains. An implementation of such a scheduler exists
as part of the linux-vserver project and the port of that to CKRM should be
straightforward. In fact, I am thinking of doing such a port for PlanetLab
just to have an alternative to the existing CKRM cpu controller. Seems like
an implementation of that scheduler (or a modification to the existing CKRM
controller) + some support for CPU affinity + hotplug CPU support might
approach your cpuset solution. Correct me if I completely missed it.

For memory isolation, I am not sufficiently familiar with NUMA style
machines to comment on this topic. The CKRM memory controller is
interesting, but we have not used it sufficiently to comment.

Finally, in terms of isolation, we have mixed together CKRM with VSERVERs.
Using CKRM for performance isolation and Vserver (for the lack of a better
name) "view" isolation. Maybe your users care about the vserver style of
islation. We have an anon cvs server with our kernel (which is based on
Fedora Core 2 1.521 + vserver 1.9.2 + the latest ckrm e16 framework and
resource controllers that are not even available yet at ckrm.sf.net), which
you are welcome to play with.

Best regards,
Marc

-----------
Marc E. Fiuczynski
PlanetLab Consortium --- OS Taskforce PM
Princeton University --- Research Scholar
http://www.cs.princeton.edu/~mef

> -----Original Message-----
> From: ckrm-tech-admin@xxxxxxxxxxxxxxxxxxxxx
> [mailto:ckrm-tech-admin@xxxxxxxxxxxxxxxxxxxxx]On Behalf Of Andrew Morton
> Sent: Friday, October 01, 2004 7:41 PM
> To: Shailabh Nagar; ckrm-tech@xxxxxxxxxxxxxxxxxxxxx
> Cc: pj@xxxxxxx; efocht@xxxxxxxxxxxx; mbligh@xxxxxxxxxxx;
> lse-tech@xxxxxxxxxxxxxxxxxxxxx; hch@xxxxxxxxxxxxx; steiner@xxxxxxx;
> jbarnes@xxxxxxx; sylvain.jeaugey@xxxxxxxx; djh@xxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; colpatch@xxxxxxxxxx; Simon.Derr@xxxxxxxx;
> ak@xxxxxxx; sivanich@xxxxxxx
> Subject: [ckrm-tech] Re: [Lse-tech] [PATCH] cpusets - big numa cpu and
> memory placement
>
>
>
> Paul, I'm having second thoughts regarding a cpusets merge. Having gone
> back and re-read the cpusets-vs-CKRM thread from mid-August, I am quite
> unconvinced that we should proceed with two orthogonal resource
> management/partitioning schemes.
>
> And CKRM is much more general than the cpu/memsets code, and hence it
> should be possible to realize your end-users requirements using an
> appropriately modified CKRM, and a suitable controller.
>
> I'd view the difficulty of implementing this as a test of the wisdom of
> CKRM's design, actually.
>
> The clearest statement of the end-user cpu and memory partitioning
> requirement is this, from Paul:
>
> > Cpusets - Static Isolation:
> >
> > The essential purpose of cpusets is to support isolating large,
> > long-running, multinode compute bound HPC (high performance
> > computing) applications or relatively independent service jobs,
> > on dedicated sets of processor and memory nodes.
> >
> > The (unobtainable) ideal of cpusets is to provide perfect
> > isolation, for such jobs as:
> >
> > 1) Massive compute jobs that might run hours or days, on dozens
> > or hundreds of processors, consuming gigabytes or terabytes
> > of main memory. These jobs are often highly parallel, and
> > carefully sized and placed to obtain maximum performance
> > on NUMA hardware, where memory placement and bandwidth is
> > critical.
> >
> > 2) Independent services for which dedicated compute resources
> > have been purchased or allocated, in units of one or more
> > CPUs and Memory Nodes, such as a web server and a DBMS
> > sharing a large system, but staying out of each others way.
> >
> > The essential new construct of cpusets is the set of dedicated
> > compute resources - some processors and memory. These sets have
> > names, permissions, an exclusion property, and can be subdivided
> > into subsets.
> >
> > The cpuset file system models a hierarchy of 'virtual computers',
> > which hierarchy will be deeper on larger systems.
> >
> > The average lifespan of a cpuset used for (1) above is probably
> > between hours and days, based on the job lifespan, though a couple
> > of system cpusets will remain in place as long as the system is
> > running. The cpusets in (2) above might have a longer lifespan;
> > you'd have to ask Simon Derr of Bull about that.
> >
>
> Now, even that is not a very good end-user requirement because it does
> prejudge the way in which the requirement's solution should be
> implemented.
> Users don't require that their NUMA machines "model a hierarchy of
> 'virtual computers'". Users require that their NUMA machines implement
> some particular behaviour for their work mix. What is that behaviour?
>
> For example, I am unable to determine from the above whether the users
> would be 90% satisfied with some close-enough ruleset which was
> implemented
> with even the existing CKRM cpu and memory governors.
>
> So anyway, I want to reopen this discussion, and throw a huge spanner in
> your works, sorry.
>
> I would ask the CKRM team to tell us whether there has been any
> progress in
> this area, whether they feel that they have a good understanding
> of the end
> user requirement, and to sketch out a design with which CKRM could satisfy
> that requirement.
>
> Thanks.
>
>
> -------------------------------------------------------
> This SF.net email is sponsored by: IT Product Guide on ITManagersJournal
> Use IT products in your business? Tell us what you think of them. Give us
> Your Opinions, Get Free ThinkGeek Gift Certificates! Click to
> find out more
> http://productguide.itmanagersjournal.com/guidepromo.tmpl
> _______________________________________________
> ckrm-tech mailing list
> https://lists.sourceforge.net/lists/listinfo/ckrm-tech

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/