Re: cgroup: status-quo and userland efforts

From: Serge Hallyn
Date: Thu Jun 27 2013 - 09:22:24 EST


Quoting Mike Galbraith (bitbucket@xxxxxxxxx):
> On Wed, 2013-06-26 at 14:20 -0700, Tejun Heo wrote:
> > Hello, Tim.
> >
> > On Mon, Jun 24, 2013 at 09:07:47PM -0700, Tim Hockin wrote:
> > > I really want to understand why this is SO IMPORTANT that you have to
> > > break userspace compatibility? I mean, isn't Linux supposed to be the
> > > OS with the stable kernel interface? I've seen Linus rant time and
> > > time again about this - why is it OK now?
> >
> > What the hell are you talking about? Nobody is breaking userland
> > interface. A new version of interface is being phased in and the old
> > one will stay there for the foreseeable future. It will be phased out
> > eventually but that's gonna take a long time and it will have to be
> > something hardly noticeable. Of course new features will only be
> > available with the new interface and there will be efforts to nudge
> > people away from the old one but the existing interface will keep
> > working as it does.
>
> I can understand some alarm. When I saw the below I started frothing at
> the face and howling at the moon, and I don't even use the things much.
>
> http://lists.freedesktop.org/archives/systemd-devel/2013-June/011521.html
>
> Hierarchy layout aside, that "private property" bit says that the folks
> who currently own and use the cgroups interface will lose direct access
> to it. I can imagine folks who have become dependent upon on-the-fly
> management agents of their own design becoming a tad alarmed.

FWIW, the code is still too embarrassing to see daylight, but I'm playing
with a very low-level cgroup manager which supports nesting itself.
Access in this POC is low-level ("set freezer.state to THAWED for cgroup
/c1/c2", "Create /c3"), but the key feature is that it can run in two
modes - native mode in which it uses cgroupfs, and child mode where it
talks to a parent manager to make the changes.

So then the idea would be that userspace (like libvirt and lxc) would
talk over /dev/cgroup to its manager. Userspace inside a container
(which can't actually mount cgroups itself) would talk to its own
manager which is talking over a passed-in socket to the host manager,
which in turn runs natively (uses cgroupfs, and nests "create /c1" under
the requestor's cgroup).
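
To make the two modes concrete, here is a rough sketch in Python. It is
not the POC code; the class names, method names and the nest() helper
are all made up for illustration, and the passed-in socket is replaced
by a plain object reference. It only assumes that native mode maps
"create" to mkdir and "set" to a file write under a cgroupfs mount.

```python
import os
import posixpath


def nest(base, requested):
    """Resolve `requested` under `base`, refusing any escape via '..'."""
    base = posixpath.normpath(base)
    path = posixpath.normpath(posixpath.join(base, requested.lstrip("/")))
    if path != base and not path.startswith(base.rstrip("/") + "/"):
        raise ValueError("request escapes %s: %s" % (base, requested))
    return path


class NativeManager:
    """Native mode: applies requests directly to a cgroupfs tree."""

    def __init__(self, root):
        self.root = root  # mount point of one hierarchy (assumption)

    def create(self, requestor_cgroup, name):
        # Nest the new cgroup under the requestor's own cgroup.
        base = nest(self.root, requestor_cgroup)
        os.makedirs(nest(base, name), exist_ok=True)

    def set_value(self, requestor_cgroup, cgroup, key, value):
        base = nest(self.root, requestor_cgroup)
        with open(os.path.join(nest(base, cgroup), key), "w") as f:
            f.write(value)


class ChildManager:
    """Child mode: forwards every request to a parent manager."""

    def __init__(self, parent, own_cgroup):
        self.parent = parent          # stands in for the passed-in socket
        self.own_cgroup = own_cgroup  # where the parent placed this manager

    def create(self, requestor_cgroup, name):
        self.parent.create(nest(self.own_cgroup, requestor_cgroup), name)

    def set_value(self, requestor_cgroup, cgroup, key, value):
        self.parent.set_value(
            nest(self.own_cgroup, requestor_cgroup), cgroup, key, value)
```

With a host NativeManager and a ChildManager created with own_cgroup
"/c1", a "Create /c3" from inside the container ends up as <root>/c1/c3
on the host, and a request containing ".." is rejected before it can
climb out of the requestor's subtree.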

At some point (probably soon) we might want to talk about a standard API
for these things. However, I think it will have to come in the form of
a standard library, which knows to send requests either over dbus to
systemd or over the /dev/cgroup socket to the manager.
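
The backend choice such a library would have to make could look roughly
like the Python below. This is a sketch under assumptions: the function
names and return values are hypothetical, preferring the socket over
systemd is an arbitrary choice, and no request format is shown since
none has been defined. The systemd check uses the existence of
/run/systemd/system, which is the same test sd_booted(3) performs.

```python
import os
import stat

# Path taken from the message above; whether it ends up being a socket
# at exactly this location is an assumption.
CGROUP_SOCKET = "/dev/cgroup"


def is_socket(path):
    """Return True if `path` exists and is a UNIX socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        return False


def pick_backend(socket_path=CGROUP_SOCKET, systemd_available=None):
    """Return "manager", "systemd", or None if neither is available."""
    if is_socket(socket_path):
        return "manager"
    if systemd_available is None:
        # Same directory test that sd_booted(3) uses.
        systemd_available = os.path.isdir("/run/systemd/system")
    return "systemd" if systemd_available else None
```

Callers such as libvirt or lxc would then only see the library's own
calls and never need to know which backend answered.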

-serge