Re: What can OpenVZ do?

From: Andrew Morton
Date: Thu Feb 12 2009 - 17:11:00 EST


On Thu, 12 Feb 2009 13:51:23 -0800
Dave Hansen <dave@xxxxxxxxxxxxxxxxxx> wrote:

> On Thu, 2009-02-12 at 11:42 -0800, Andrew Morton wrote:
> > On Thu, 12 Feb 2009 13:30:35 -0600
> > Matt Mackall <mpm@xxxxxxxxxxx> wrote:
> >
> > > On Thu, 2009-02-12 at 10:11 -0800, Dave Hansen wrote:
> > >
> > > > > - In bullet-point form, what features are missing, and should be added?
> > > >
> > > > * support for more architectures than i386
> > > > * file descriptors:
> > > >   * sockets (network, AF_UNIX, etc...)
> > > >   * device files
> > > >   * shmfs, hugetlbfs
> > > >   * epoll
> > > >   * unlinked files
> > >
> > > > * Filesystem state
> > > >   * contents of files
> > > >   * mount tree for individual processes
> > > > * flock
> > > > * threads and sessions
> > > > * CPU and NUMA affinity
> > > > * sys_remap_file_pages()
> > >
> > > I think the real question is: where are the dragons hiding? Some of
> > > these are known to be hard. And some of them are critical for checkpointing
> > > typical applications. If you have plans or theories for implementing all
> > > of the above, then great. But this list doesn't really give any sense of
> > > whether we should be scared of what lurks behind those doors.
> >
> > How close has OpenVZ come to implementing all of this? I think the
> > implementation is fairly complete?
>
> I also believe it is "fairly complete". At least able to be used
> practically.
>
> > If so, perhaps that can be used as a guide. Will the planned feature
> > have a similar design? If not, how will it differ? To what extent can
> > we use that implementation as a tool for understanding what this new
> > implementation will look like?
>
> Yes, we can certainly use it as a guide. However, there are some
> barriers to being able to do that:
>
> dave@nimitz:~/kernels/linux-2.6-openvz$ git diff v2.6.27.10... | diffstat | tail -1
> 628 files changed, 59597 insertions(+), 2927 deletions(-)
> dave@nimitz:~/kernels/linux-2.6-openvz$ git diff v2.6.27.10... | wc
> 84887 290855 2308745
>
> Unfortunately, the git tree doesn't have that great of a history. It
> appears that the forward-ports are just applications of huge single
> patches which then get committed into git. This tree has also
> historically contained a bunch of stuff not directly related to
> checkpoint/restart like resource management.
>
> We'd be idiots not to take a hard look at what has been done in OpenVZ.
> But, for the time being, we have absolutely no shortage of things that
> we know are important and know have to be done. Our largest problem is
> not finding things to do, but is our large out-of-tree patch that is
> growing by the day. :(
>

Well, we have a chicken-and-eggish thing. The patchset will keep
growing until we understand how much of this:

> dave@nimitz:~/kernels/linux-2.6-openvz$ git diff v2.6.27.10... | diffstat | tail -1
> 628 files changed, 59597 insertions(+), 2927 deletions(-)

we would be committed to if we were to merge the current patchset.


Now, we've gone in blind before - most notably on the
containers/cgroups/namespaces stuff. That hail mary pass worked out
acceptably, I think. Maybe we got lucky. I thought that
net-namespaces in particular would never get there, but it did.

That was a very large and quite long-term-important user-visible
feature.

checkpoint/restart/migration is also a long-term-...-feature. But if
at all possible I do think that we should go into it with our eyes a
little less shut.

Interestingly, there was also prior-art for
containers/cgroups/namespaces within OpenVZ. But we decided up-front
(I think) that the eventual implementation would have little in common
with preceding implementations.


Oh, and I'd disagree with your new Subject:. It's pretty easy to find
out what OpenVZ can do. The more important question here is "how much
of a mess did it make when it did it?"