Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server

From: Avi Kivity
Date: Wed Sep 16 2009 - 17:01:32 EST


On 09/16/2009 10:22 PM, Gregory Haskins wrote:
> Avi Kivity wrote:
>> On 09/16/2009 05:10 PM, Gregory Haskins wrote:
>>>> If kvm can do it, others can.
>>> The problem is that you seem to either hand-wave over details like this,
>>> or you give details that are pretty much exactly what vbus does already.
>>> My point is that I've already sat down and thought about these issues
>>> and solved them in a freely available GPL'ed software package.
>> In the kernel. IMO that's the wrong place for it.
> 3) "in-kernel": You can do something like virtio-net to vhost to
> potentially meet some of the requirements, but not all.
>
> In order to fully meet (3), you would need to do some of that stuff you
> mentioned in the last reply with muxing device-nr/reg-nr. In addition,
> we need to have a facility for mapping eventfds and establishing a
> signaling mechanism (like PIO+qid), etc. KVM does this with
> IRQFD/IOEVENTFD, but we don't have KVM in this case so it needs to be
> invented.

irqfd/eventfd is the abstraction layer, it doesn't need to be reabstracted.
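(For reference, a rough sketch of what that abstraction looks like from userspace, via the KVM_IOEVENTFD and KVM_IRQFD ioctls: a guest PIO address and a guest interrupt line are each bound to an ordinary eventfd, so an in-kernel backend never sees anything KVM-specific. vm_fd, the notify address, and the GSI are placeholders; most error handling is omitted.)

/*
 * Sketch only: bind a guest PIO address and a guest interrupt (GSI)
 * to eventfds through KVM.  A kernel backend such as vhost then only
 * waits on / signals plain eventfds.
 */
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int wire_queue(int vm_fd, __u64 notify_pio, __u32 gsi,
                      int *kick_fd, int *call_fd)
{
        struct kvm_ioeventfd io = { 0 };
        struct kvm_irqfd irq = { 0 };

        *kick_fd = eventfd(0, 0);       /* guest kick -> host backend */
        *call_fd = eventfd(0, 0);       /* host backend -> guest irq  */

        /* A 2-byte guest write to notify_pio signals kick_fd in-kernel,
         * with no exit to userspace. */
        io.addr  = notify_pio;
        io.len   = 2;
        io.fd    = *kick_fd;
        io.flags = KVM_IOEVENTFD_FLAG_PIO;
        if (ioctl(vm_fd, KVM_IOEVENTFD, &io) < 0)
                return -1;

        /* Signalling call_fd injects an interrupt on the given GSI. */
        irq.fd  = *call_fd;
        irq.gsi = gsi;
        if (ioctl(vm_fd, KVM_IRQFD, &irq) < 0)
                return -1;

        return 0;
}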

> To meet performance, this stuff has to be in kernel and there has to be
> a way to manage it.

and management belongs in userspace.

> Since vbus was designed to do exactly that, this is
> what I would advocate. You could also reinvent these concepts and put
> your own mux and mapping code in place, in addition to all the other
> stuff that vbus does. But I am not clear why anyone would want to.

Maybe they like their backward compatibility and Windows support.

> So no, the kernel is not the wrong place for it. It's the _only_ place
> for it. Otherwise, just use (1) and be done with it.


I'm talking about the config stuff, not the data path.

>> Further, if we adopt
>> vbus, we drop compatibility with existing guests or have to support both
>> vbus and virtio-pci.
> We already need to support both (at least to support Ira). virtio-pci
> doesn't work here. Something else (vbus, or vbus-like) is needed.

virtio-ira.

>>> So the question is: is your position that vbus is all wrong and you wish
>>> to create a new bus-like thing to solve the problem?
>> I don't intend to create anything new, I am satisfied with virtio. If
>> it works for Ira, excellent. If not, too bad.
> I think that about sums it up, then.

Yes. I'm all for reusing virtio, but I'm not going to switch to vbus or support both for this esoteric use case.

>>> If so, how is it
>>> different from what I've already done? More importantly, what specific
>>> objections do you have to what I've done, as perhaps they can be fixed
>>> instead of starting over?
>> The two biggest objections are:
>> - the host side is in the kernel
> As it needs to be.

vhost-net somehow manages to work without the config stuff in the kernel.
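(Concretely, and only as a sketch using the ioctl names from the vhost interface as it was eventually merged: the userspace device model keeps all feature and config-space negotiation, and hands the kernel just the ring wiring, two eventfds per queue, and a backend fd. tap_fd and the eventfds are placeholders; memory-table and ring-address setup, plus error handling, are elided.)

#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

/* Minimal vhost-net bring-up for one queue; config space, feature bits
 * and everything the guest sees stay in the userspace device model. */
static int start_vhost_queue(unsigned int idx, int tap_fd,
                             int kick_fd, int call_fd)
{
        int vhost = open("/dev/vhost-net", O_RDWR);
        struct vhost_vring_file kick    = { .index = idx, .fd = kick_fd };
        struct vhost_vring_file call    = { .index = idx, .fd = call_fd };
        struct vhost_vring_file backend = { .index = idx, .fd = tap_fd  };

        ioctl(vhost, VHOST_SET_OWNER, 0);
        /* VHOST_SET_MEM_TABLE and VHOST_SET_VRING_ADDR would go here,
         * passing down what userspace already negotiated with the guest. */
        ioctl(vhost, VHOST_SET_VRING_KICK, &kick);      /* guest -> host */
        ioctl(vhost, VHOST_SET_VRING_CALL, &call);      /* host -> guest */
        ioctl(vhost, VHOST_NET_SET_BACKEND, &backend);  /* attach tap    */
        return vhost;
}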

> With all due respect, based on all of your comments in aggregate I
> really do not think you are truly grasping what I am actually building here.

Thanks.



> Bingo. So now it's a question of do you want to write this layer from
> scratch, or re-use my framework.

You will have to implement a connector or whatever for vbus as well.
vbus has more layers, so the connector is probably smaller for vbus.
> Bingo!

(addictive, isn't it)

> That is precisely the point.
>
> All the stuff for how to map eventfds, handle signal mitigation, demux
> device/function pointers, isolation, etc., is built in. All the
> connector has to do is transport the 4-6 verbs and provide a memory
> mapping/copy function, and the rest is reusable. The device models
> would then work in all environments unmodified, and likewise the
> connectors could use all device-models unmodified.
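(To make the quoted claim concrete, and emphatically not the real vbus API: an invented shape for such a connector might be an ops table of a few verbs plus guest-memory accessors, with everything else living in the shared framework.)

#include <linux/types.h>

/* Hypothetical connector ops; all names invented for illustration only. */
struct example_connector_ops {
        int (*dev_add)(__u64 devid, const char *type);   /* announce a device */
        int (*dev_drop)(__u64 devid);                     /* remove a device   */
        int (*call)(__u64 devid, __u32 func,
                    void *data, __u32 len);               /* synchronous verb  */
        int (*shm_signal)(__u64 devid, __u32 shmid);      /* async doorbell    */
        int (*copy_to)(void *guest_dst, const void *src, __u32 len);
        int (*copy_from)(void *dst, const void *guest_src, __u32 len);
};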

Well, virtio has a similar abstraction on the guest side. The host side abstraction is limited to signalling since all configuration is in userspace. vhost-net ought to work for lguest and s390 without change.
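(That guest-side abstraction, abbreviated from include/linux/virtio_config.h in kernels of roughly this vintage; each transport, whether virtio-pci, lguest or s390, supplies one of these and the virtio drivers above it never change. Some members omitted.)

struct virtio_config_ops {
        void (*get)(struct virtio_device *vdev, unsigned offset,
                    void *buf, unsigned len);             /* read config space  */
        void (*set)(struct virtio_device *vdev, unsigned offset,
                    const void *buf, unsigned len);       /* write config space */
        u8   (*get_status)(struct virtio_device *vdev);
        void (*set_status)(struct virtio_device *vdev, u8 status);
        void (*reset)(struct virtio_device *vdev);
        int  (*find_vqs)(struct virtio_device *vdev, unsigned nvqs,
                         struct virtqueue *vqs[],
                         vq_callback_t *callbacks[],
                         const char *names[]);            /* set up virtqueues  */
        /* feature negotiation hooks omitted */
};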

>> It was already implemented three times for virtio, so apparently that's
>> extensible too.
> And to my point, I'm trying to commoditize as much of that process as
> possible on both the front and back ends (at least for cases where
> performance matters) so that you don't need to reinvent the wheel for
> each one.

Since you're interested in any-to-any connectors, it makes sense to you. I'm only interested in kvm-host-to-kvm-guest, so reducing the already minor effort to implement a new virtio binding has little appeal to me.

> You mean, if the x86 board was able to access the disks and DMA into the
> ppc board's memory? You'd run vhost-blk on x86 and virtio-net on ppc.
> But as we discussed, vhost doesn't work well if you try to run it on the
> x86 side due to its assumptions about pageable "guest" memory, right? So
> is that even an option? And even still, you would still need to solve
> the aggregation problem so that multiple devices can coexist.

I don't know. Maybe it can be made to work and maybe it cannot. It probably can with some determined hacking.
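(The pageable-memory point, spelled out as a sketch against the vhost memory-table layout as it was merged: vhost learns about guest RAM as ranges of the owning process's own virtual address space (userspace_addr) and touches them with user-memory accessors from its worker thread. That has no obvious equivalent when the "guest" RAM sits behind a PCI bridge and is reachable only by DMA. The names and single-region layout are illustrative; error handling omitted.)

#include <sys/ioctl.h>
#include <linux/vhost.h>

/* Describe one region of "guest" memory to vhost: a guest-physical range
 * on one side, a mapping in *this process's* address space on the other. */
static int set_vhost_mem_table(int vhost_fd, void *ram_mmap,
                               __u64 guest_phys, __u64 size)
{
        struct {
                struct vhost_memory        hdr;
                struct vhost_memory_region reg[1];
        } mem = {
                .hdr.nregions = 1,
                .reg[0] = {
                        .guest_phys_addr = guest_phys,
                        .memory_size     = size,
                        .userspace_addr  = (__u64)(unsigned long)ram_mmap,
                },
        };

        return ioctl(vhost_fd, VHOST_SET_MEM_TABLE, &mem);
}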

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
