>> There are various aspects to designing high-performance virtual
>> devices, such as providing the shortest paths possible between the
>> physical resources and the consumers.  Conversely, we also need to
>> ensure that we meet proper isolation/protection guarantees at the same
>> time.  What this means is that there are various aspects of any
>> high-performance PV design that need to be placed in-kernel to
>> maximize performance yet still properly isolate the guest.
>>
>> For instance, you are required to have your signal path (interrupts
>> and hypercalls), your memory path (gpa translation), and your
>> addressing/isolation model in-kernel to maximize performance.
>
> Exactly.  That's what vhost puts into the kernel and nothing more.

Actually, no.  Generally, _KVM_ puts those things into the kernel, and
vhost consumes them.  Without KVM (or something equivalent), vhost is
incomplete.  One of my goals with vbus is to generalize the "something
equivalent" part here.
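To make that concrete, here is a rough sketch of how a userspace VMM
wires one virtqueue up today (error handling omitted; the ioctls are
from <linux/vhost.h> and <linux/kvm.h>, and exact struct layouts vary by
kernel version).  vhost only ever sees two eventfds; the guest-facing
decode and injection are KVM's:

#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>
#include <linux/vhost.h>

static void wire_vring(int vmfd, int vhostfd, unsigned int vq, unsigned int gsi)
{
        int kick = eventfd(0, 0);    /* guest -> host doorbell */
        int call = eventfd(0, 0);    /* host -> guest interrupt */

        /* vhost's side: it just waits on 'kick' and signals 'call' */
        struct vhost_vring_file file = { .index = vq, .fd = kick };
        ioctl(vhostfd, VHOST_SET_VRING_KICK, &file);
        file.fd = call;
        ioctl(vhostfd, VHOST_SET_VRING_CALL, &file);

        /* KVM's side: turn 'call' into a real guest interrupt ... */
        struct kvm_irqfd irqfd = { .fd = call, .gsi = gsi };
        ioctl(vmfd, KVM_IRQFD, &irqfd);

        /*
         * ... and hook 'kick' to the guest's doorbell register with
         * KVM_IOEVENTFD (the pio decode, shown further below), so the
         * exit never has to bounce through userspace.
         */
}

Take away KVM and you still have the two eventfds, but nothing to decode
the doorbell or inject the interrupt.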
>> Vbus accomplishes its in-kernel isolation model by providing a
>> "container" concept, where objects are placed into this container by
>> userspace.  The host kernel enforces isolation/protection by using a
>> namespace to identify objects that is only relevant within a specific
>> container's context (namely, a "u32 dev-id").  The guest addresses
>> objects by their dev-id, and the kernel ensures that the guest can't
>> access objects outside of its dev-id namespace.
>
> vhost manages to accomplish this without any kernel support.

No, vhost manages to accomplish this because of KVM's kernel support
(ioeventfd, etc.).  Without KVM-like in-kernel support, vhost is merely
a kind of "tuntap"-like clone signalled by eventfds.

This goes directly to my rebuttal of your claim that vbus places too
much in the kernel.  I state that, one way or the other, address decode
and isolation _must_ be in the kernel for performance.  Vbus does this
with a devid/container scheme.  vhost+virtio-pci+kvm does it with
pci+pio+ioeventfd.
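Here is roughly what the devid/container half looks like (hypothetical
types and names, not the actual vbus code; the point is only that the
kernel decodes a u32 dev-id against a per-guest container, so a guest
can never reach an object outside its own namespace):

#include <linux/types.h>
#include <linux/errno.h>

#define VBUS_MAX_DEVS 256                 /* arbitrary for the sketch */

struct vbus_device_ops {
        int (*call)(void *priv, u32 verb, void *data, size_t len);
};

struct vbus_container {                   /* one per guest */
        struct vbus_device_ops *devs[VBUS_MAX_DEVS];
        void                   *priv[VBUS_MAX_DEVS];
};

/* DEVCALL(devid): decode the id within the caller's container, then execute */
static int vbus_devcall(struct vbus_container *c, u32 devid,
                        u32 verb, void *data, size_t len)
{
        if (devid >= VBUS_MAX_DEVS || !c->devs[devid])
                return -ENODEV;           /* outside this guest's namespace */

        return c->devs[devid]->call(c->priv[devid], verb, data, len);
}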
> The guest simply has no access to any vhost resources other than the
> guest->host doorbell, which is handed to the guest outside vhost (so
> it's somebody else's problem, in userspace).

You mean _controlled_ by userspace, right?  Obviously, the other side of
the kernel still needs to be programmed (ioeventfd, etc.).  Otherwise,
vhost would be pointless: e.g. just use vanilla tuntap if you don't need
fast in-kernel decoding.
>> All that is required is a way to transport a message with a "devid"
>> attribute as an address (such as DEVCALL(devid)), and the framework
>> provides the rest of the decode+execute function.
>
> vhost avoids that.

No, it doesn't avoid it.  It just doesn't specify how it's done, and
relies on something else to do it on its behalf.

Conversely, vbus specifies how it's done, but not how to transport the
verb "across the wire".  That is the role of the vbus-connector
abstraction.
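Building on the devid sketch above, the split might look something like
this (purely illustrative names, not the real interface): the connector
owns the transport, the framework owns decode+execute.

/* hypothetical connector boundary, for illustration only */
struct vbus_connector_ops {
        /* deliver a guest-originated DEVCALL to the common framework,
         * e.g. from a hypercall exit, a pio trap, or Ira's PCI-E link */
        int  (*inject)(struct vbus_container *c, u32 devid,
                       u32 verb, void *data, size_t len);

        /* raise the host->guest signal for the given device */
        void (*signal)(struct vbus_container *c, u32 devid);
};

A KVM connector would implement inject() on top of a hypercall exit,
while something like Ira's setup would implement it over its own link;
vbus_devcall() and everything behind it stays common.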
>> Contrast this to vhost+virtio-pci (called simply "vhost" from here).
>
> It's the wrong name.  vhost implements only the data path.

Understood, but vhost+virtio-pci is what I am contrasting, and I use
"vhost" for short from that point on because I am too lazy to type the
whole name over and over ;)
>> It is not immune to requiring in-kernel addressing support either, but
>> rather it just does it differently (and, contrary to what you might
>> expect, not via qemu).
>>
>> Vhost relies on QEMU to render PCI objects to the guest, to which the
>> guest assigns resources (such as BARs, interrupts, etc.).
>
> vhost does not rely on qemu.  It relies on its user to handle
> configuration.  In one important case it's qemu+pci.  It could just as
> well be the lguest launcher.

I meant vhost=vhost+virtio-pci here.  Sorry for the confusion.

The point I am making specifically is that vhost in general relies on
other in-kernel components to function.  I.e. it cannot function without
having something like the PCI model to build an IO namespace.  That
namespace (in this case, pio address+data tuples) is used for the
in-kernel addressing function under KVM + virtio-pci.
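To spell out what that namespace is: under virtio-pci each virtqueue's
doorbell is just a (pio address, data) tuple registered with KVM,
roughly like this (sketch per <linux/kvm.h>; exact flag and field names
can differ across kernel versions):

#include <sys/ioctl.h>
#include <linux/kvm.h>

static int register_notify(int vmfd, unsigned long long notify_addr,
                           unsigned short queue_index, int fd)
{
        struct kvm_ioeventfd ioev = {
                .addr      = notify_addr,    /* BAR + queue-notify offset */
                .len       = 2,
                .datamatch = queue_index,    /* the "data" half of the tuple */
                .fd        = fd,             /* vhost's kick eventfd */
                .flags     = KVM_IOEVENTFD_FLAG_PIO |
                             KVM_IOEVENTFD_FLAG_DATAMATCH,
        };

        return ioctl(vmfd, KVM_IOEVENTFD, &ioev);
}

KVM matches the guest's outw() against that tuple entirely in-kernel and
signals vhost's eventfd; PCI/pio is the addressing scheme here, just as
dev-id is for vbus.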
The case of the lguest launcher is a good one to highlight.  Yes, you
can presumably also use lguest with vhost, if the requisite facilities
are exposed to lguest-bus, and some eventfd-based thing like ioeventfd
is written for the host (if it doesn't exist already).

And when the next virt design "foo" comes out, it can make a "foo-bus"
model, and implement foo-eventfd on the backend, etc., etc.  Ira can
make ira-bus, and ira-eventfd, etc., etc.

Each iteration will invariably introduce duplicated parts of the stack.
> For the N+1th time, no.  vhost is perfectly usable without pci.  Can we
> stop raising and debunking this point?

Again, I understand vhost is decoupled from PCI, and I don't mean to
imply anything different.  I use PCI as an example here because a) it's
the only working example of vhost today (to my knowledge), and b) you
have stated in the past that PCI is the only "right" way here, to
paraphrase.  Perhaps you no longer feel that way, so I apologize if you
feel you already recanted your position on PCI and I missed it.

I digress.  My point here isn't PCI.  The point here is the missing
component for when PCI is not present, the component that is partially
satisfied by vbus's devid addressing scheme.  If you are going to use
vhost and you don't have PCI, you've got to build something to replace
it.
>> All you really need is a simple decode+execute mechanism, and a way to
>> program it from userspace.  vbus tries to do just that: commoditize it
>> so all you need is the transport of the control messages (like
>> DEVCALL()), but the decode+execute itself is reusable, even across
>> various environments (like KVM or Ira's rig).
>
> If you think it should be "commoditized", write libvhostconfig.so.

I know you are probably being facetious here, but what do you propose
for the parts that must be in-kernel?
>> And your argument, I believe, is that vbus allows both to be
>> implemented in the kernel (though, to reiterate, it's optional) and is
>> therefore a bad design, so let's discuss that.
>>
>> I believe the assertion is that things like config-space are best left
>> to userspace, and we should only relegate fast-path duties to the
>> kernel.  The problem is that, in my experience, a good deal of
>> config-space actually influences the fast path and thus needs to
>> interact with the fast-path mechanism eventually anyway.  What's left
>> over that doesn't fall into this category can cheaply ride on existing
>> plumbing, so it's not like we created something new or unnatural just
>> to support this subclass of config-space.
>
> Flexibility is reduced, because changing code in the kernel is more
> expensive than in userspace, and kernel/user interfaces aren't
> typically as wide as pure userspace interfaces.  Security is reduced,
> since a bug in the kernel affects the host, while a bug in userspace
> affects just one guest.

For a mac-address attribute?  That's all we are really talking about
here.  These points you raise, while true of any kernel code I suppose,
are a bit of a stretch in this context.
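For a sense of scale, this is the sort of thing we are arguing over: one
sysfs attribute on an in-kernel device (the "mydev" structure and its
fields are invented for this sketch; the show/store plumbing is the
stock driver-model API, and the attribute still gets registered with
device_create_file()):

#include <linux/device.h>
#include <linux/etherdevice.h>
#include <linux/spinlock.h>
#include <linux/string.h>
#include <linux/errno.h>

struct mydev {                                    /* hypothetical device */
        struct device dev;
        spinlock_t    lock;
        u8            macaddr[ETH_ALEN];
};
#define to_mydev(d) container_of(d, struct mydev, dev)

static ssize_t macaddr_show(struct device *dev,
                            struct device_attribute *attr, char *buf)
{
        struct mydev *p = to_mydev(dev);

        return sprintf(buf, "%pM\n", p->macaddr);
}

static ssize_t macaddr_store(struct device *dev,
                             struct device_attribute *attr,
                             const char *buf, size_t count)
{
        struct mydev *p = to_mydev(dev);
        u8 mac[ETH_ALEN];

        if (sscanf(buf, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
                   &mac[0], &mac[1], &mac[2],
                   &mac[3], &mac[4], &mac[5]) != ETH_ALEN)
                return -EINVAL;

        spin_lock(&p->lock);          /* config-space touching the fast path */
        memcpy(p->macaddr, mac, ETH_ALEN);
        spin_unlock(&p->lock);

        return count;
}
static DEVICE_ATTR(macaddr, 0644, macaddr_show, macaddr_store);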
> Example: feature negotiation.  If it happens in userspace, it's easy to
> limit what features we expose to the guest.

It's not any harder in the kernel.  I do this today.

And when you are done negotiating said features, you will generally have
to turn around and program the feature into the backend anyway (e.g.
ioctl() to the vhost module).  Now you have to maintain some knowledge
of that particular feature and how to program it in two places.

Conversely, I am eliminating the (unnecessary) middleman by letting the
feature negotiation take place directly between the two entities that
will consume it.
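This is the duplication I mean.  Today the flow with vhost looks roughly
like this (real ioctls from <linux/vhost.h>; negotiate_with_guest() is a
stand-in for whatever the userspace side does over virtio-pci config
space):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

extern uint64_t negotiate_with_guest(uint64_t offered);   /* hypothetical */

static int setup_features(int vhostfd, uint64_t userspace_mask)
{
        uint64_t features;

        /* place #1: userspace learns and filters what the backend offers */
        if (ioctl(vhostfd, VHOST_GET_FEATURES, &features) < 0)
                return -1;
        features &= userspace_mask;

        /* ... the actual negotiation with the guest happens in userspace ... */
        features = negotiate_with_guest(features);

        /* place #2: the result must be programmed back into the kernel */
        return ioctl(vhostfd, VHOST_SET_FEATURES, &features);
}

Every feature bit has to be understood both by the userspace shim and by
the backend that ultimately implements it.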
> If it happens in the kernel, we need to add an interface to let the
> kernel know which features it should expose to the guest.

You need this already either way for both models anyway.  As an added
bonus, vbus has generalized that interface using sysfs attributes, so
all models are handled in a similar and community-accepted way.
> We also need to add an interface to let userspace know which features
> were negotiated, if we want to implement live migration.  Something
> fairly trivial bloats rapidly.

Can you elaborate on the requirements for live migration?  Wouldn't an
opaque save/restore model work here?  (E.g. why does userspace need to
be able to interpret the in-kernel state?  Just pass it along as a blob
to the new instance.)
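What I have in mind is nothing more than this (purely hypothetical
ioctls, for illustration only; neither request exists anywhere):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>

struct devstate_blob {
        uint32_t len;
        uint8_t  data[4096];              /* arbitrary cap for the sketch */
};

/* hypothetical request numbers */
#define DEV_SAVE_STATE    _IOR('d', 0x40, struct devstate_blob)
#define DEV_RESTORE_STATE _IOW('d', 0x41, struct devstate_blob)

static int migrate_device(int oldfd, int newfd)
{
        struct devstate_blob blob;

        if (ioctl(oldfd, DEV_SAVE_STATE, &blob) < 0)
                return -1;

        /* ... ship blob over the migration stream, unmodified ... */

        return ioctl(newfd, DEV_RESTORE_STATE, &blob);
}

Userspace shuttles the state but never has to interpret it.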
> As you can see above, userspace needs to be involved in this, and the
> number of interfaces required is smaller if it's in userspace:

Actually, no.  My experience has been the opposite.  Any time I sat down
and tried to satisfy your request to move things to userspace, things
got ugly and duplicative really quickly.  I suspect part of the reason
you may think it's easier is that you already have part of virtio-net
and its surrounding support in userspace, but that is not the case
moving forward for new device types.
> you only need to know which features the kernel supports (they can be
> enabled unconditionally, just not exposed).
>
> Further, some devices are perfectly happy to be implemented in
> userspace, so we need userspace configuration support anyway.  Why
> reimplement it in the kernel?

That's fine.  vbus is targeted at high-performance IO.  So if you have a
robust userspace (like KVM+QEMU) and low-performance constraints (say,
for a console or something), put it in userspace and vbus is not
involved.  I don't care.