Re: [RFC PATCH 0/3] generic hypercall support

From: Avi Kivity
Date: Fri May 08 2009 - 04:19:00 EST


Gregory Haskins wrote:
Anthony Liguori wrote:
Gregory Haskins wrote:
Today, there is no equivalent of a platform agnostic "iowrite32()" for
hypercalls so the driver would look like the pseudocode above except
substitute with kvm_hypercall(), lguest_hypercall(), etc. The proposal
is to allow the hypervisor to assign a dynamic vector to resources in
the backend and convey this vector to the guest (such as in PCI
config-space as mentioned in my example use-case). This provides the
"address negotiation" function that would normally be done for something
like a pio port-address. The hypervisor agnostic driver can then use
this globally recognized address-token coupled with other device-private
ABI parameters to communicate with the device. This can all occur
without the core hypervisor needing to understand the details beyond the
addressing.
PCI already provides a hypervisor agnostic interface (via IO regions). You have a mechanism for devices to discover which regions they have
allocated and to request remappings. It's supported by Linux and
Windows. It works on the vast majority of architectures out there today.

Why reinvent the wheel?

I suspect the current wheel is square. And the air is out. Plus it's
pulling to the left when I accelerate, but to be fair that may be my
alignment....

No, your wheel is slightly faster on the highway, but doesn't work at all off-road.

Consider nested virtualization where the host (H) runs a guest (G1) which is itself a hypervisor, running a guest (G2). The host exposes a set of virtio (V1..Vn) devices for guest G1. Guest G1, rather than creating a new virtio device and bridging it to one of V1..Vn, assigns virtio device V1 to guest G2, and prays.

Now guest G2 issues a hypercall. Host H traps the hypercall, sees it originated in G1 while in guest mode, so it injects it into G1. G1 examines the parameters but can't make any sense of them, so it returns an error to G2.

If this were done using mmio or pio, it would have just worked. With pio, H would have reflected the pio into G1, G1 would have done the conversion from G2's port number into G1's port number and reissued the pio, finally trapped by H and used to issue the I/O. With mmio, G1 would have set up G2's page tables to point directly at the addresses set up by H, so we would actually have a direct G2->H path. Of course we'd need an emulated iommu so all the memory references actually resolve to G2's context.
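The pio reflection path above can be sketched as a tiny user-space model. Everything here is invented for illustration (the port numbers, the one-entry translation, the function names); the point is only the control flow: H traps a pio from G2, injects it into G1, G1 rewrites the port into its own numbering and reissues it, and the second trap is the one H actually services.

```c
#include <stdint.h>

/* Toy port numbers: how the same device appears in each layer (assumed). */
#define G2_PORT 0x10   /* port as programmed into nested guest G2 */
#define G1_PORT 0xc0   /* same device in G1's port space */

static uint32_t host_device_reg;   /* backing register owned by H */

/* Host: performs the real I/O for a port it owns. */
static void h_pio_write(uint16_t port, uint32_t val)
{
    if (port == G1_PORT)
        host_device_reg = val;
}

/* G1: converts the nested guest's port number into its own and
 * reissues the pio, which traps to the host a second time. */
static void g1_reflect_pio(uint16_t g2_port, uint32_t val)
{
    uint16_t g1_port = (g2_port == G2_PORT) ? G1_PORT : g2_port;
    h_pio_write(g1_port, val);
}

/* Host entry point: a pio trapped while G1 was running G2 is not
 * handled directly; it is injected into G1 for translation. */
static void h_trap_pio_from_g2(uint16_t port, uint32_t val)
{
    g1_reflect_pio(port, val);
}
```

Note that G1 never needs to understand the device semantics, only the port remapping, which is exactly why this path survives nesting while an opaque hypercall does not.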

So the upshot is that hypercalls for devices must not be the primary method of communication; they're fine as an optimization, but we should always be able to fall back on something else. We also need to figure out how G1 can stop V1 from advertising hypercall support.
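In driver terms, "optimization with a fallback" might look like the sketch below. The feature bit, struct, and function names are all invented; the real mechanism would be whatever capability negotiation the device model exposes. The only property that matters is that the pio path is always present and the hypercall path is taken only when advertised, so a G1 that masks the feature bit for a passed-through device silently forces the working path.

```c
#include <stdint.h>

#define FEAT_HYPERCALL (1u << 0)   /* assumed capability bit */

struct vdev {
    uint32_t features;     /* bits advertised by the device model */
    int      last_method;  /* 1 = hypercall, 2 = pio (demo bookkeeping) */
};

/* Stand-ins for the two notification mechanisms. */
static int do_hypercall(struct vdev *d, uint32_t token)
{
    d->last_method = 1;
    return 0;
}

static int do_pio(struct vdev *d, uint16_t port)
{
    d->last_method = 2;
    return 0;
}

/* Kick the device: prefer the fast path, never require it. */
static int vdev_kick(struct vdev *d, uint32_t token, uint16_t port)
{
    if (d->features & FEAT_HYPERCALL)
        return do_hypercall(d, token);
    return do_pio(d, port);   /* always-available fallback */
}
```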

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
