Re: [RFC] Unify KVM kernel-space and user-space code into a single project

From: Anthony Liguori
Date: Mon Mar 22 2010 - 14:35:52 EST


On 03/22/2010 12:34 PM, Ingo Molnar wrote:
* Avi Kivity <avi@xxxxxxxxxx> wrote:

- Easy default reference to guest instances, and a way for tools to
reference them symbolically as well in the multi-guest case. Preferably
something trustable and kernel-provided - not some indirect information
like a PID file created by libvirt-manager or so.
Usually 'layering violation' is trotted out at such suggestions.
[...]
That's weird, how can a feature request be a 'layering violation'?
The 'something trustable and kernel-provided'. The kernel knows nothing
about guest names.
The kernel certainly knows about other resources such as task names or network
interface names or tracepoint names. This is kernel design 101.

If something that users find straightforward and usable is a layering
violation to you (such as easily being able to access their own files on
the host as well ...) then i think you need to revisit the definition of
that term instead of trying to fix the user.
Here is the explanation, you left it quoted:

[...] I don't like using the term, because sometimes the layers are
incorrect and need to be violated. But it should be done explicitly, not
as a shortcut for a minor feature (and profiling is a minor feature, most
users will never use it, especially guest-from-host).

The fact is we have well defined layers today, kvm virtualizes the cpu
and memory, qemu emulates devices for a single guest, libvirt manages
guests. We break this sometimes but there has to be a good reason. So
perf needs to talk to libvirt if it wants names. Could be done via
linking, or can be done using a plugin libvirt drops into perf.
This is really just the much-discredited microkernel approach for keeping
global enumeration data that should be kept by the kernel ...

Let's look at the ${HOME}/.qemu/qmp/ enumeration method suggested by Anthony.
There are numerous ways that this can break:

- Those special files can get corrupted, mis-setup, get out of sync, or can
be hard to discover.

- The ${HOME}/.qemu/qmp/ solution suggested by Anthony has a very obvious
design flaw: it is per user. When i'm root i'd like to query _all_ current
guest images, not just the ones started by root. A system might not even
have a notion of '${HOME}'.

- Apps might start KVM vcpu instances without adhering to the
${HOME}/.qemu/qmp/ access method.

Not all KVM vcpus are running operating systems.

Transitive had a product that used a KVM context to run their binary translator, which allowed them full access to the host process's virtual address space range. In this case, there is no kernel and there are no devices.

That's what I mean by a guest being a userspace context. KVM simply provides a new CPU mode to userspace in the same way that vm8086 mode does.

Regards,

Anthony Liguori

- There is no guarantee for the Qemu process to reply to a request - while
the kernel can always guarantee an enumeration result. I don't want 'perf
kvm' to hang or misbehave just because Qemu has hung.
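
For illustration only (this is not code from the thread): a minimal Python sketch of what per-user enumeration under ${HOME}/.qemu/qmp/ might look like, assuming each guest exposes a UNIX socket in that directory. The names list_qmp_sockets and probe are hypothetical, and so is the assumption that a responsive peer sends some data on connect. It shows two of the concerns listed above: only the one directory it is given gets scanned, and an unresponsive process has to be handled with a timeout.

```python
import os
import socket
import stat


def list_qmp_sockets(qmp_dir):
    """Return sorted paths of UNIX sockets found directly inside qmp_dir."""
    found = []
    try:
        entries = os.listdir(qmp_dir)
    except OSError:
        return found  # directory missing or unreadable: nothing to enumerate
    for name in sorted(entries):
        path = os.path.join(qmp_dir, name)
        try:
            if stat.S_ISSOCK(os.stat(path).st_mode):
                found.append(path)
        except OSError:
            continue  # entry vanished between listdir() and stat()
    return found


def probe(path, timeout=1.0):
    """Connect and read once; return None if the peer does not answer in time."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(path)
        return s.recv(4096)
    except OSError:  # includes connection refused and timeouts
        return None
    finally:
        s.close()


if __name__ == "__main__":
    # Scans only the calling user's directory (the assumed layout).
    qmp_dir = os.path.join(os.path.expanduser("~"), ".qemu", "qmp")
    for sock_path in list_qmp_sockets(qmp_dir):
        print(sock_path, "responsive" if probe(sock_path) else "no reply")
```

A caller running as root would still see only the sockets under the directory it is pointed at, and a stale socket file left behind by a dead process shows up in the listing until probe fails on it.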

Really, for such reasons user-space is pretty poor at doing system-wide
enumeration and resource management. Microkernels lost for a reason.

You are committing several grave design mistakes here.

Thanks,

Ingo
