Re: [RFC] Unify KVM kernel-space and user-space code into a single project

From: Avi Kivity
Date: Wed Mar 24 2010 - 11:13:32 EST


On 03/24/2010 05:01 PM, Joerg Roedel wrote:

But when I weigh the benefit of truly transparent system-wide perf
integration for users who don't use libvirt but do use perf, versus
the cost of transforming kvm from a single-process API to a
system-wide API with all the complications that I've listed, it comes
out in favour of not adding the API.
It's not a transformation, it's an extension. The current per-process
/dev/kvm stays mostly untouched. It's all about having something like
this:

$ cd /sys/kvm/guest0
$ ls -l
-r-------- 1 root root 0 2009-08-17 12:05 name
dr-x------ 1 root root 0 2009-08-17 12:05 fs
$ cat name
guest0
$ # ...

The fs/ directory is used as the mount point for the guest root fs.

The problem is /sys/kvm, not /sys/kvm/fs.

The samples will be tagged with the guest-name (and some additional
information perf needs). Perf userspace can access the symbols then
through /sys/kvm/guest0/fs/...
I take that as a yes? So we need a virtio-serial client in the kernel
(which might be exploitable by a malicious guest if buggy) and a
fs-over-virtio-serial client in the kernel (also exploitable).
What I meant was: perf-kernel puts the guest-name into every sample and
perf-userspace accesses /sys/kvm/guest_name/fs/ later to resolve the
symbols. I leave the question of how the guest-fs is exposed to the host
out of this discussion. We should discuss this separately.

How I see it: perf-kernel puts the guest pid into every sample, and perf-userspace uses that to resolve to a mountpoint served by fuse, or to a unix domain socket that serves the files.
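
A minimal sketch of that userspace lookup, assuming a hypothetical layout in which each guest's filesystem is mounted (by fuse, for instance) under a per-pid directory. The mount root /run/guest-fs, the sample fields guest_pid and guest_path, and the function name are illustrative assumptions, not anything perf or qemu provides today:

```python
import os


def resolve_guest_symbol_path(sample, mount_root="/run/guest-fs"):
    """Map a sample tagged with a guest pid to a host-side file path.

    `sample` is a hypothetical dict with the pid of the qemu process
    ("guest_pid") and the path of the binary inside the guest
    ("guest_path"). `mount_root` is an assumed directory under which
    each guest's filesystem is exposed as <mount_root>/<pid>/.
    """
    guest_root = os.path.normpath(
        os.path.join(mount_root, str(sample["guest_pid"]))
    )
    # Strip the leading slash so os.path.join keeps guest_root as prefix.
    relative = sample["guest_path"].lstrip("/")
    path = os.path.normpath(os.path.join(guest_root, relative))
    # The guest controls guest_path, so refuse anything that escapes
    # the per-guest directory (for example via "..").
    if path != guest_root and not path.startswith(guest_root + os.sep):
        raise ValueError("guest path escapes the guest root")
    return path
```

The kernel side only has to record the pid; where the files live and how they are served stays a userspace decision.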

An approach like: "The files are owned and only readable by the same
user that started the vm." might be a good start. So a user can measure
its own guests and root can measure all of them.
That's not how sVirt works. sVirt isolates a user's VMs from each
other, so if a guest breaks into qemu it can't break into other guests
owned by the same user.
If a vm breaks into qemu it can access the host file system which is the
bigger problem. In this case there is no isolation anymore. From that
context it can even kill other VMs of the same user independent of a
hypothetical /sys/kvm/.

It cannot. sVirt labels the disk image and other files qemu needs with the appropriate label, and everything else is off limits. Even if you run the guest as root, it won't have access to other files.
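
As a toy illustration of the labelling, assuming the usual sVirt convention of giving each guest a unique pair of MCS categories: the sketch below models only the category comparison. Real SELinux decisions are made in the kernel and also involve type enforcement; the label strings and function names are illustrative, and category ranges such as c0.c1023 are not handled.

```python
def parse_categories(label):
    """Return the MCS categories of a label as a frozenset.

    Example label: "system_u:system_r:svirt_t:s0:c100,c200".
    A label without a category field yields an empty set.
    """
    parts = label.split(":")
    if len(parts) < 5:
        return frozenset()
    return frozenset(parts[4].split(","))


def categories_allow(process_label, file_label):
    """Toy MCS check: the process must hold every category of the file."""
    return parse_categories(process_label) >= parse_categories(file_label)


# Two guests of the same user get different (assumed) category pairs.
guest_a = "system_u:system_r:svirt_t:s0:c100,c200"
image_a = "system_u:object_r:svirt_image_t:s0:c100,c200"
image_b = "system_u:object_r:svirt_image_t:s0:c300,c400"

own_image_ok = categories_allow(guest_a, image_a)      # True
other_image_ok = categories_allow(guest_a, image_b)    # False
```

A compromised qemu running as guest_a still carries only its own categories, which is why the other guest's image stays out of reach even under the same uid.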

Yeah that would be interesting information. But it is more related to
tracing than to pmu measurements. The information you
mentioned above is probably better captured by an extension of
trace-events to userspace.
It's all related. You start with perf, see a problem with mmio, call up
a histogram of mmio or interrupts or whatever, then zoom in on the
misbehaving device.
Yes, but it's different from the implementation point-of-view. For the
user it surely all plays together.

We need qemu to cooperate for mmio tracing, and we can cooperate with qemu for symbol resolution. If it prevents adding another kernel API, that's a win from my POV.

--
error compiling committee.c: too many arguments to function
