Re: [PATCH 1/7] KVM: userspace interface
From: Anthony Liguori
Date: Thu Oct 19 2006 - 15:09:32 EST
Avi Kivity wrote:
Anthony Liguori wrote:
Sorry if I missed this, but can you provide a link to the QEMU changes?
I'll do that once I get my sourceforge page and post it here. Watch
this space.
It's hard to tell what's going on without seeing the userspace
portions of this.
My initial impression is that you've taken the Xen approach of trying
to use QEMU only for IO emulation. If this is the case, it won't
work long term. While you can use vm86 mode for 16 bit
virtualization for most cases, it cannot handle big real mode. You
need the ability to transfer down to QEMU and allow it to do emulation.
We started using VT only for 64 bit, then added 32 bit, then 16-bit
protected, then vm86 and real mode. We'd transfer the x86 state on
each mode change, but it was (a) fragile (b) considered unclean.
You're right that "big real" mode is not supported, but so far that
hasn't been a problem. Do you know of an OS that needs big real mode?
AFAIK the SLES boot splash patches to grub use it. It's definitely a
requirement. Currently, there is an effort in Xen to use QEMU for
partial emulation. Hopefully, it will be there for the next release.
Allowing QEMU to do emulation also will help with IO performance.
Instead of having to take many trips to userspace for MMIO especially,
you can allow QEMU to execute a certain number of basic blocks and then
return. Minimizing trips between userspace and the kernel is going to
be critical for performance.
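
To make the counting argument concrete, here is a toy model (not taken
from KVM or QEMU; every name in it is hypothetical). It walks a guest
instruction trace and counts how many times we drop to userspace, given
a budget of how much userspace may emulate per trip. It counts
individual instructions as a stand-in for basic blocks, which is an
assumption made only to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of guest instructions in a trace. */
enum insn_kind { INSN_PLAIN, INSN_MMIO };

/*
 * Count exits to userspace for a guest instruction trace.
 * Plain instructions run natively; an MMIO access traps and drops to
 * userspace, which then emulates up to `budget` instructions (the
 * trapping one included) before returning to the kernel.
 */
static size_t count_exits(const enum insn_kind *trace, size_t n, size_t budget)
{
    size_t exits = 0;
    size_t i = 0;

    assert(budget > 0);
    while (i < n) {
        if (trace[i] != INSN_MMIO) {
            i++;                /* no trap, guest keeps running */
            continue;
        }
        exits++;                /* trap: one trip to userspace */
        /* userspace consumes up to `budget` instructions in this trip */
        i += (budget < n - i) ? budget : n - i;
    }
    return exits;
}
```

With a budget of 1 every MMIO access costs its own trip; with a larger
budget, MMIO accesses that sit close together share one.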
Ideally, instead of having such a large x86 emulator in kernel
space, you would just drop down to QEMU to do emulation as needed
(doing only a single basic block and returning). This would let you
have a much reduced partial emulator in kernel space that only did
the most common (and performance critical) instructions.
Over time that emulator would grow as OSes and compilers evolve... and
we'd really like to keep basic things like the apic in the kernel (as
does Xen).
I've been tossing around the idea of doing partial IO emulation in the
kernel. If you could sync the device states between userspace and
kernel, it should be possible. Given that you're already in the
kernel at VMEXIT time, if you could feed something right to the block
driver or network driver, you ought to be able to get pretty darn good
performance.
However, I do agree that it's better to start simple. I actually think
you could simplify more by using QEMU for more instruction emulation and
focusing only on the handful of instructions in the critical path for the
kernel.
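
Roughly, the split I have in mind looks like the sketch below. This is
an illustration only, not the interface in the patch: the function and
enum names are made up, and the choice of the four one-byte MOV opcodes
as the "critical path" set is an assumption for the example.

```c
#include <stdint.h>

/* Hypothetical outcome of an emulation attempt in the kernel. */
enum emu_result { EMU_DONE_IN_KERNEL, EMU_NEEDS_USERSPACE };

/*
 * Hypothetical fast path: the kernel handles only a small set of
 * opcodes and punts everything else to the userspace emulator.
 */
static enum emu_result kernel_emulate(uint8_t opcode)
{
    switch (opcode) {
    case 0x88:  /* MOV r/m8, r8 */
    case 0x89:  /* MOV r/m16/32, r16/32 */
    case 0x8a:  /* MOV r8, r/m8 */
    case 0x8b:  /* MOV r16/32, r/m16/32 */
        return EMU_DONE_IN_KERNEL;
    default:
        return EMU_NEEDS_USERSPACE;     /* let QEMU emulate it */
    }
}
```

Anything the switch does not know about costs a trip to userspace, so
the list only needs to cover what actually shows up in profiles.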
Regards,
Anthony Liguori