Re: [Qemu-devel] [RFC] Next gen kvm api

From: Avi Kivity
Date: Sat Feb 18 2012 - 05:04:12 EST


On 02/17/2012 02:09 AM, Michael Ellerman wrote:
> On Thu, 2012-02-16 at 21:28 +0200, Avi Kivity wrote:
> > On 02/16/2012 03:04 AM, Michael Ellerman wrote:
> > > >
> > > > ioctl is good for hardware devices and stuff that you want to enumerate
> > > > and/or control permissions on. For something like KVM that is really a
> > > > core kernel service, a syscall makes much more sense.
> > >
> > > Yeah maybe. That distinction is at least in part just historical.
> > >
> > > The first problem I see with using a syscall is that you don't need one
> > > syscall for KVM, you need ~90. OK so you wouldn't do that, you'd use a
> > > multiplexed syscall like epoll_ctl() - or probably several
> > > (vm/vcpu/etc).
> >
> > No. Many of our ioctls are for state save/restore - we reduce that to
> > two. Many others are due to the with/without irqchip support - we slash
> > that as well. The device assignment stuff is relegated to vfio.
> >
> > I still have to draw up a concrete proposal, but I think we'll end up
> > with 10-15.
>
> That's true, you certainly could reduce it, though by how much I'm not
> sure. On powerpc I'm working on moving the irq controller emulation into
> the kernel, and some associated firmware emulation, so that's at least
> one new ioctl. And there will always be more, whatever scheme you have
> must be easily extensible - ie. not requiring new syscalls for each new
> weird platform.

Most of it falls into read/write state, which is covered by two
syscalls. There's probably a need for configuration (wiring etc.); we
could call that pseudo-state with fake registers, but I don't like that
very much.
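
As a rough illustration of "read/write state covered by two calls", here
is a minimal userspace sketch. Everything in it is hypothetical: the
names state_read/state_write, the state_id values and struct vcpu_model
are invented for this example and are not part of any existing or
proposed KVM ABI. It only shows the shape: one call reads a batch of
(id, value) pairs, the other writes a batch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state identifiers; not taken from any real KVM header. */
enum state_id { STATE_PC = 0, STATE_SP = 1, STATE_IRQ_ROUTE = 2, STATE_COUNT };

/* Userspace stand-in for per-vcpu state held by the kernel. */
struct vcpu_model {
    uint64_t regs[STATE_COUNT];
};

/* Read n pieces of state: vals[i] receives the value of ids[i].
 * Returns 0 on success, -EINVAL if any id is unknown. */
int state_read(const struct vcpu_model *v, const uint32_t *ids,
               uint64_t *vals, size_t n)
{
    assert(v != NULL);
    for (size_t i = 0; i < n; i++) {
        if (ids[i] >= STATE_COUNT)
            return -EINVAL;
        vals[i] = v->regs[ids[i]];
    }
    return 0;
}

/* Write n pieces of state. All ids are validated before anything is
 * modified, so a bad id leaves the model untouched. */
int state_write(struct vcpu_model *v, const uint32_t *ids,
                const uint64_t *vals, size_t n)
{
    assert(v != NULL);
    for (size_t i = 0; i < n; i++)
        if (ids[i] >= STATE_COUNT)
            return -EINVAL;
    for (size_t i = 0; i < n; i++)
        v->regs[ids[i]] = vals[i];
    return 0;
}
```

In this sketch STATE_IRQ_ROUTE plays the role of the "pseudo-state with
fake registers" option for configuration: it fits the same two calls,
at the cost of treating wiring as if it were a register.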


> > > Secondly you still need a handle/context for those syscalls, and I think
> > > the most sane thing to use for that is an fd.
> >
> > The context is the process (for vm-wide calls) and thread (for vcpu
> > local calls).
>
> Yeah OK I forgot you'd mentioned that. But isn't that change basically
> orthogonal to how you get into the kernel? ie. we could have the
> kvm/vcpu pointers in mm_struct/task_struct today?
>
> I guess it wouldn't win you much though because you still have the fd
> and ioctl overhead as well.
>

Yes. I also dislike bypassing ioctl semantics (though we already do
that by requiring vcpus to stay on the same thread and vms on the same
process).
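
To make "the context is the thread" concrete, here is a small userspace
sketch, assuming a C11 compiler. The names vcpu_ctx, vcpu_attach and
vcpu_run_once are made up for illustration; in the kernel the pointer
would presumably hang off task_struct rather than a thread-local
variable. The calls take no fd or handle argument and find their vcpu
from the calling thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-vcpu context; fields are illustrative only. */
struct vcpu_ctx {
    int id;
    unsigned long runs;
};

/* One vcpu per thread: this stands in for a pointer in task_struct. */
static _Thread_local struct vcpu_ctx *current_vcpu;

/* Bind a vcpu to the calling thread. Fails with -EBUSY if this thread
 * already owns one, mirroring the "vcpu stays on its thread" rule. */
int vcpu_attach(struct vcpu_ctx *ctx)
{
    assert(ctx != NULL);
    if (current_vcpu)
        return -EBUSY;
    current_vcpu = ctx;
    return 0;
}

/* A vcpu-local call: no handle is passed, the thread is the context.
 * Returns -ENOENT if the calling thread has no vcpu attached. */
int vcpu_run_once(void)
{
    if (!current_vcpu)
        return -ENOENT;
    current_vcpu->runs++;
    return 0;
}
```

A thread that never called vcpu_attach() gets -ENOENT from
vcpu_run_once(), which is the implicit-context analogue of passing a
bad fd to an ioctl.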

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
