Re: [PATCH] x86: kvm: reset the bootstrap processor when it gets an INIT

From: Gleb Natapov
Date: Mon Mar 11 2013 - 09:54:49 EST


On Mon, Mar 11, 2013 at 02:31:46PM +0100, Paolo Bonzini wrote:
> On 11/03/2013 12:51, Gleb Natapov wrote:
> >> >
> >> > Agreed, but we still have the problem of how to signal from userspace.
> >> > For that do you have any other suggestion than mp_state? And if we keep
> >> > mp_state to signal from userspace, giving INIT_RECEIVED the
> >> > "wait-for-SIPI" semantics would be wrong.
> >> >
> > I don't see how we can use mp_state for signaling from userspace either.
> > Currently soft reset always resets vcpus, so it is OK for userspace to
> > generate reset vcpu state and put it into the kernel; mp_state is just
> > one of the updated states. But when INIT becomes just another signal
> > that may or may not reset the cpu, or may have other side effects like
> > #vmexit, this will no longer work. We will have to have another
> > interface for injecting INIT from userspace, and userspace soft-reset
> > will use it instead of doing the reset by itself.
>
> Setting the mp_state to INIT_RECEIVED is that interface, and it already
> works, for APs at least. This patch extends it to work for the BSP as well.
>
It does not work for APs either. If an AP has vmx on, mp_state should not
be set to INIT_RECEIVED. mp_state is a state, as you can see from its
name, and we already had a discussion on the generic device API about the
importance of separating sending commands from setting state. There is a
difference between setting mp_state during migration and signaling INIT#.

> In the corresponding userspace patch, I don't need to touch the CPU
> state at all. I can just signal the kernel. If I touch the CPU, I'll
> break the nested case, no matter how it is implemented. So far, the
> userspace did not have to worry about nested, and that's something that
> should be kept that way.
We are discussing two different things here. I'll try to separate them.
1. BSP is broken WRT #INIT
2. nested is broken WRT #INIT

You are fixing 1 with your patches. For that I proposed a much easier
solution (at least from the kernel point of view): if the vcpu is the BSP,
reset it in userspace and make it runnable. Nested virt is still broken,
but this is not what you are fixing.
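
As a rough sketch of the "make it runnable" half of that proposal, and
not code from any posted patch: KVM_SET_MP_STATE and
KVM_MP_STATE_RUNNABLE are the existing KVM API, while the helper name
bsp_make_runnable() is hypothetical. It assumes userspace has already
loaded the reset register state into the vcpu by its usual means.

```c
#include <errno.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper, for illustration only.
 * Assumption: the caller has already put the vcpu into its reset
 * register state before calling this.
 * Returns 0 on success or a negative errno value on failure.
 */
static int bsp_make_runnable(int vcpu_fd)
{
	struct kvm_mp_state mp = { .mp_state = KVM_MP_STATE_RUNNABLE };

	/* Only the state is written here; nothing is signaled. */
	if (ioctl(vcpu_fd, KVM_SET_MP_STATE, &mp) < 0)
		return -errno;
	return 0;
}
```

This only covers problem 1; it does nothing for the nested case.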

For 2 a much more involved fix is needed. Jan is fixing it, and it will
require signaling INIT# from userspace by means other than mp_state,
because signaling INIT# does not automatically mean that mp_state changes
to INIT_RECEIVED.

>
> If we move away from the INIT_RECEIVED and SIPI_RECEIVED states for
> in-kernel APIC -> VCPU communication, then the KVM_SET_MP_STATE ioctl
> will have to convert them to the right bits in the requests field or in
> the APIC state. But I'm starting to see less benefit from moving away
> from mp_state.
>
We are not moving away from mp_state; we are moving away from using
mp_state for signaling, because with nested virt INIT does not always
change mp_state. Not only that, it can change mp_state long after the
signal is received, once vmx off is done.
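
To illustrate the distinction, here is a toy model, not KVM code. All
names (toy_vcpu, toy_signal_init() and so on) are hypothetical, and it
assumes the simplest case only: INIT stays latched while vmx is on and
takes effect after vmx off. The #vmexit case is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: hypothetical names, not KVM's real structures. */
enum toy_mp_state { TOY_RUNNABLE, TOY_INIT_RECEIVED };

struct toy_vcpu {
	enum toy_mp_state mp_state;	/* state: saved/restored on migration */
	bool init_pending;		/* latched signal, not a state */
	bool vmx_on;			/* vcpu is in vmx operation */
};

/* Migration path: write the state verbatim, no side effects. */
static void toy_set_mp_state(struct toy_vcpu *v, enum toy_mp_state s)
{
	assert(v != NULL);
	v->mp_state = s;
}

/* Signal path: only record that INIT was asserted. */
static void toy_signal_init(struct toy_vcpu *v)
{
	assert(v != NULL);
	v->init_pending = true;
}

/* The vcpu decides what a pending INIT means in its current mode. */
static void toy_process_events(struct toy_vcpu *v)
{
	assert(v != NULL);
	if (!v->init_pending || v->vmx_on)
		return;		/* nothing to do, or INIT stays latched */
	v->init_pending = false;
	v->mp_state = TOY_INIT_RECEIVED;
}
```

With vmx_on set, toy_signal_init() followed by toy_process_events()
leaves mp_state untouched; the change to TOY_INIT_RECEIVED happens only
on a later toy_process_events() call after vmx_on is cleared.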

--
Gleb.