Re: [PATCH 1/2] KVM: x86: avoid simultaneous queueing of both IRQ and SMI

From: Wanpeng Li
Date: Sat Jun 04 2016 - 23:28:52 EST


2016-06-02 2:06 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>:
>
>
> On 01/06/2016 18:40, Radim KrÄmÃÅ wrote:
>> 2016-06-01 14:35+0200, Paolo Bonzini:
>>> If the processor exits to KVM while delivering an interrupt,
>>> the hypervisor then requeues the interrupt for the next vmentry.
>>> Trying to enter SMM in this same window causes to enter non-root
>>> mode in emulated SMM (i.e. with IF=0) and with a request to
>>> inject an IRQ (i.e. with a valid VM-entry interrupt info field).
>>> This is invalid guest state (SDM 26.3.1.4 "Check on Guest RIP
>>> and RFLAGS") and the processor fails vmentry.
>>>
>>> The fix is to defer the injection from KVM_REQ_SMI to KVM_REQ_EVENT,
>>> like we already do for e.g. NMIs. This patch doesn't change the
>>> name of the process_smi function so that it can be applied to
>>> stable releases. The next patch will modify the names so that
>>> process_nmi and process_smi process respectively KVM_REQ_NMI and
>>> KVM_REQ_SMI.
>>>
>>> This is especially common with Windows, probably due to the
>>> self-IPI trick that it uses to deliver deferred procedure
>>> calls (DPCs).
>>>
>>> Reported-by: Laszlo Ersek <lersek@xxxxxxxxxx>
>>> Fixes: 64d6067057d9658acb8675afcfba549abdb7fc16
>>> Cc: stable@xxxxxxxxxxxxxxx
>>> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>>> ---
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> @@ -6098,7 +6094,10 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
>>> }
>>>
>>> /* try to inject new event if pending */
>>> - if (vcpu->arch.nmi_pending && kvm_x86_ops->nmi_allowed(vcpu)) {
>>> + if (vcpu->arch.smi_pending && !is_smm(vcpu)) {
>>
>> Clearing smi_pending in kvm_vcpu_reset() would be safer now that SMI can
>> be injected without a request or RSM.
>
> Indeed.
>
>>> + --vcpu->arch.smi_pending;
>>
>> (I'd use 'vcpu->arch.smi_pending = false', to make it clearer that we
>> don't want multiple pending SMIs, unlike NMIs. smi_pending is bool,
>> so the generated code should be identical.)
>
> Right. Making the code superficially similar for SMI and NMI was nice;
> however, as discussed a while ago we could probably make nmi_pending a
> bool too.
>
>>> + process_smi(vcpu);
>>> + } else if (vcpu->arch.nmi_pending && kvm_x86_ops->nmi_allowed(vcpu)) {
>>> --vcpu->arch.nmi_pending;
>>
>>
>>> @@ -6621,8 +6631,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>
>>> kvm_load_guest_xcr0(vcpu);
>>>
>>> - if (req_immediate_exit)
>>> + if (req_immediate_exit) {
>>> + kvm_make_request(KVM_REQ_EVENT, vcpu);
>>> smp_send_reschedule(vcpu->cpu);
>>
>> (Is this a fix for non-smi cases too?)
>
> No, I don't think so, the existing req_immediate_exit case is only after
> a VMLAUNCH/VMRESUME vmexit, in which case we already have a
>
> if (vmx->nested.nested_run_pending)
> kvm_make_request(KVM_REQ_EVENT, vcpu);
>
> in vmx_vcpu_run.

Do you think this can be removed since it blindly request a
KVM_REQ_EVENT even if there is no still-pending event to L1 which
blocked by nested_run_pending, however, req_immediate_exit can
indicate there is a pending event blocked by nested_run_pending and
the request KVM_REQUEST_EVENT added in your patch can guarantee inject
this pending event in the next nested vmexit.

Regards,
Wanpeng Li