Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

From: Paolo Bonzini
Date: Tue Jun 18 2013 - 10:13:15 EST


On 09/06/2013 14:27, Gleb Natapov wrote:
> On Sun, Jun 09, 2013 at 08:17:19PM +0800, Xiao Guangrong wrote:
>> On 06/09/2013 07:56 PM, Gleb Natapov wrote:
>>> On Sun, Jun 09, 2013 at 07:44:03PM +0800, Xiao Guangrong wrote:
>>>> On 06/09/2013 07:36 PM, Gleb Natapov wrote:
>>>>> On Sun, Jun 09, 2013 at 07:25:17PM +0800, Xiao Guangrong wrote:
>>>>>> On 06/09/2013 06:19 PM, Gleb Natapov wrote:
>>>>>>> On Sun, Jun 09, 2013 at 06:01:45PM +0800, Xiao Guangrong wrote:
>>>>>>>> On 06/09/2013 05:39 PM, Gleb Natapov wrote:
>>>>>>>>> On Sun, Jun 09, 2013 at 05:29:37PM +0800, Xiao Guangrong wrote:
>>>>>>>>>> On 06/09/2013 04:45 PM, Gleb Natapov wrote:
>>>>>>>>>>
>>>>>>>>>>> +static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
>>>>>>>>>>> +{
>>>>>>>>>>> + struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
>>>>>>>>>>> + return kvm_exec_with_stopped_vcpu(vcpu->kvm,
>>>>>>>>>>> + emulator_fix_hypercall_cb, ctxt);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +
>>>>>>>>>>> /*
>>>>>>>>>>> * Check if userspace requested an interrupt window, and that the
>>>>>>>>>>> * interrupt window is open.
>>>>>>>>>>> @@ -5761,6 +5769,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>>>>>>>>> kvm_deliver_pmi(vcpu);
>>>>>>>>>>> if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
>>>>>>>>>>> vcpu_scan_ioapic(vcpu);
>>>>>>>>>>> + if (kvm_check_request(KVM_REQ_STOP_VCPU, vcpu)){
>>>>>>>>>>> + mutex_lock(&vcpu->kvm->lock);
>>>>>>>>>>> + mutex_unlock(&vcpu->kvm->lock);
>>>>>>>>>>
>>>>>>>>>> We should execute a serializing instruction here?
>>>>>>>>>>
>>>>>>>>>>> --- a/virt/kvm/kvm_main.c
>>>>>>>>>>> +++ b/virt/kvm/kvm_main.c
>>>>>>>>>>> @@ -222,6 +222,18 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
>>>>>>>>>>> make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> +int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data)
>>>>>>>>>>> +{
>>>>>>>>>>> + int r;
>>>>>>>>>>> +
>>>>>>>>>>> + mutex_lock(&kvm->lock);
>>>>>>>>>>> + make_all_cpus_request(kvm, KVM_REQ_STOP_VCPU);
>>>>>>>>>>> + r = cb(data);
>>>>>>>>>>
>>>>>>>>>> And here?
>>>>>>>>> Since the serialising instruction the SDM suggests using is CPUID, I
>>>>>>>>> think the point here is to flush the CPU pipeline. Since all vcpus are out
>>>>>>>>> of guest mode, I think out-of-order execution of the modified instruction
>>>>>>>>> is not an issue here.
>>>>>>>>
>>>>>>>> I checked the SDM, and it does not say that VMLAUNCH/VMRESUME are
>>>>>>>> serializing instructions, either in the VM-entry description or in the
>>>>>>>> instruction reference; instead it says the VMX-related serializing
>>>>>>>> instructions are INVEPT and INVVPID.
>>>>>>>>
>>>>>>>> So I guess an explicit serializing instruction is needed here.
>>>>>>>>
>>>>>>> Again, the question is: what for? The SDM says:
>>>>>>>
>>>>>>> The Intel 64 and IA-32 architectures define several serializing
>>>>>>> instructions. These instructions force the processor to complete all
>>>>>>> modifications to flags, registers, and memory by previous instructions
>>>>>>> and to drain all buffered writes to memory before the next instruction
>>>>>>> is fetched and executed.
>>>>>>>
>>>>>>> So flag and register modifications on the host are obviously irrelevant for a guest.
>>>>>>
>>>>>> Okay. Hmm... but what guarantees the "drain all buffered writes to memory" part?
>>>>> A memory barrier should guarantee that, as I said below.
>>>>>
>>>>>>
>>>>>>> And for memory ordering we have smp_mb() on guest entry.
>>>>>>
>>>>>> If I understand the SDM correctly, memory-ordering instructions cannot drain
>>>>>> the instruction buffer; they only drain the "data memory subsystem":
>>>>> What is "instruction buffer"?
>>>>
>>>> I mean "Instruction Cache" (icache). Can memory ordering drain icache?
>>>> The "data memory subsystem" confused me, does it mean dcache?
>>>>
>>> I think it means all caches.
>>> 11.6 says:
>>>
>>> A write to a memory location in a code segment that is currently
>>> cached in the processor causes the associated cache line (or lines)
>>> to be invalidated. This check is based on the physical address of
>>> the instruction. In addition, the P6 family and Pentium processors
>>> check whether a write to a code segment may modify an instruction that
>>> has been prefetched for execution. If the write affects a prefetched
>>> instruction, the prefetch queue is invalidated. This latter check is
>>> based on the linear address of the instruction. For the Pentium 4 and
>>> Intel Xeon processors, a write or a snoop of an instruction in a code
>>> segment, where the target instruction is already decoded and resident in
>>> the trace cache, invalidates the entire trace cache. The latter behavior
>>> means that programs that self-modify code can cause severe degradation
>>> of performance when run on the Pentium 4 and Intel Xeon processors.
>>>
>>> So an icache line is invalidated based on physical address, so we are OK.
>>
>> Yes.
>>
>>> Prefetched instructions are invalidated based on linear address, but if
>>> all vcpus are in the host, guest instructions cannot be prefetched.
>>
>> But what happens if the instruction was prefetched before the vcpu exited
>> to the host? Then, after returning to the guest, it would execute the old instruction.
>>
>> Can it happen?
> I do not think so; prefetched instructions are not a cache, but I'll ask
> Intel.

Any news?

Anyway, if this were the case (which seems strange, but you never know),
CPUID would not help. The hypothetical guest prefetch queue would not
be flushed, and you'd need INVEPT/INVVPID as Xiao mentioned upthread.

Paolo
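
[Editorial sketch, not part of the posted patch.] The rendezvous that the
quoted diff builds from kvm->lock and KVM_REQ_STOP_VCPU can be modelled in
userspace with pthreads and C11 atomics. Everything named below (fake_vcpu,
vm_lock, exec_with_stopped_vcpus, run_demo, the two-halves "instruction") is
a hypothetical stand-in chosen for illustration; none of it is KVM code. The
sketch only shows the software ordering: the patcher holds the lock, posts a
stop request, waits until no thread is in "guest mode", patches, then
unlocks. It says nothing about whether a real CPU could keep a stale
prefetched guest instruction across a VM exit, which is the question left
open in this thread.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_VCPUS 2                 /* hypothetical number of simulated vcpus */

struct fake_vcpu {
    atomic_bool stop_req;          /* stands in for KVM_REQ_STOP_VCPU */
    atomic_bool in_guest;          /* true while the thread "runs guest code" */
    long mismatches;               /* inconsistent reads seen by this thread */
};

static pthread_mutex_t vm_lock = PTHREAD_MUTEX_INITIALIZER; /* like kvm->lock */
static struct fake_vcpu vcpus[NR_VCPUS];
static atomic_bool shutting_down;
static long insn_lo, insn_hi;      /* two halves of the "instruction" */

static void *vcpu_thread(void *arg)
{
    struct fake_vcpu *v = arg;

    while (!atomic_load(&shutting_down)) {
        /* Announce guest mode first, then re-check for a pending request. */
        atomic_store(&v->in_guest, true);
        if (atomic_exchange(&v->stop_req, false)) {
            atomic_store(&v->in_guest, false);
            /* Rendezvous: block until the patcher has dropped the lock. */
            pthread_mutex_lock(&vm_lock);
            pthread_mutex_unlock(&vm_lock);
            continue;
        }
        /* "Guest" work: the two halves must never be seen out of step. */
        if (insn_lo != insn_hi)
            v->mismatches++;
        atomic_store(&v->in_guest, false);
    }
    return NULL;
}

/* Run cb() while no simulated vcpu is in guest mode. */
static void exec_with_stopped_vcpus(void (*cb)(void))
{
    pthread_mutex_lock(&vm_lock);
    for (int i = 0; i < NR_VCPUS; i++)
        atomic_store(&vcpus[i].stop_req, true);
    for (int i = 0; i < NR_VCPUS; i++)
        while (atomic_load(&vcpus[i].in_guest))
            sched_yield();         /* wait for the thread to leave guest mode */
    cb();
    pthread_mutex_unlock(&vm_lock);
}

static void patch_cb(void)
{
    insn_lo++;                     /* deliberately a two-step update */
    insn_hi++;
}

/* Returns the number of inconsistent reads observed, or -1 on setup failure. */
long run_demo(int rounds)
{
    pthread_t tid[NR_VCPUS];
    long total = 0;
    int started = 0;

    atomic_store(&shutting_down, false);
    insn_lo = insn_hi = 0;
    for (int i = 0; i < NR_VCPUS; i++) {
        atomic_store(&vcpus[i].stop_req, false);
        atomic_store(&vcpus[i].in_guest, false);
        vcpus[i].mismatches = 0;
    }
    for (; started < NR_VCPUS; started++)
        if (pthread_create(&tid[started], NULL, vcpu_thread, &vcpus[started]))
            break;

    for (int r = 0; r < rounds; r++)
        exec_with_stopped_vcpus(patch_cb);

    atomic_store(&shutting_down, true);
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        total += vcpus[i].mismatches;
    }
    assert(insn_lo == rounds && insn_hi == rounds);
    return started == NR_VCPUS ? total : -1;
}
```

Calling run_demo(1000) is expected to report zero inconsistent reads. Note
the order inside vcpu_thread: it sets in_guest before checking stop_req, so
either the thread sees the request and backs off, or the patcher sees
in_guest set and waits. Checking the request first and only then entering
"guest mode" would leave a window in which both sides proceed.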
