RE: [PATCH v3 1/6] KVM: nVMX: Use hardware MSR bitmap

From: Zhang, Yang Z
Date: Wed Jan 28 2015 - 16:15:32 EST


Wincy Van wrote on 2015-01-28:
> On Wed, Jan 28, 2015 at 7:52 PM, Zhang, Yang Z <yang.z.zhang@xxxxxxxxx>
> wrote:
>>>>
>>>
>>> If L0 wants to intercept a msr, we should set
>>> vmx_msr_bitmap_legacy(_x2apic) and vmx_msr_bitmap_longmode(_x2apic),
>>> and that bitmaps should only be loaded in non-nested entry. Currently
>>> we only clear the corresponding bits if L1 enables virtualize x2apic
>>> mode, all the other bits are all set. On nested entry, we load
>>> nested_msr_bitmap, on nested vmexit, we will restore L0's bitmap. In
>>> the nested entry, L0 only care about L2's msr access, not L1's. I
>>> think that in nested entry, nested_msr_bitmap is "vmcs01->msr_bitmap"
>>> as your mentioned above.
>>
>> Mmm... I think if the bit is setting in vmcs01->msr_bitmap means
>> whenever
> the VCPU(no matter in nested guest mode or not) accesses this msr
> should cause vmexit, not only L1. That's why need to construct the
> vmcs02->msr_bitmap based on (vmcs01->msr_bitmap ORed
> vmcs12->msr_bitmap).
>>
>
> You are right, but this is not fit for all the cases, we should custom
> the nested_msr_bitmap.
> e.g. Currently L0 wants to intercept some of the x2apic msrs' reading:
> if (enable_apicv) {
> for (msr = 0x800; msr <= 0x8ff; msr++)
> vmx_disable_intercept_msr_read_x2apic(msr);
> /* According SDM, in x2apic mode, the whole id reg is used.
> * But in KVM, it only use the highest eight bits. Need to
> * intercept it */
> vmx_enable_intercept_msr_read_x2apic(0x802); /* TMCCT */
> vmx_enable_intercept_msr_read_x2apic(0x839); /* TPR */
> vmx_disable_intercept_msr_write_x2apic(0x808); /* EOI */
> vmx_disable_intercept_msr_write_x2apic(0x80b); /*
> SELF-IPI */
> vmx_disable_intercept_msr_write_x2apic(0x83f);
> }
> But L1 may not want this. So I think we are better to deal with the

Actually, from L0's point, it is totally unaware of the L2. The only thing L0 aware is that the CPU should follow L0's configuration when VCPU is running. So if L0 wants to trap a msr, then the read operation to this msr should cause vmexit unconditionally no matter who is running(who means L1, L2, L3.....).

> msrs seperately, there is not a common way suit for all the cases. If
> other features want to intercept a msr in nested entry, they can put
> the custom code in nested_vmx_merge_msr_bitmap.

Yes, if other features want to do it in 'nested' entry, they can fill nested_vmx_merge_msr_bitmap. But if in non-nested case, it should be our responsibly to handle it correctly, how about add following check:

if (type & MSR_TYPE_R && !test_bit(msr, vmcs01_msr_bitmap) && !test_bit(msr, msr_bitmap_l1 + 0x000 / f))
__clear_bit(msr, msr_bitmap_nested + 0x000 / f);


>
>
> Thanks,
> Wincy


Best regards,
Yang