Re: [PATCH] kvm: x86: disable KVM_FAST_MMIO_BUS

From: Paolo Bonzini
Date: Wed Aug 16 2017 - 17:25:47 EST


On 16/08/2017 21:59, Michael S. Tsirkin wrote:
> On Wed, Aug 16, 2017 at 09:03:17PM +0200, Radim Krčmář wrote:
>> 2017-08-16 19:19+0200, Paolo Bonzini:
>>> On 16/08/2017 18:50, Michael S. Tsirkin wrote:
>>>> On Wed, Aug 16, 2017 at 03:30:31PM +0200, Paolo Bonzini wrote:
>>>>> While you can filter out instruction fetches, that's not enough. A data
>>>>> read could happen because someone pointed the IDT to MMIO area, and who
>>>>> knows what the VM-exit instruction length points to in that case.
>>>>
>>>> Thinking more about it, I don't really see how anything
>>>> a legal guest might be doing with virtio would trigger anything
>>>> but a fault after decoding the instruction. How does
>>>> skipping instruction even make sense in the example you give?
>>>
>>> There's no such thing as a legal guest. Anything that the hypervisor
>>> does, that differs from real hardware, is a possible escalation path.
>>>
>>> This in fact makes me doubt the EMULTYPE_SKIP patch too.
>>
>> The main hack is that we expect EPT misconfig within a given range to be
>> a MMIO NULL write. I think it is fine -- EMULTYPE_SKIP is a common path
>> that should have well tested error paths and, IIUC, virtio doesn't allow
>> any other access, so it is a problem of the guest if a buggy/malicious
>> application can access virtio memory.

Yes, I agree. EMULTYPE_SKIP is fine because failed decoding still
causes an exception to be injected. Maybe it's better to gate the
EMULTYPE_SKIP emulation on the exit qualification saying this is a write
and also not a page table walk---just in case.
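
A minimal sketch of such a gate, for illustration only and not taken from
any posted patch. The function name and macro names are hypothetical; the
bit positions follow the exit qualification layout the SDM documents for
EPT violations (bit 1 = data write, bit 7 = guest linear address valid,
bit 8 = access was to the translation of that linear address rather than
to a guest paging-structure entry). Whether the qualification is
meaningful for the particular exit being handled is an assumption that
would need checking against the SDM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions as documented for EPT violations. */
#define EPT_QUAL_DATA_WRITE      (1ULL << 1) /* access was a data write */
#define EPT_QUAL_GLA_VALID       (1ULL << 7) /* guest linear address valid */
#define EPT_QUAL_GLA_TRANSLATED  (1ULL << 8) /* final translation, not a page walk */

/*
 * Return true only if the exit looks like a plain data write that did
 * not originate from a guest page table walk. Anything else should fall
 * back to full instruction emulation instead of the skip fast path.
 */
static bool fast_mmio_skip_allowed(uint64_t exit_qual)
{
	if (!(exit_qual & EPT_QUAL_DATA_WRITE))
		return false;	/* read or instruction fetch */
	if (!(exit_qual & EPT_QUAL_GLA_VALID))
		return false;	/* no linear address, cannot tell */
	if (!(exit_qual & EPT_QUAL_GLA_TRANSLATED))
		return false;	/* access came from a page table walk */
	return true;
}
```

A caller would take the EMULTYPE_SKIP path only when this returns true.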

>>>> how about we blacklist nested virt for this optimization?
>>
>> Not every hypervisor can be easily detected ...
>
> Hypervisors that don't set a hypervisor bit in CPUID are violating the
> spec themselves, aren't they? Anyway, we can add a management option
> for use in a nested scenario.

No, the hypervisor bit only says that CPUID leaf 0x40000000 is defined.
See for example
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458:
"Intel and AMD have also reserved CPUID leaves 0x40000000 - 0x400000FF
for software use. Hypervisors can use these leaves to provide an
interface to pass information from the hypervisor to the guest operating
system running inside a virtual machine. The hypervisor bit indicates
the presence of a hypervisor and that it is safe to test these
additional software leaves".
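
To make the distinction concrete, here is a rough sketch of the probing
order the quote describes: test the hypervisor bit (CPUID leaf 1, ECX
bit 31) first, and only then interpret leaf 0x40000000, whose EBX, ECX
and EDX hold a 12-byte vendor signature. The helper names are
hypothetical, and the helpers take register values as arguments so that
the sketch does not depend on any particular CPUID intrinsic.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_1_ECX_HYPERVISOR  (1u << 31)	/* CPUID.1:ECX bit 31 */

/* True if the value of CPUID.1:ECX has the hypervisor bit set. */
static bool hypervisor_bit_set(uint32_t leaf1_ecx)
{
	return (leaf1_ecx & CPUID_1_ECX_HYPERVISOR) != 0;
}

/*
 * Decode the vendor signature from the EBX, ECX and EDX values returned
 * by CPUID leaf 0x40000000. Bytes are extracted with shifts, lowest
 * byte first, so the result does not depend on host endianness.
 * out must have room for 12 characters plus the terminating NUL.
 */
static void hypervisor_signature(uint32_t ebx, uint32_t ecx, uint32_t edx,
				 char out[13])
{
	const uint32_t regs[3] = { ebx, ecx, edx };
	int r, b;

	for (r = 0; r < 3; r++)
		for (b = 0; b < 4; b++)
			out[4 * r + b] = (char)((regs[r] >> (8 * b)) & 0xff);
	out[12] = '\0';
}
```

The bit says nothing about which hypervisor is present; that information,
if any, comes from the signature and the leaves that follow it.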

>> KVM uses standard features and SDM clearly says that the
>> instruction length field is undefined.
>
> True. Let's see whether Intel can commit to a stronger definition.
> I don't think there's any rush to make this change.

I disagree. Relying on undefined processor features is a bad idea.

> It's just that this has been there for 3 years and people have built a
> product around this.

Around 700 clock cycles?

Paolo