Re: [PATCH v2] KVM: VMX: Enable Notify VM exit
From: Jim Mattson
Date: Thu Jun 03 2021 - 09:36:06 EST
On Wed, Jun 2, 2021 at 6:25 PM Xiaoyao Li <xiaoyao.li@xxxxxxxxx> wrote:
> On 6/2/2021 6:31 PM, Vitaly Kuznetsov wrote:
> > Tao Xu <tao3.xu@xxxxxxxxx> writes:
> >> A malicious virtual machine can get a CPU stuck with no event windows
> >> opening up, e.g., an infinite loop in microcode on nested #AC
> >> (CVE-2015-5307). No event window means no events: NMIs, SMIs, and IRQs
> >> are all blocked, so the affected hardware CPU may become unusable by the
> >> host or by other VMs.
> >> To resolve those cases, enable a notify VM exit that triggers if no event
> >> window occurs in VMX non-root mode for a specified amount of time
> >> (the notify window). Once the CPU first detects the risk of no forward
> >> progress, a Notify VM exit happens after the notify window has elapsed,
> >> measured in units of the crystal clock. A Notify VM exit can happen
> >> incident to delivery of a vectored event.
> >> Expose a module param for configuring the notify window, which is in
> >> units of crystal clock cycles.
> >> - A negative value (e.g. -1) disables the feature.
> >> - The default is 0. This is safe because an internal threshold is added
> >> to the notify window to ensure all normal instructions are covered.
> >> - Users can set a larger value to allow more cycles, e.g., if silicon
> >> wrongly kills some normal instruction because the internal threshold
> >> is too small.
> >> Notify VM exit is defined in latest Intel Architecture Instruction Set
> >> Extensions Programming Reference, chapter 9.2.
> >> Co-developed-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
> >> Signed-off-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
> >> Signed-off-by: Tao Xu <tao3.xu@xxxxxxxxx>
> >> ---
> >> Changelog:
> >> v2:
> >> Set the default notify window to 0; a value less than 0 disables it.
> >> Add more description in commit message.
> > Sorry if this was already discussed, but in case of nested
> > virtualization and when L1 also enables
> > SECONDARY_EXEC_NOTIFY_VM_EXITING, shouldn't we just reflect NOTIFY exits
> > during L2 execution to L1 instead of crashing the whole L1?
> yes. If we expose it to nested, it should reflect the Notify VM exit to
> L1 when L1 enables it.
> But regarding nested, there are more things that need to be discussed, e.g.:
> 1) There is a dependency between L0 and L1, for security reasons. When
> L0 enables it, it shouldn't be turned off while the L2 VM is running.
> a. Don't expose it to L1, but enable it for L1 when the L2 VM is running.
> b. Expose it to L1 and force it enabled.
> 2) When exposing it to L1, vmcs02.notify_window needs to be
> min(L0.notify_window, L1.notify_window)
I don't think this can be a simple 'min', since L1's clock may run at
a different frequency from L0's clock.
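
To illustrate the concern, here is a rough sketch (not from the patch) of
what a frequency-aware merge might look like. The helper names and the
l1_hz/l0_hz parameters are hypothetical; the sketch assumes both crystal
clock frequencies are known, non-zero, and below 2^32 Hz, and it ignores
the internal threshold that hardware adds to the window.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a notify window expressed in L1 crystal
 * clock ticks into L0 crystal clock ticks. Rounds up so that L1 never
 * gets a shorter window than it asked for, and saturates at UINT32_MAX.
 * Assumes l1_hz != 0 and l0_hz < 2^32 so the product fits in 64 bits.
 */
static uint32_t scale_notify_window(uint32_t l1_window,
                                    uint64_t l1_hz, uint64_t l0_hz)
{
    uint64_t scaled = ((uint64_t)l1_window * l0_hz + l1_hz - 1) / l1_hz;

    return scaled > UINT32_MAX ? UINT32_MAX : (uint32_t)scaled;
}

/*
 * Hypothetical merge for vmcs02: take the minimum only after both
 * windows are expressed in the same (L0) clock units.
 */
static uint32_t vmcs02_notify_window(uint32_t l0_window, uint32_t l1_window,
                                     uint64_t l1_hz, uint64_t l0_hz)
{
    uint32_t l1_in_l0 = scale_notify_window(l1_window, l1_hz, l0_hz);

    return l1_in_l0 < l0_window ? l1_in_l0 : l0_window;
}
```

With a 25 MHz L1 clock and a 24 MHz L0 clock, an L1 window of 1000 ticks
corresponds to 960 L0 ticks, so a plain min() against an L0 window of 1000
would pick the wrong value.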