Re: [PATCH v3 05/57] KVM: x86: Account for KVM-reserved CR4 bits when passing through CR4 on VMX
From: Chao Gao
Date: Thu Dec 12 2024 - 20:30:54 EST

On Wed, Nov 27, 2024 at 05:33:32PM -0800, Sean Christopherson wrote:
>Drop x86.c's local pre-computed cr4_reserved bits and instead fold KVM's
>reserved bits into the guest's reserved bits. This fixes a bug where VMX's
>set_cr4_guest_host_mask() fails to account for KVM-reserved bits when
>deciding which bits can be passed through to the guest. In most cases,
>letting the guest directly write reserved CR4 bits is ok, i.e. attempting
>to set the bit(s) will still #GP, but not if a feature is available in
>hardware but explicitly disabled by the host, e.g. if FSGSBASE support is
>disabled via "nofsgsbase".
>
>Note, the extra overhead of computing host reserved bits every time
>userspace sets guest CPUID is negligible. The feature bits that are
>queried are packed nicely into a handful of words, and so checking and
>setting each reserved bit costs roughly 5 cycles, i.e. the
>total cost will be in the noise even if the number of checked CR4 bits
>doubles over the next few years. In other words, x86 will run out of CR4
>bits long before the overhead becomes problematic.
>
>Note #2, __cr4_reserved_bits() starts from CR4_RESERVED_BITS, which is
>why the existing __kvm_cpu_cap_has() processing doesn't explicitly OR in
>CR4_RESERVED_BITS (and why the new code doesn't do so either).
>
>Fixes: 2ed41aa631fc ("KVM: VMX: Intercept guest reserved CR4 bits to inject #GP fault")
>Reviewed-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
>Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>

Reviewed-by: Chao Gao <chao.gao@xxxxxxxxx>
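
For readers following along, the folding described in the changelog can be sketched as a standalone toy model. This is not the KVM code: the toy_* names, the two-feature capability struct, and the reduced set of candidate pass-through bits are assumptions made up for illustration. Only the CR4 bit positions are architectural. The idea shown is that the guest's reserved mask is the union of what guest CPUID lacks and what KVM itself does not support, and that the pass-through mask is derived from that union.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR4 bit positions. */
#define CR4_TSD       (1ull << 2)
#define CR4_PGE       (1ull << 7)
#define CR4_FSGSBASE  (1ull << 16)
#define CR4_SMEP      (1ull << 20)

/* Hypothetical, reduced stand-ins for the kernel's masks. */
#define TOY_KNOWN_BITS          (CR4_TSD | CR4_PGE | CR4_FSGSBASE | CR4_SMEP)
#define TOY_BASE_RESERVED       (~TOY_KNOWN_BITS)
#define TOY_POSSIBLE_GUEST_BITS (CR4_TSD | CR4_PGE | CR4_FSGSBASE)

/* Hypothetical capability set: one flag per feature-gated CR4 bit. */
struct toy_caps {
	bool fsgsbase;
	bool smep;
};

/* Start from the always-reserved bits, then reserve each missing feature. */
static uint64_t toy_cr4_reserved_bits(struct toy_caps caps)
{
	uint64_t rsvd = TOY_BASE_RESERVED;

	if (!caps.fsgsbase)
		rsvd |= CR4_FSGSBASE;
	if (!caps.smep)
		rsvd |= CR4_SMEP;
	return rsvd;
}

/* Fold the host-side (KVM) reserved bits into the guest's reserved bits. */
static uint64_t toy_guest_reserved_bits(struct toy_caps guest_cpuid,
					struct toy_caps kvm_caps)
{
	return toy_cr4_reserved_bits(guest_cpuid) |
	       toy_cr4_reserved_bits(kvm_caps);
}

/* Only candidate bits that are not reserved may be guest-owned. */
static uint64_t toy_guest_owned_bits(uint64_t guest_rsvd)
{
	return TOY_POSSIBLE_GUEST_BITS & ~guest_rsvd;
}
```

With guest CPUID advertising FSGSBASE but the KVM-side caps lacking it (the "nofsgsbase" case), the folded mask keeps CR4.FSGSBASE out of the guest-owned set, so a guest write is intercepted and can #GP. Deriving the mask from guest CPUID alone would leave the bit guest-owned.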