Re: [PATCH] KVM: x86: Fix potential memory access error

From: Sean Christopherson
Date: Wed Mar 31 2021 - 14:08:08 EST


On Wed, Mar 31, 2021, Yang Li wrote:
> Using __set_bit() to set a bit in an integer is not a good idea, since
> the function expects an unsigned long as argument, which can be 64bit wide.
> Coverity reports this problem as
>
> High:Out-of-bounds access(INCOMPATIBLE_CAST)
> CWE119: Out-of-bounds access to a scalar
> Pointer "&vcpu->arch.regs_avail" points to an object whose effective
> type is "unsigned int" (32 bits, unsigned) but is dereferenced as a
> wider "unsigned long" (64 bits, unsigned). This may lead to memory
> corruption.
>
> /home/heyuan.shy/git-repo/linux/arch/x86/kvm/kvm_cache_regs.h:
> kvm_register_is_available
>
> Just use BIT instead.

Meh, we're hosed either way. Using BIT() will either result in undefined
behavior due to SHL shifting beyond the size of a u64, or setting random bits
if the truncated shift ends up being less than 63.

I suppose one could argue that undefined behavior is better than memory
corruption, but KVM is very broken if 'reg' is out-of-bounds so IMO it's not
worth changing. There are only two call sites that don't use a hardcoded value,
and both are guarded by WARN. kvm_register_write() bails without calling
kvm_register_mark_dirty(), so that's guaranteed safe. vmx_cache_reg() WARNs
after kvm_register_mark_available(), but except for kvm_register_read(), all
calls to vmx_cache_reg() use a hardcoded value, and kvm_register_read() also
WARNs and bails.

Note, all of the above holds true for kvm_register_is_{available,dirty}(), too.

So in the current code, it's impossible for this to be a problem. Theoretically
future code could introduce bugs, but IMO we should never accept code that uses
a non-hardcoded 'reg' and doesn't pre-validate.

The number of uops is basically a wash because "BTS <reg>, <mem>" is fairly
expensive; depending on the uarch, the difference is ~1-2 uops in favor of BIT().
On the flip side, __set_bit() shaves 8 bytes. Of course, none these flows are
anywhere near that senstive.

TL;DR: I'm not opposed to using BIT(), I just don't see the point.


__set_bit():
0x00000000000104e6 <+6>: mov %esi,%eax
0x00000000000104e8 <+8>: mov %rdi,%rbp
0x00000000000104eb <+11>: sub $0x8,%rsp
0x00000000000104ef <+15>: bts %rax,0x2a0(%rdi)

|= BIT():
0x0000000000010556 <+6>: mov %esi,%ecx
0x0000000000010558 <+8>: mov $0x1,%eax
0x000000000001055d <+13>: mov %rdi,%rbp
0x0000000000010560 <+16>: sub $0x8,%rsp
0x0000000000010564 <+20>: shl %cl,%rax
0x0000000000010567 <+23>: or %eax,0x2a0(%rdi)

> Reported-by: Abaci Robot <abaci@xxxxxxxxxxxxxxxxx>
> Signed-off-by: Yang Li <yang.lee@xxxxxxxxxxxxxxxxx>
> ---
> arch/x86/kvm/kvm_cache_regs.h | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
> index 2e11da2..cfa45d88 100644
> --- a/arch/x86/kvm/kvm_cache_regs.h
> +++ b/arch/x86/kvm/kvm_cache_regs.h
> @@ -52,14 +52,14 @@ static inline bool kvm_register_is_dirty(struct kvm_vcpu *vcpu,
> static inline void kvm_register_mark_available(struct kvm_vcpu *vcpu,
> enum kvm_reg reg)
> {
> - __set_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
> + vcpu->arch.regs_avail |= BIT(reg);
> }
>
> static inline void kvm_register_mark_dirty(struct kvm_vcpu *vcpu,
> enum kvm_reg reg)
> {
> - __set_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
> - __set_bit(reg, (unsigned long *)&vcpu->arch.regs_dirty);
> + vcpu->arch.regs_avail |= BIT(reg);
> + vcpu->arch.regs_dirty |= BIT(reg);
> }
>
> static inline unsigned long kvm_register_read(struct kvm_vcpu *vcpu, int reg)
> --
> 1.8.3.1
>