Re: [PATCH] KVM: x86: Use "checked" versions of get_user() and put_user()

From: Vitaly Kuznetsov

Date: Fri Nov 07 2025 - 08:10:00 EST


Sean Christopherson <seanjc@xxxxxxxxxx> writes:

> Use the normal, checked versions for get_user() and put_user() instead of
> the double-underscore versions that omit range checks, as the checked
> versions are actually measurably faster on modern CPUs (12%+ on Intel,
> 25%+ on AMD).
>
> The performance hit on the unchecked versions is almost entirely due to
> the added LFENCE on CPUs where LFENCE is serializing (which is effectively
> all modern CPUs), which was added by commit 304ec1b05031 ("x86/uaccess:
> Use __uaccess_begin_nospec() and uaccess_try_nospec"). The small
> optimizations done by commit b19b74bc99b1 ("x86/mm: Rework address range
> check in get_user() and put_user()") likely shave a few cycles off, but
> the bulk of the extra latency comes from the LFENCE.
>
> Don't bother trying to open-code an equivalent for performance reasons, as
> the loss of inlining (e.g. see commit ea6f043fc984 ("x86: Make __get_user()
> generate an out-of-line call")) is largely a non-factor (ignoring setups
> where RET is something entirely different).
>
> As measured across tens of millions of calls of guest PTE reads in
> FNAME(walk_addr_generic):
>
>              __get_user()  get_user()  open-coded  open-coded, no LFENCE
> Intel (EMR)      75.1         67.6        75.3             65.5
> AMD (Turin)      68.1         51.1        67.5             49.3
>
> Reported-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Closes: https://lore.kernel.org/all/CAHk-=wimh_3jM9Xe8Zx0rpuf8CPDu6DkRCGb44azk0Sz5yqSnw@xxxxxxxxxxxxxx
> Cc: Borislav Petkov <bp@xxxxxxxxx>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> arch/x86/kvm/hyperv.c | 2 +-
> arch/x86/kvm/mmu/paging_tmpl.h | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 38595ecb990d..de92292eb1f5 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -1568,7 +1568,7 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
> * only, there can be valuable data in the rest which needs
> * to be preserved e.g. on migration.
> */
> - if (__put_user(0, (u32 __user *)addr))

Did some history digging on this one; apparently it appeared with

commit 8b0cedff040b652f3d36b1368778667581b0c140
Author: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxxxxx>
Date: Sun May 15 23:22:04 2011 +0800

KVM: use __copy_to_user/__clear_user to write guest page

and the justification was:

Simply use __copy_to_user/__clear_user to write guest page since we have
already verified the user address when the memslot is set

Unlike FNAME(walk_addr_generic), I don't believe kvm_hv_set_msr() is
actually performance critical; normally behaving guests/userspaces
should never be writing extensively to
HV_X64_MSR_VP_ASSIST_PAGE. I.e. we can probably ignore the performance
aspect of this change completely.

> + if (put_user(0, (u32 __user *)addr))
> return 1;
> hv_vcpu->hv_vapic = data;
> kvm_vcpu_mark_page_dirty(vcpu, gfn);
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index ed762bb4b007..901cd2bd40b8 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -402,7 +402,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
> goto error;
>
> ptep_user = (pt_element_t __user *)((void *)host_addr + offset);
> - if (unlikely(__get_user(pte, ptep_user)))
> + if (unlikely(get_user(pte, ptep_user)))
> goto error;
> walker->ptep_user[walker->level - 1] = ptep_user;
>
>
> base-commit: a996dd2a5e1ec54dcf7d7b93915ea3f97e14e68a

--
Vitaly