[PATCH 2/2] KVM: x86/mmu: Simplify code for aging SPTEs in TDP MMU

From: Sean Christopherson
Date: Tue Mar 30 2021 - 20:50:41 EST


Use a basic NOT+AND sequence to clear the Accessed bit in TDP MMU SPTEs,
as opposed to the fancy ffs()+clear_bit() logic that was copied from the
legacy MMU. The legacy MMU uses clear_bit() because it is operating on
the SPTE itself, i.e. clearing needs to be atomic. The TDP MMU operates
on a local variable that it later writes to the SPTE, and so doesn't need
to be atomic or even resident in memory.

Opportunistically drop unnecessary initialization of new_spte, it's
guaranteed to be written before being accessed.

Using NOT+AND instead of ffs()+clear_bit() reduces the sequence from:

0x0000000000058be6 <+134>: test %rax,%rax
0x0000000000058be9 <+137>: je 0x58bf4 <age_gfn_range+148>
0x0000000000058beb <+139>: test %rax,%rdi
0x0000000000058bee <+142>: je 0x58cdc <age_gfn_range+380>
0x0000000000058bf4 <+148>: mov %rdi,0x8(%rsp)
0x0000000000058bf9 <+153>: mov $0xffffffff,%edx
0x0000000000058bfe <+158>: bsf %eax,%edx
0x0000000000058c01 <+161>: movslq %edx,%rdx
0x0000000000058c04 <+164>: lock btr %rdx,0x8(%rsp)
0x0000000000058c0b <+171>: mov 0x8(%rsp),%r15

to:

0x0000000000058bdd <+125>: test %rax,%rax
0x0000000000058be0 <+128>: je 0x58beb <age_gfn_range+139>
0x0000000000058be2 <+130>: test %rax,%r8
0x0000000000058be5 <+133>: je 0x58cc0 <age_gfn_range+352>
0x0000000000058beb <+139>: not %rax
0x0000000000058bee <+142>: and %r8,%rax
0x0000000000058bf1 <+145>: mov %rax,%r15

thus eliminating several memory accesses, including a locked access.

Cc: Ben Gardon <bgardon@xxxxxxxxxx>
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
arch/x86/kvm/mmu/tdp_mmu.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 724088bea4b0..161b77925a19 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -951,7 +951,7 @@ static int age_gfn_range(struct kvm *kvm, struct kvm_memory_slot *slot,
{
struct tdp_iter iter;
int young = 0;
- u64 new_spte = 0;
+ u64 new_spte;

rcu_read_lock();

@@ -966,8 +966,7 @@ static int age_gfn_range(struct kvm *kvm, struct kvm_memory_slot *slot,
new_spte = iter.old_spte;

if (spte_ad_enabled(new_spte)) {
- clear_bit((ffs(shadow_accessed_mask) - 1),
- (unsigned long *)&new_spte);
+ new_spte &= ~shadow_accessed_mask;
} else {
/*
* Capture the dirty status of the page, so that it doesn't get
--
2.31.0.291.g576ba9dcdaf-goog