Re: [PATCH v14 29/44] arm64: RMI: Runtime faulting of memory
From: Steven Price
Date: Mon Jun 08 2026 - 07:09:50 EST
On 05/06/2026 12:20, Gavin Shan wrote:
> Hi Steve,
>
> On 5/13/26 11:17 PM, Steven Price wrote:
>> At runtime if the realm guest accesses memory which hasn't yet been
>> mapped then KVM needs to either populate the region or fault the guest.
>>
>> For memory in the lower (protected) region of IPA a fresh page is
>> provided to the RMM which will zero the contents. For memory in the
>> upper (shared) region of IPA, the memory from the memslot is mapped
>> into the realm VM non secure.
>>
>> Signed-off-by: Steven Price <steven.price@xxxxxxx>
>> ---
>> Changes since v13:
>> * Numerous changes due to rebasing.
>> * Fix addr_range_desc() to encode the correct block size.
>> Changes since v12:
>> * Switch to RMM v2.0 range based APIs.
>> Changes since v11:
>> * Adapt to upstream changes.
>> Changes since v10:
>> * RME->RMI renaming.
>> * Adapt to upstream gmem changes.
>> Changes since v9:
>> * Fix call to kvm_stage2_unmap_range() in kvm_free_stage2_pgd() to set
>> may_block to avoid stall warnings.
>> * Minor coding style fixes.
>> Changes since v8:
>> * Propagate the may_block flag.
>> * Minor comments and coding style changes.
>> Changes since v7:
>> * Remove redundant WARN_ONs for realm_create_rtt_levels() - it will
>> internally WARN when necessary.
>> Changes since v6:
>> * Handle PAGE_SIZE being larger than RMM granule size.
>> * Some minor renaming following review comments.
>> Changes since v5:
>> * Reduce use of struct page in preparation for supporting the RMM
>> having a different page size to the host.
>> * Handle a race when delegating a page where another CPU has faulted on
>> the same page (and already delegated the physical page) but not yet
>> mapped it. In this case simply return to the guest to either use the
>> mapping from the other CPU (or refault if the race is lost).
>> * The changes to populate_par_region() are moved into the previous
>> patch where they belong.
>> Changes since v4:
>> * Code cleanup following review feedback.
>> * Drop the PTE_SHARED bit when creating unprotected page table entries.
>> This is now set by the RMM and the host has no control of it and the
>> spec requires the bit to be set to zero.
>> Changes since v2:
>> * Avoid leaking memory if failing to map it in the realm.
>> * Correctly mask RTT based on LPA2 flag (see rtt_get_phys()).
>> * Adapt to changes in previous patches.
>> ---
>> arch/arm64/include/asm/kvm_emulate.h | 8 ++
>> arch/arm64/include/asm/kvm_rmi.h | 12 ++
>> arch/arm64/kvm/mmu.c | 128 ++++++++++++++++----
>> arch/arm64/kvm/rmi.c | 173 +++++++++++++++++++++++++++
>> 4 files changed, 301 insertions(+), 20 deletions(-)
>>
>
> [...]
>
>> @@ -1604,27 +1641,52 @@ static int gmem_abort(const struct kvm_s2_fault_desc *s2fd)
>> bool write_fault, exec_fault;
>> enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_SHARED;
>> enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
>> - struct kvm_pgtable *pgt = s2fd->vcpu->arch.hw_mmu->pgt;
>> + struct kvm_vcpu *vcpu = s2fd->vcpu;
>> + struct kvm_pgtable *pgt = vcpu->arch.hw_mmu->pgt;
>> + gpa_t gpa = kvm_gpa_from_fault(vcpu->kvm, s2fd->fault_ipa);
>> unsigned long mmu_seq;
>> struct page *page;
>> - struct kvm *kvm = s2fd->vcpu->kvm;
>> + struct kvm *kvm = vcpu->kvm;
>> void *memcache;
>> kvm_pfn_t pfn;
>> gfn_t gfn;
>> int ret;
>> - memcache = get_mmu_memcache(s2fd->vcpu);
>> - ret = topup_mmu_memcache(s2fd->vcpu, memcache);
>> + if (kvm_is_realm(vcpu->kvm)) {
>> + /* check for memory attribute mismatch */
>> + bool is_priv_gfn = kvm_mem_is_private(kvm, gpa >> PAGE_SHIFT);
>> + /*
>> + * For Realms, the shared address is an alias of the private
>> + * PA with the top bit set. Thus if the fault address matches
>> + * the GPA then it is the private alias.
>> + */
>> + bool is_priv_fault = (gpa == s2fd->fault_ipa);
>> +
>> + if (is_priv_gfn != is_priv_fault) {
>> + kvm_prepare_memory_fault_exit(vcpu, gpa, PAGE_SIZE,
>> + kvm_is_write_fault(vcpu),
>> + false,
>> + is_priv_fault);
>> + /*
>> + * KVM_EXIT_MEMORY_FAULT requires a return code of
>> + * -EFAULT, see the API documentation
>> + */
>> + return -EFAULT;
>> + }
>> + }
>> +
>
> For a Realm, gmem_abort() is called by kvm_handle_guest_abort() only when
> we're faulting in the private (protected) space.
>
> if (kvm_slot_has_gmem(memslot) && !shared_ipa_fault(vcpu->kvm, fault_ipa))
> ret = gmem_abort(&s2fd);
> else
> ret = user_mem_abort(&s2fd);
>
> Given that condition, this block can be simplified to handle only the
> shared -> private conversion rather than both directions.
>
> /* Convert the shared address to the private address for Realm */
> if (kvm_is_realm(vcpu->kvm) &&
> !kvm_mem_is_private(kvm, gpa >> PAGE_SHIFT)) {
> /*
> * KVM_EXIT_MEMORY_FAULT requires a return code of
> * -EFAULT, see the API documentation
> */
> kvm_prepare_memory_fault_exit(vcpu, gpa, PAGE_SIZE,
> kvm_is_write_fault(vcpu),
> false, true);
> return -EFAULT;
> }
>
>
> [...]
>
>> @@ -2396,7 +2475,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>> !write_fault &&
>> !kvm_vcpu_trap_is_exec_fault(vcpu));
>> - if (kvm_slot_has_gmem(memslot))
>> + if (kvm_slot_has_gmem(memslot) && !shared_ipa_fault(vcpu->kvm, fault_ipa))
>> ret = gmem_abort(&s2fd);
>> else
>> ret = user_mem_abort(&s2fd);
> gmem_abort() is only called for faults in the protected (private) space.
You're absolutely correct - that's a nice simplification!
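For anyone following along, here is a minimal standalone sketch of why the
two-direction check collapses to a single test once the caller has filtered
out shared-alias faults. It is a userspace model, not the kernel code: the
40-bit IPA size and every model_* name are assumptions made for
illustration, and model_gpa_from_fault() / model_shared_ipa_fault() only
approximate what kvm_gpa_from_fault() and shared_ipa_fault() are expected
to do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a 40-bit IPA space, so bit 39 selects the shared alias. */
#define MODEL_IPA_BITS   40
#define MODEL_SHARED_BIT (UINT64_C(1) << (MODEL_IPA_BITS - 1))

/* Hypothetical stand-in for kvm_gpa_from_fault(): strip the alias bit. */
static inline uint64_t model_gpa_from_fault(uint64_t fault_ipa)
{
	return fault_ipa & ~MODEL_SHARED_BIT;
}

/* Hypothetical stand-in for shared_ipa_fault(): is the alias bit set? */
static inline bool model_shared_ipa_fault(uint64_t fault_ipa)
{
	return (fault_ipa & MODEL_SHARED_BIT) != 0;
}

/* The check as posted: memory attribute and faulting alias must agree. */
static inline bool model_mismatch_both(bool is_priv_gfn, uint64_t fault_ipa)
{
	bool is_priv_fault = (model_gpa_from_fault(fault_ipa) == fault_ipa);

	return is_priv_gfn != is_priv_fault;
}

/*
 * The simplified check: only valid when the caller guarantees that
 * model_shared_ipa_fault(fault_ipa) is false, i.e. the fault is private.
 */
static inline bool model_mismatch_private_only(bool is_priv_gfn)
{
	return !is_priv_gfn;
}
```

With a private fault address the two helpers return the same answer for
either value of is_priv_gfn, which is the case gmem_abort() sees after the
kvm_handle_guest_abort() change.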
Thanks,
Steve