Re: [RFC PATCH 08/28] arm64: RME: Keep a spare page delegated to the RMM

From: Zhi Wang
Date: Wed Mar 01 2023 - 15:50:36 EST


On Wed, 1 Mar 2023 11:55:37 +0000
Steven Price <steven.price@xxxxxxx> wrote:

> On 13/02/2023 16:47, Zhi Wang wrote:
> > On Fri, 27 Jan 2023 11:29:12 +0000
> > Steven Price <steven.price@xxxxxxx> wrote:
> >
> >> Pages can only be populated/destroyed on the RMM at the 4KB granule,
> >> this requires creating the full depth of RTTs. However if the pages are
> >> going to be combined into a 4MB huge page the last RTT is only
> >> temporarily needed. Similarly when freeing memory the huge page must be
> >> temporarily split, requiring temporary usage of the full depth of RTTs.
> >>
> >> To avoid needing to perform a temporary allocation and delegation of a
> >> page for this purpose we keep a spare delegated page around. In
> >> particular this avoids the need for memory allocation while destroying
> >> the realm guest.
> >>
> >> Signed-off-by: Steven Price <steven.price@xxxxxxx>
> >> ---
> >> arch/arm64/include/asm/kvm_rme.h | 3 +++
> >> arch/arm64/kvm/rme.c | 6 ++++++
> >> 2 files changed, 9 insertions(+)
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> >> index 055a22accc08..a6318af3ed11 100644
> >> --- a/arch/arm64/include/asm/kvm_rme.h
> >> +++ b/arch/arm64/include/asm/kvm_rme.h
> >> @@ -21,6 +21,9 @@ struct realm {
> >> void *rd;
> >> struct realm_params *params;
> >>
> >> + /* A spare already delegated page */
> >> + phys_addr_t spare_page;
> >> +
> >> unsigned long num_aux;
> >> unsigned int vmid;
> >> unsigned int ia_bits;
> >> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> >> index 9f8c5a91b8fc..0c9d70e4d9e6 100644
> >> --- a/arch/arm64/kvm/rme.c
> >> +++ b/arch/arm64/kvm/rme.c
> >> @@ -148,6 +148,7 @@ static int realm_create_rd(struct kvm *kvm)
> >> }
> >>
> >> realm->rd = rd;
> >> + realm->spare_page = PHYS_ADDR_MAX;
> >> realm->ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
> >>
> >> if (WARN_ON(rmi_rec_aux_count(rd_phys, &realm->num_aux))) {
> >> @@ -357,6 +358,11 @@ void kvm_destroy_realm(struct kvm *kvm)
> >> free_page((unsigned long)realm->rd);
> >> realm->rd = NULL;
> >> }
> >> + if (realm->spare_page != PHYS_ADDR_MAX) {
> >> + if (!WARN_ON(rmi_granule_undelegate(realm->spare_page)))
> >> + free_page((unsigned long)phys_to_virt(realm->spare_page));
> >
> > Will the page be leaked (not usable for host and realms) if the undelegate
> > failed? If yes, better at least put a comment.
>
> Yes - I'll add a comment.
>
> In general being unable to undelegate a page points to a programming
> error in the host. The only reason the RMM should refuse the request is
> if the page is in use by a Realm which the host has configured. So the
> WARN() is correct (there's a kernel bug) and the only sensible course of
> action is to leak the page and limp on.
>

It would be nice to add a summary of the above to the patch description.

Having a comment wherever a page is leaked (which in practice means the page
can no longer be reclaimed by the VMM nor used in a Realm) would be good.
TDX/SNP also have the problem of leaking pages for hard-to-diagnose reasons.

The leakage can grow bit by bit on a long-running server, and KVM will very
likely need a generic accounting interface later for reporting these numbers
to userspace. Having an explicit comment at each leak site now makes that
much easier.
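To illustrate what I mean, here is a rough userspace sketch of the destroy
path with such accounting. The stub_undelegate() helper and the leaked_pages
counter are hypothetical stand-ins (the real code calls
rmi_granule_undelegate() and has no counter yet); the point is only the
shape: on undelegate failure, deliberately leak the page and count it.

```c
#include <assert.h>
#include <stdint.h>

#define PHYS_ADDR_MAX UINT64_MAX

/* Hypothetical stand-in for rmi_granule_undelegate(): returns 0 on
 * success, non-zero when the RMM refuses (e.g. page still in use by a
 * Realm). The rmm_refuses flag just lets us model both outcomes here. */
static int stub_undelegate(uint64_t phys, int rmm_refuses)
{
	(void)phys;
	return rmm_refuses ? 1 : 0;
}

/* Hypothetical counter a generic accounting interface could report. */
static unsigned long leaked_pages;

/* Sketch of the teardown logic under discussion: the spare page is
 * either undelegated and freed, or (on failure) leaked and accounted,
 * never handed back to the allocator while still delegated. */
static void destroy_spare_page(uint64_t *spare_page, int rmm_refuses)
{
	if (*spare_page == PHYS_ADDR_MAX)
		return;

	if (stub_undelegate(*spare_page, rmm_refuses) == 0) {
		/* real code: free_page(phys_to_virt(*spare_page)) */
	} else {
		/*
		 * Undelegate failed: the page is unusable by both the
		 * host and any Realm. Leak it, but keep a count so a
		 * long-running host can report the loss to userspace.
		 */
		leaked_pages++;
	}
	*spare_page = PHYS_ADDR_MAX;
}
```

Whatever the exact form, keeping the counter bump next to the WARN makes the
leak both visible in logs and measurable over time.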

> Thanks,
>
> Steve
>
> >> + realm->spare_page = PHYS_ADDR_MAX;
> >> + }
> >>
> >> pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
> >> for (i = 0; i < pgd_sz; i++) {
> >
>