Re: [PATCH v2 3/4] KVM: arm64: selftests: Add basic NV selftest

From: Wei-Lin Chang

Date: Thu Apr 16 2026 - 17:59:00 EST

On Wed, Apr 15, 2026 at 07:05:17AM +0900, Itaru Kitayama wrote:
> On Tue, Apr 14, 2026 at 11:16:47AM +0100, Wei-Lin Chang wrote:
> > On Tue, Apr 14, 2026 at 06:31:22AM +0900, Itaru Kitayama wrote:
> > > On Mon, Apr 13, 2026 at 10:18:42AM +0100, Wei-Lin Chang wrote:
> > > > Hi Itaru,
> > > >
> > > > On Mon, Apr 13, 2026 at 08:19:25AM +0900, Itaru Kitayama wrote:
> > > > > On Sun, Apr 12, 2026 at 03:22:15PM +0100, Wei-Lin Chang wrote:
> > > > > > This selftest simply starts an L1, which starts its own guest (L2). L2
> > > > > > runs without stage-1 and 2 translations, it calls an HVC to jump back
> > > > > > to L1.
> > > > >
> > > > > How do you disable both the nested guest (L2)'s MMU and stage 2
> > > > > translations?
> > > >
> > > > Guest stage-2 is disabled by not setting HCR_EL2.VM in prepare_hyp(),
> > > > and stage-1 is disabled by not writing to SCTLR_EL12 in init_vcpu(),
> > > > effectively using the default value set by L0. However since SCTLR_EL1
> > > > has many architecturally UNKNOWN bits (including SCTLR_EL1.M), it should
> > > > be better to write a value before running L2 I suppose...
> > >
> > > Thanks. What do you think of using copy_el2_to_el1() macro in at.c, so we
> > > can prepare in guest_code() to manipulate the SCTLR_EL12 System register
> > > with the sensible programmed values?
> >
> > Yes, using copy_el2_to_el1() can give us an L2 stage-1 that is identical
> > to the L1's stage-1. But what I was considering was if guest stage-2 is
> > enabled (which we plan to implement), then those stage-1 page tables
> > will have to be mapped for L2, and its base address translated to L2IPA.
> > It's doable but seems like extra complexity when stage-1 is not so
> > interesting for KVM (except for AT?): it lets the guest do whatever it
> > likes and lets the hardware do the translation.
> >
> > Let me know if you have reasons to want stage-1 for L2, there could be
> > something I should consider but did not.
>
> By keeping the nested guest's MMU enabled, we can exercise the shadow stage
> 2 on the host. But I am fine with you starting nested guest's IPA and I
> hope Marc and Oliver approve this series and merge it upstream.

I think you have guest stage-1 and guest stage-2 confused. Whether the
nested guest's stage-1 MMU is enabled or not does not affect what KVM is
doing with the shadow page tables. The stage-1 MMU translates
L2VA -> L2IPA. The shadow page tables store the combination of two
stage-2 translations: L2IPA -> L1IPA (the stage-2 PTs L1 built for L2)
and L1IPA -> host PA (the stage-2 PTs the host built for L1).

Additionally, stage-2 not being enabled for L2 does not mean shadow
stage-2 is not exercised. There is still a distinct shadow stage-2 doing
the work for it, albeit a simple one (the stored mapping is the same as
the canonical stage-2).
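
To make the two points above concrete, here is a toy model in plain C.
This is not KVM code: struct toy_s2, toy_walk(), toy_build_shadow() and
toy_demo() are names made up for illustration, and a "page table" here
is just a flat array indexed by page number. A NULL guest stage-2 stands
in for L1 leaving HCR_EL2.VM clear, so L2IPA == L1IPA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NPAGES	8
#define TOY_INVALID	UINT64_MAX

/* Toy stage-2: index = input page number, value = output page number. */
struct toy_s2 {
	uint64_t out[TOY_NPAGES];
};

static uint64_t toy_walk(const struct toy_s2 *pt, uint64_t in)
{
	/* Out-of-range input (including TOY_INVALID) does not translate. */
	return in < TOY_NPAGES ? pt->out[in] : TOY_INVALID;
}

/*
 * shadow[L2IPA] = host_s2[guest_s2[L2IPA]]
 *
 * guest_s2 == NULL models guest stage-2 being disabled: L2IPA is used
 * as L1IPA directly, so the shadow ends up equal to the host stage-2.
 */
static void toy_build_shadow(struct toy_s2 *shadow,
			     const struct toy_s2 *guest_s2,
			     const struct toy_s2 *host_s2)
{
	for (uint64_t l2ipa = 0; l2ipa < TOY_NPAGES; l2ipa++) {
		uint64_t l1ipa = guest_s2 ? toy_walk(guest_s2, l2ipa) : l2ipa;

		shadow->out[l2ipa] = toy_walk(host_s2, l1ipa);
	}
}

static int toy_demo(void)
{
	/* L1IPA -> host PA, built by the host for L1. */
	const struct toy_s2 host = { .out = { 4, 5, 6, 7, 0, 1, 2, 3 } };
	/* L2IPA -> L1IPA, built by L1 for L2; only two pages mapped. */
	const struct toy_s2 guest = { .out = {
		1, 0, TOY_INVALID, TOY_INVALID,
		TOY_INVALID, TOY_INVALID, TOY_INVALID, TOY_INVALID,
	} };
	struct toy_s2 shadow;

	/* Guest stage-2 enabled: the shadow is the composition. */
	toy_build_shadow(&shadow, &guest, &host);
	assert(shadow.out[0] == 5);		/* L2IPA 0 -> L1IPA 1 -> PA 5 */
	assert(shadow.out[1] == 4);		/* L2IPA 1 -> L1IPA 0 -> PA 4 */
	assert(shadow.out[2] == TOY_INVALID);	/* unmapped by L1 */

	/* Guest stage-2 disabled: the shadow matches the host stage-2. */
	toy_build_shadow(&shadow, NULL, &host);
	for (size_t i = 0; i < TOY_NPAGES; i++)
		assert(shadow.out[i] == host.out[i]);

	return 0;
}
```

Note that nothing in toy_build_shadow() takes L2's stage-1 as input. In
this model, turning L2's stage-1 on or off only changes which L2IPAs L2
ends up accessing, not how the shadow is built.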

All in all, if we want to make the shadow page tables more interesting,
what we should do is build a stage-2 for L2, and enable it in L1, not
just turn on L2's stage-1 MMU.

Thanks,
Wei-Lin Chang

>
> Thanks,
> Itaru.
>
> >
> > Thanks,
> > Wei-Lin Chang
> >
> > >
> > > Itaru.
> > >
> > > >
> > > > Thanks,
> > > > Wei-Lin Chang
> > > >
> > > > >
> > > > > Itaru.
> > > > >
> > > > > >
> > > > > > Signed-off-by: Wei-Lin Chang <weilin.chang@xxxxxxx>
> > > > > > ---
> > > > > > tools/testing/selftests/kvm/Makefile.kvm | 1 +
> > > > > > .../selftests/kvm/arm64/hello_nested.c | 103 ++++++++++++++++++
> > > > > > 2 files changed, 104 insertions(+)
> > > > > > create mode 100644 tools/testing/selftests/kvm/arm64/hello_nested.c
> > > > > >
> > > > > > diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
> > > > > > index 3dc3e39f7025..e8c108e0c487 100644
> > > > > > --- a/tools/testing/selftests/kvm/Makefile.kvm
> > > > > > +++ b/tools/testing/selftests/kvm/Makefile.kvm
> > > > > > @@ -168,6 +168,7 @@ TEST_GEN_PROGS_arm64 += arm64/arch_timer_edge_cases
> > > > > > TEST_GEN_PROGS_arm64 += arm64/at
> > > > > > TEST_GEN_PROGS_arm64 += arm64/debug-exceptions
> > > > > > TEST_GEN_PROGS_arm64 += arm64/hello_el2
> > > > > > +TEST_GEN_PROGS_arm64 += arm64/hello_nested
> > > > > > TEST_GEN_PROGS_arm64 += arm64/host_sve
> > > > > > TEST_GEN_PROGS_arm64 += arm64/hypercalls
> > > > > > TEST_GEN_PROGS_arm64 += arm64/external_aborts
> > > > > > diff --git a/tools/testing/selftests/kvm/arm64/hello_nested.c b/tools/testing/selftests/kvm/arm64/hello_nested.c
> > > > > > new file mode 100644
> > > > > > index 000000000000..97387e4697b3
> > > > > > --- /dev/null
> > > > > > +++ b/tools/testing/selftests/kvm/arm64/hello_nested.c
> > > > > > @@ -0,0 +1,103 @@
> > > > > > +// SPDX-License-Identifier: GPL-2.0-only
> > > > > > +/*
> > > > > > + * hello_nested - Go from vEL2 to EL1 then back
> > > > > > + */
> > > > > > +
> > > > > > +#include "nested.h"
> > > > > > +#include "processor.h"
> > > > > > +#include "test_util.h"
> > > > > > +#include "ucall.h"
> > > > > > +
> > > > > > +#define XLATE2GPA (0xABCD)
> > > > > > +#define L2STACKSZ (0x100)
> > > > > > +
> > > > > > +/*
> > > > > > + * TPIDR_EL2 is used to store vcpu id, so save and restore it.
> > > > > > + */
> > > > > > +static vm_paddr_t ucall_translate_to_gpa(void *gva)
> > > > > > +{
> > > > > > + vm_paddr_t gpa;
> > > > > > + u64 vcpu_id = read_sysreg(tpidr_el2);
> > > > > > +
> > > > > > + GUEST_SYNC2(XLATE2GPA, gva);
> > > > > > +
> > > > > > + /* get the result from userspace */
> > > > > > + gpa = read_sysreg(tpidr_el2);
> > > > > > +
> > > > > > + write_sysreg(vcpu_id, tpidr_el2);
> > > > > > +
> > > > > > + return gpa;
> > > > > > +}
> > > > > > +
> > > > > > +static void l2_guest_code(void)
> > > > > > +{
> > > > > > + do_hvc();
> > > > > > +}
> > > > > > +
> > > > > > +static void guest_code(void)
> > > > > > +{
> > > > > > + struct vcpu vcpu;
> > > > > > + struct hyp_data hyp_data;
> > > > > > + int ret;
> > > > > > + vm_paddr_t l2_pc, l2_stack_top;
> > > > > > + /* force 16-byte alignment for the stack pointer */
> > > > > > + u8 l2_stack[L2STACKSZ] __attribute__((aligned(16)));
> > > > > > +
> > > > > > + GUEST_ASSERT_EQ(get_current_el(), 2);
> > > > > > + GUEST_PRINTF("vEL2 entry\n");
> > > > > > +
> > > > > > + l2_pc = ucall_translate_to_gpa(l2_guest_code);
> > > > > > + l2_stack_top = ucall_translate_to_gpa(&l2_stack[L2STACKSZ]);
> > > > > > +
> > > > > > + init_vcpu(&vcpu, l2_pc, l2_stack_top);
> > > > > > + prepare_hyp();
> > > > > > +
> > > > > > + ret = run_l2(&vcpu, &hyp_data);
> > > > > > + GUEST_ASSERT_EQ(ret, ARM_EXCEPTION_TRAP);
> > > > > > + GUEST_DONE();
> > > > > > +}
> > > > > > +
> > > > > > +int main(void)
> > > > > > +{
> > > > > > + struct kvm_vcpu_init init;
> > > > > > + struct kvm_vcpu *vcpu;
> > > > > > + struct kvm_vm *vm;
> > > > > > + struct ucall uc;
> > > > > > + vm_paddr_t gpa;
> > > > > > +
> > > > > > + TEST_REQUIRE(kvm_check_cap(KVM_CAP_ARM_EL2));
> > > > > > + vm = vm_create(1);
> > > > > > +
> > > > > > + kvm_get_default_vcpu_target(vm, &init);
> > > > > > + init.features[0] |= BIT(KVM_ARM_VCPU_HAS_EL2);
> > > > > > + vcpu = aarch64_vcpu_add(vm, 0, &init, guest_code);
> > > > > > + kvm_arch_vm_finalize_vcpus(vm);
> > > > > > +
> > > > > > + while (true) {
> > > > > > + vcpu_run(vcpu);
> > > > > > +
> > > > > > + switch (get_ucall(vcpu, &uc)) {
> > > > > > + case UCALL_SYNC:
> > > > > > + if (uc.args[0] == XLATE2GPA) {
> > > > > > + gpa = addr_gva2gpa(vm, (vm_vaddr_t)uc.args[1]);
> > > > > > + vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_TPIDR_EL2), gpa);
> > > > > > + }
> > > > > > + break;
> > > > > > + case UCALL_PRINTF:
> > > > > > + pr_info("%s", uc.buffer);
> > > > > > + break;
> > > > > > + case UCALL_DONE:
> > > > > > + pr_info("DONE!\n");
> > > > > > + goto end;
> > > > > > + case UCALL_ABORT:
> > > > > > + REPORT_GUEST_ASSERT(uc);
> > > > > > + fallthrough;
> > > > > > + default:
> > > > > > + TEST_FAIL("Unhandled ucall: %ld\n", uc.cmd);
> > > > > > + }
> > > > > > + }
> > > > > > +
> > > > > > +end:
> > > > > > + kvm_vm_free(vm);
> > > > > > + return 0;
> > > > > > +}
> > > > > > --
> > > > > > 2.43.0
> > > > > >