Re: [PATCH v1 2/4] selftests: kvm: Add processor tests for LoongArch KVM

From: Sean Christopherson
Date: Wed Aug 02 2023 - 14:07:49 EST


On Tue, Aug 01, 2023, Tianrui Zhao wrote:
> Add processor tests for LoongArch KVM, including vcpu initialize

Nit, AFAICT these aren't tests, this is simply the core KVM selftests support
for LoongArch.

> and tlb refill exception handler.
>
> Based-on: <20230720062813.4126751-1-zhaotianrui@xxxxxxxxxxx>
> Signed-off-by: Tianrui Zhao <zhaotianrui@xxxxxxxxxxx>
> ---
> .../selftests/kvm/lib/loongarch/exception.S | 27 ++
> .../selftests/kvm/lib/loongarch/processor.c | 367 ++++++++++++++++++
> 2 files changed, 394 insertions(+)
> create mode 100644 tools/testing/selftests/kvm/lib/loongarch/exception.S
> create mode 100644 tools/testing/selftests/kvm/lib/loongarch/processor.c
>
> diff --git a/tools/testing/selftests/kvm/lib/loongarch/exception.S b/tools/testing/selftests/kvm/lib/loongarch/exception.S
> new file mode 100644
> index 000000000000..19dc50993da4
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/lib/loongarch/exception.S
> @@ -0,0 +1,27 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#include "sysreg.h"
> +
> +/* address of refill exception should be 4K aligned */
> +.align 12

.align works on bytes, not on shifts. I.e. this will make handle_tlb_refill
12-byte aligned, not 4096-byte aligned.

> +.global handle_tlb_refill
> +handle_tlb_refill:
> + csrwr t0, LOONGARCH_CSR_TLBRSAVE
> + csrrd t0, LOONGARCH_CSR_PGD
> + lddir t0, t0, 3
> + lddir t0, t0, 1
> + ldpte t0, 0
> + ldpte t0, 1
> + tlbfill
> + csrrd t0, LOONGARCH_CSR_TLBRSAVE
> + ertn
> +
> +/* address of general exception should be 4K aligned */
> +.align 12

Same thing here.

> +.global handle_exception
> +handle_exception:
> +1:
> + nop
> + b 1b
> + nop
> + ertn
> diff --git a/tools/testing/selftests/kvm/lib/loongarch/processor.c b/tools/testing/selftests/kvm/lib/loongarch/processor.c
> new file mode 100644
> index 000000000000..2e50b6e2c556
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/lib/loongarch/processor.c
> @@ -0,0 +1,367 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * KVM selftest LoongArch library code, including CPU-related functions.
> + *

Again, unnecessary IMO. If you do keep the comment, the extra line with a bare
asterisk should be dropped.

> + */
> +
> +#include <assert.h>
> +#include <linux/bitfield.h>
> +#include <linux/compiler.h>
> +
> +#include "kvm_util.h"
> +#include "processor.h"
> +#include "sysreg.h"
> +
> +#define DEFAULT_LOONGARCH_GUEST_STACK_VADDR_MIN 0xac0000

Why diverge from the common?

#define DEFAULT_GUEST_STACK_VADDR_MIN 0xab6000

AFAIK, the common value is also mostly arbitrary, but that just makes it even
more confusing as to why LoongArch needs to bump the min by 0x4000.

> +uint64_t *virt_get_pte_hva(struct kvm_vm *vm, vm_vaddr_t gva)
> +{
> + uint64_t *ptep;
> +
> + if (!vm->pgd_created)
> + goto unmapped_gva;
> +
> + ptep = addr_gpa2hva(vm, vm->pgd) + pgd_index(vm, gva) * 8;
> + if (!ptep)
> + goto unmapped_gva;
> +
> + switch (vm->pgtable_levels) {
> + case 4:
> + ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pud_index(vm, gva) * 8;
> + if (!ptep)
> + goto unmapped_gva;

This wants a "fallthrough" annotation.

> + case 3:
> + ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pmd_index(vm, gva) * 8;
> + if (!ptep)
> + goto unmapped_gva;
> + case 2:
> + ptep = addr_gpa2hva(vm, pte_addr(vm, *ptep)) + pte_index(vm, gva) * 8;
> + if (!ptep)
> + goto unmapped_gva;
> + break;
> + default:
> + TEST_FAIL("Page table levels must be 2, 3, or 4");

Obviously it shouldn't come up, but print the actual pgtable_levels to make debug
a wee bit easier e.g.
TEST_FAIL("Got %u page table levels, expected 2, 3, or 4",
vm->pgtable_levels);

Mostly out of curiosity, but also because it looks like this was heavily copy+pasted
from ARM: does LoongArch actually support 2-level page tables?
> +static void loongarch_set_csr(struct kvm_vcpu *vcpu, uint64_t id, uint64_t val)
> +{
> + uint64_t csrid;
> +
> + csrid = KVM_REG_LOONGARCH_CSR | KVM_REG_SIZE_U64 | 8 * id;
> + vcpu_set_reg(vcpu, csrid, val);
> +}
> +
> +static void loongarch_vcpu_setup(struct kvm_vcpu *vcpu)
> +{
> + unsigned long val;
> + int width;
> + struct kvm_vm *vm = vcpu->vm;
> +
> + switch (vm->mode) {
> + case VM_MODE_P48V48_16K:
> + case VM_MODE_P40V48_16K:
> + case VM_MODE_P36V48_16K:
> + case VM_MODE_P36V47_16K:
> + break;
> +
> + default:
> + TEST_FAIL("Unknown guest mode, mode: 0x%x", vm->mode);
> + }
> +
> + /* user mode and page enable mode */
> + val = PLV_USER | CSR_CRMD_PG;
> + loongarch_set_csr(vcpu, LOONGARCH_CSR_CRMD, val);
> + loongarch_set_csr(vcpu, LOONGARCH_CSR_PRMD, val);
> + loongarch_set_csr(vcpu, LOONGARCH_CSR_EUEN, 1);
> + loongarch_set_csr(vcpu, LOONGARCH_CSR_ECFG, 0);
> + loongarch_set_csr(vcpu, LOONGARCH_CSR_TCFG, 0);
> + loongarch_set_csr(vcpu, LOONGARCH_CSR_ASID, 1);
> +
> + width = vm->page_shift - 3;
> + val = 0;
> + switch (vm->pgtable_levels) {
> + case 4:
> + /* pud page shift and width */
> + val = (vm->page_shift + width * 2) << 20 | (width << 25);
> + case 3:
> + /* pmd page shift and width */
> + val |= (vm->page_shift + width) << 10 | (width << 15);
> + case 2:
> + /* pte page shift and width */
> + val |= vm->page_shift | width << 5;
> + break;
> + default:
> + TEST_FAIL("Page table levels must be 2, 3, or 4");
> + }
> + loongarch_set_csr(vcpu, LOONGARCH_CSR_PWCTL0, val);
> +
> + /* pgd page shift and width */
> + val = (vm->page_shift + width * (vm->pgtable_levels - 1)) | width << 6;
> + loongarch_set_csr(vcpu, LOONGARCH_CSR_PWCTL1, val);
> +
> + loongarch_set_csr(vcpu, LOONGARCH_CSR_PGDL, vm->pgd);
> +
> + extern void handle_tlb_refill(void);
> + extern void handle_exception(void);

Eww. I get that it's probably undesirable to expose these via processor.h, but
at least declare them outside of the function.

> +struct kvm_vcpu *vm_arch_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id,
> + void *guest_code)
> +{
> + return loongarch_vcpu_add(vm, vcpu_id, guest_code);

Please drop the single-line passthrough, i.e. drop loongarch_vcpu_add(). I'm
guessing you copy+pasted much of this from ARM. ARM's passthrough isn't a pure
passthrough, which is directly related to why the "passthrough" is ok: there are
other callers to aarch64_vcpu_add() that pass a non-NULL @source.