[PATCH 5.10 120/167] KVM: PPC: Book3S HV Nested: Sanitise H_ENTER_NESTED TM state

From: Greg Kroah-Hartman
Date: Mon Jul 26 2021 - 12:23:55 EST


From: Nicholas Piggin <npiggin@xxxxxxxxx>

commit d9c57d3ed52a92536f5fa59dc5ccdd58b4875076 upstream.

The H_ENTER_NESTED hypercall is handled by the L0, and it is a request
by the L1 to switch the context of the vCPU over to that of its L2
guest, and return with an interrupt indication. The L1 is responsible
for switching some registers to guest context, and the L0 switches
others (including all the hypervisor privileged state).

If the L2 MSR has TM active, then the L1 is responsible for
recheckpointing the L2 TM state. Then the L1 exits to L0 via the
H_ENTER_NESTED hcall, and the L0 saves the TM state as part of the exit,
and then it recheckpoints the TM state as part of the nested entry and
finally HRFIDs into the L2 with TM active MSR. Not efficient, but about
the simplest approach for something that's horrendously complicated.

Problems arise if the L1 exits to the L0 with a TM state which does not
match the L2 TM state being requested. For example if the L1 is
transactional but the L2 MSR is non-transactional, or vice versa. The
L0's HRFID can take a TM Bad Thing interrupt and crash.

Fix this by disallowing H_ENTER_NESTED in TM[T] state entirely, and then
ensuring that if the L1 is suspended then the L2 must have TM active,
and if the L1 is not suspended then the L2 must not have TM active.

Fixes: 360cae313702 ("KVM: PPC: Book3S HV: Nested guest entry via hypercall")
Cc: stable@xxxxxxxxxxxxxxx # v4.20+
Reported-by: Alexey Kardashevskiy <aik@xxxxxxxxx>
Acked-by: Michael Neuling <mikey@xxxxxxxxxxx>
Signed-off-by: Nicholas Piggin <npiggin@xxxxxxxxx>
Signed-off-by: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
arch/powerpc/kvm/book3s_hv_nested.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -232,6 +232,9 @@ long kvmhv_enter_nested_guest(struct kvm
if (vcpu->kvm->arch.l1_ptcr == 0)
return H_NOT_AVAILABLE;

+ if (MSR_TM_TRANSACTIONAL(vcpu->arch.shregs.msr))
+ return H_BAD_MODE;
+
/* copy parameters in */
hv_ptr = kvmppc_get_gpr(vcpu, 4);
regs_ptr = kvmppc_get_gpr(vcpu, 5);
@@ -254,6 +257,23 @@ long kvmhv_enter_nested_guest(struct kvm
if (l2_hv.vcpu_token >= NR_CPUS)
return H_PARAMETER;

+ /*
+ * L1 must have set up a suspended state to enter the L2 in a
+ * transactional state, and only in that case. These have to be
+ * filtered out here to prevent causing a TM Bad Thing in the
+ * host HRFID. We could synthesize a TM Bad Thing back to the L1
+ * here but there doesn't seem like much point.
+ */
+ if (MSR_TM_SUSPENDED(vcpu->arch.shregs.msr)) {
+ if (!MSR_TM_ACTIVE(l2_regs.msr))
+ return H_BAD_MODE;
+ } else {
+ if (l2_regs.msr & MSR_TS_MASK)
+ return H_BAD_MODE;
+ if (WARN_ON_ONCE(vcpu->arch.shregs.msr & MSR_TS_MASK))
+ return H_BAD_MODE;
+ }
+
/* translate lpid */
l2 = kvmhv_get_nested(vcpu->kvm, l2_hv.lpid, true);
if (!l2)