Re: [RFC PATCH 00/26] kvm: arm64: Always-on nVHE hypervisor

From: Marc Zyngier
Date: Fri Nov 06 2020 - 07:25:10 EST


On 2020-11-04 18:36, David Brazdil wrote:
> As we progress towards being able to keep guest state private to the
> host running nVHE hypervisor, this series allows the hypervisor to
> install itself on newly booted CPUs before the host is allowed to run
> on them.
>
> To this end, the hypervisor starts trapping host SMCs and intercepting
> host's PSCI CPU_ON/OFF/SUSPEND calls. It replaces the host's entry point
> with its own, initializes the EL2 state of the new CPU and installs
> the nVHE hyp vector before ERETing to the host's entry point.
>
> Other PSCI SMCs are forwarded to EL3, though only the known set of SMCs
> implemented in the kernel is allowed. Non-PSCI SMCs are also forwarded
> to EL3. Future changes will need to ensure the safety of all SMCs wrt.
> private guests.
>
> The host is still allowed to reset EL2 back to the stub vector, eg. for
> hibernation or kexec, but will not disable nVHE when there are no VMs.
>
> Tested on Rock Pi 4b.
>
>
> Sending this as an RFC to get feedback on the following decisions:
>
> 1) The kernel checks new cores' features against the finalized system
> capabilities. To avoid the need to move this code/data to EL2, the
> implementation only allows to boot cores that were online at the time of
> KVM initialization.
>
> 2) Trapping and forwarding SMCs cannot be switched off. This could cause
> issues eg. if EL3 always returned to EL1. A kernel command line flag may
> be needed to turn the feature off on such platforms.

I'd rather have it the other way around (buy-in rather than turn off).
On top of the potential issue with stupid EL3s, there is the issue that
PSCI is optional, and that protected VMs won't be able to work without
it. Another related thing is that EL3 itself is optional.

Note that this flag shouldn't be specific to PSCI proxying. It should also
control Stage-2 wrapping, and the whole pKVM.
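
To illustrate the opt-in behaviour, here is a rough userspace sketch of how
a trapped host SMC could be classified. It is not code from the series: the
flag protected_mode_enabled, the enum and classify_host_smc() are
hypothetical names, and only the PSCI function IDs are taken from the PSCI
specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* PSCI function IDs as defined by the Arm PSCI specification. */
#define PSCI_FN_CPU_SUSPEND_32	0x84000001u
#define PSCI_FN_CPU_OFF		0x84000002u
#define PSCI_FN_CPU_ON_32	0x84000003u
#define PSCI_FN_CPU_SUSPEND_64	0xC4000001u
#define PSCI_FN_CPU_ON_64	0xC4000003u

/* Hypothetical outcome of a trapped host SMC (names are assumptions). */
enum smc_action {
	SMC_FORWARD_TO_EL3,	/* passed through to firmware unchanged */
	SMC_INTERCEPT,		/* handled at EL2 before reaching firmware */
};

/*
 * Hypothetical opt-in switch, e.g. set from a kernel command line option.
 * It defaults to off, so platforms without PSCI or EL3 are unaffected.
 */
static bool protected_mode_enabled;

static enum smc_action classify_host_smc(uint32_t func_id)
{
	/*
	 * Without opt-in nothing would be trapped in the first place;
	 * this sketch models that case as plain forwarding.
	 */
	if (!protected_mode_enabled)
		return SMC_FORWARD_TO_EL3;

	switch (func_id) {
	case PSCI_FN_CPU_SUSPEND_32:
	case PSCI_FN_CPU_SUSPEND_64:
	case PSCI_FN_CPU_OFF:
	case PSCI_FN_CPU_ON_32:
	case PSCI_FN_CPU_ON_64:
		/* Calls whose handling EL2 has to take over. */
		return SMC_INTERCEPT;
	default:
		return SMC_FORWARD_TO_EL3;
	}
}
```

With the flag left at its default, classify_host_smc(PSCI_FN_CPU_ON_64)
forwards the call; once the flag is set, the same call is intercepted. The
same switch would also gate the Stage-2 wrapping, which is not modelled here.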

Thanks,

M.
--
Jazz is not dead. It just smells funny...