On 5/5/2023 8:20 AM, Mickaël Salaün wrote:
Hi,
This patch series is a proof-of-concept that implements new KVM features
(extended page tracking, MBEC support, CR pinning) and defines a new API to
protect guest VMs. No VMM (e.g., QEMU) modification is required.
The main idea is that kernel self-protection mechanisms should be delegated
to a more privileged part of the system, hence the hypervisor. It is still the
role of the guest kernel to request such restrictions according to its
Only for guest kernel images? Why not for the host OS kernel as well?
The embedded Android devices you mention below seem to support a host
OS too, right?
Are we suggesting that all of this functionality should be implemented
in the hypervisor (NS-EL2 on ARM), or even at a Secure EL such as
Secure-EL1 (ARM)? I am hoping that whatever interface we define here,
from the guest to the hypervisor, becomes the ABI, right?
# Current limitations
The main limitation of this patch series is the statically enforced
permissions. This is not an issue for kernels without modules but this needs to
be addressed. Mechanisms that dynamically impact kernel executable memory are
not handled for now (e.g., kernel modules, tracepoints, eBPF JIT), and such
code will need to be authenticated. Because the hypervisor is highly
privileged and critical to the security of all the VMs, we don't want to
implement a code authentication mechanism in the hypervisor itself but delegate
this verification to something much less privileged. We are thinking of two
ways to solve this: implement this verification in the VMM or spawn a dedicated
special VM (similar to Windows's VBS). There are pros and cons to each approach:
complexity, verification code ownership (guest's or VMM's), access to guest
memory (i.e., confidential computing).
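
To make the static-permission limitation concrete, here is a minimal toy
model in C. It is only a sketch and not code from the patch series: the
struct, the function names and the single "locked" flag are all hypothetical,
and a real hypervisor would track permissions in its second-stage page
tables rather than in an array. It shows only why a request made after the
policy is frozen (e.g., loading a module) cannot make a new page executable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names come from the patch series. */
enum { NPAGES = 8 };
enum { PERM_R = 1, PERM_W = 2, PERM_X = 4 };

struct guest_policy {
	unsigned char perm[NPAGES];	/* permissions per guest page frame */
	bool locked;			/* set once boot-time setup is done */
};

/* Guest request to set permissions on one page: 0 on success, -1 if denied. */
static int policy_set(struct guest_policy *p, size_t pfn, unsigned char perm)
{
	assert(pfn < NPAGES);
	if (p->locked)
		return -1;	/* static enforcement: no changes after lock */
	p->perm[pfn] = perm;
	return 0;
}

/* Freeze the policy; in this model there is no way to unlock it. */
static void policy_lock(struct guest_policy *p)
{
	p->locked = true;
}

/* Would an instruction fetch from this page be allowed? */
static bool policy_may_exec(const struct guest_policy *p, size_t pfn)
{
	assert(pfn < NPAGES);
	return (p->perm[pfn] & PERM_X) != 0;
}
```

In this model, a page marked executable before policy_lock() stays
executable, while a later policy_set() for a module's text page is refused.
That refusal is the case that would need an authenticated update path.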
Do you foresee performance regressions due to the amount of tracking
here? Production kernels do have a lot of tracepoints, and we use them
as a feature in the GKI kernel for the vendor hooks implementation; in
those cases every vendor driver is a module.
Wouldn't a separate VM further fragment this design and delegate more
of it to proprietary solutions?
Do you have any performance numbers with the current RFC?