On Tue, Aug 23, 2022, Chang S. Bae wrote:
> == Background ==
> A set of architecture-specific prctl() options offer to control dynamic
> XSTATE components in VCPUs. Userspace VMMs may interact with the host using
> ARCH_GET_XCOMP_GUEST_PERM and ARCH_REQ_XCOMP_GUEST_PERM.
> However, they are separated from the KVM API. KVM may select features that
> the host supports and advertise them through the KVM_X86_XCOMP_GUEST_SUPP
> attribute.
> == Problem ==
> QEMU [1] queries the features through the KVM API instead of using the x86
> arch_prctl() option. But it still needs to use arch_prctl() to request the
> permission. Then this step may become fragile because it does not guarantee
> to comply with the KVM policy.
But backdooring through KVM doesn't prevent userspace from walking in through
the front door (arch_prctl()), i.e. this doesn't protect the kernel in any way.
KVM needs to ensure that _KVM_ doesn't screw up and let userspace use features
that KVM doesn't support. The kernel's restrictions on using features go on
top, i.e. KVM must behave correctly irrespective of kernel restrictions.
If QEMU wants to assert that it didn't misconfigure itself, it can assert on the
config in any number of ways, e.g. assert that ARCH_GET_XCOMP_GUEST_PERM is a
subset of KVM_X86_XCOMP_GUEST_SUPP at the end of kvm_request_xsave_components().
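
A minimal sketch of such a check, assuming both masks have already been read
as 64-bit XSTATE feature bitmaps (one from ARCH_GET_XCOMP_GUEST_PERM, the
other from the KVM_X86_XCOMP_GUEST_SUPP attribute). The helper names are
hypothetical, not existing QEMU code, and how the masks are fetched is left
out on purpose:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if every feature bit set in @perm is
 * also set in @supp, i.e. @perm is a subset of @supp.
 */
static bool xcomp_perm_is_subset(uint64_t perm, uint64_t supp)
{
    return (perm & ~supp) == 0;
}

/*
 * Hypothetical wrapper a VMM could call once it has requested permissions.
 * @guest_perm: mask assumed to come from ARCH_GET_XCOMP_GUEST_PERM.
 * @kvm_supp:   mask assumed to come from KVM_X86_XCOMP_GUEST_SUPP.
 */
static void assert_xcomp_perm_within_supp(uint64_t guest_perm,
                                          uint64_t kvm_supp)
{
    assert(xcomp_perm_is_subset(guest_perm, kvm_supp));
}
```

This only catches a misconfigured VMM. It is not a substitute for KVM
rejecting features it doesn't support.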