Re: [RESEND RFC 0/2] Paravirtualized Control Register pinning

From: Andersen, John S
Date: Mon Dec 23 2019 - 12:28:25 EST


On Sat, 2019-12-21 at 14:59 +0100, Paolo Bonzini wrote:
> On 20/12/19 20:26, John Andersen wrote:
> > Paravirtualized CR pinning will likely be incompatible with kexec for
> > the foreseeable future. Early boot code could possibly be changed to
> > not clear protected bits. However, a kernel that requests CR bits be
> > pinned can't know if the kernel it's kexecing has been updated to not
> > clear protected bits. This would result in the kernel being kexec'd
> > almost immediately receiving a general protection fault.
> >
> > Security conscious kernel configurations disable kexec already, per
> > KSPP guidelines. Projects such as Kata Containers, AWS Lambda,
> > ChromeOS Termina, and others using KVM to virtualize Linux will
> > benefit from this protection.
> >
> > The usage of SMM in SeaBIOS was explored as a way to communicate to KVM
> > that a reboot has occurred and it should zero the pinned bits. When
> > using QEMU and SeaBIOS, SMM initialization occurs on reboot. However,
> > prior to SMM initialization, BIOS writes zero values to CR0, causing a
> > general protection fault to be sent to the guest before SMM can signal
> > that the machine has booted.
>
> SMM is optional; I think it makes sense to leave it to userspace to
> reset pinning (including for the case of triple faults), while INIT
> which is handled within KVM would keep it active.
>
> > Pinning of sensitive CR bits has already been implemented to protect
> > against exploits directly calling native_write_cr*(). The current
> > protection cannot stop ROP attacks which jump directly to a MOV CR
> > instruction. Guests running with paravirtualized CR pinning are now
> > protected against the use of ROP to disable CR bits. The same bits that
> > are being pinned natively may be pinned via the CR pinned MSRs. These
> > bits are WP in CR0, and SMEP, SMAP, and UMIP in CR4.
> >
> > Future patches could protect bits in MSRs in a similar fashion. The NXE
> > bit of the EFER MSR is a prime candidate.
>
> Please include patches for either kvm-unit-tests or
> tools/testing/selftests/kvm that test the functionality.
>

Will do.
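
For illustration only, here is a minimal user-space C sketch of the
behaviour such a test could assert: a write that clears a pinned bit is
refused (standing in for the general protection fault), a write that keeps
the pinned bits set goes through, and a userspace-driven reset drops the
pinning. This is a model under stated assumptions, not the KVM
implementation. struct cr_pin_model and the cr_model_*() helpers are
hypothetical names made up for this sketch; only the CR0/CR4 bit positions
are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions of the bits named in the cover letter. */
#define X86_CR0_WP   (UINT64_C(1) << 16)
#define X86_CR4_UMIP (UINT64_C(1) << 11)
#define X86_CR4_SMEP (UINT64_C(1) << 20)
#define X86_CR4_SMAP (UINT64_C(1) << 21)

/* Hypothetical model of one control register and its pinned bits. */
struct cr_pin_model {
	uint64_t value;   /* current register contents */
	uint64_t pinned;  /* bits the guest asked to keep set */
};

/* A new value is acceptable only if every pinned bit remains set. */
static bool cr_write_allowed(uint64_t pinned, uint64_t new_value)
{
	return (new_value & pinned) == pinned;
}

/*
 * Model a MOV to the control register. Returns false, leaving the
 * register unchanged, where the guest would receive a #GP.
 */
static bool cr_model_write(struct cr_pin_model *cr, uint64_t new_value)
{
	if (!cr_write_allowed(cr->pinned, new_value))
		return false;
	cr->value = new_value;
	return true;
}

/* Model userspace resetting pinning, e.g. on reboot (assumption). */
static void cr_model_reset(struct cr_pin_model *cr)
{
	cr->pinned = 0;
}
```

With SMEP and SMAP pinned in a modelled CR4, cr_model_write() rejects a
value that drops SMEP but accepts one that additionally sets UMIP; after
cr_model_reset() a write of zero is accepted again, which is what early
boot code needs.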