Re: [PATCH Part2 v5 00/45] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support

From: Joerg Roedel
Date: Tue Nov 16 2021 - 08:22:02 EST


On Mon, Nov 15, 2021 at 09:14:14PM -0800, Andy Lutomirski wrote:
> It’s time to put on my maintainer hat. This is solidly in my
> territory, and NAK. A kernel privilege fault, from who-knows-what
> context (interrupts off? NMI? locks held?) that gets an RMP violation
> with no exception handler is *not* going to blindly write the RMP and
> retry. It’s not going to send flush IPIs or call into KVM to “fix”
> things. Just the locking issues alone are probably showstopping, even
> ignoring the security and sanity issues.

RMP faults are expected from two contexts:

* User-space
* KVM running in task context

The only situation where RMP faults could happen outside of these
contexts is when running a kexec'ed kernel, which was launched while SNP
guests were still running (that needs to be taken care of as well).

And from the locking side, which lock does the #PF handler need to take?
Processors supporting SNP also have hardware support for flushing remote
TLBs, so locks taken in the flush path are not strictly required.

Calling into KVM is another story and needs some more thought, I agree
with that.

> Otherwise can we please get on with designing a reasonable model for
> guest-private memory please?

It is fine to unmap guest-private memory from the host kernel, even if
it is not required by SNP. TDX need to do that because of the #MC thing
that happens otherwise, but that is also just a way to emulate an
RMP-like fault with TDX.

But as Marc already pointed out, the kernel needs a plan B when an RMP
happens anyway due to some bug.

Regards,

--
Jörg Rödel
jroedel@xxxxxxx

SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany

(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev