Re: [PATCH Part2 v5 00/45] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support

From: Sean Christopherson
Date: Mon Nov 22 2021 - 16:34:18 EST

On Mon, Nov 22, 2021, Brijesh Singh wrote:
> On 11/22/21 1:14 PM, Dave Hansen wrote:
> > On 11/22/21 11:06 AM, Brijesh Singh wrote:
> > > > 3. Kernel accesses guest private memory via a kernel mapping. This one
> > > > is tricky. These probably *do* result in a panic() today, but
> > > > ideally shouldn't.
> > > KVM has defined some helper functions to map and unmap the guest pages.
> > > Those helper functions do the GPA to PFN lookup before calling
> > > kmap(). Those helpers are enhanced such that they check the RMP table
> > > before the kmap() and acquire a lock to prevent a page state change
> > > until kunmap() is called. So, in the current implementation, we
> > > should *not* see a panic() unless there is a KVM driver bug that didn't
> > > use the helper functions or a bug in the helper function itself.
> >
> > I don't think this is really KVM specific.
> >
> > Think of a remote process doing ptrace(PTRACE_POKEUSER) or pretty much
> > any generic get_user_pages() instance. As long as the memory is mapped
> > into the page tables, you're exposed to users that walk the page tables.
> >
> > How do we, for example, prevent ptrace() from inducing a panic()?
> >
> In the current approach, this access will induce a panic(). In general,
> supporting ptrace() for the encrypted VM region is going to be
> difficult.

But ptrace() is just an example; any path in the kernel that accesses a gup'd
page through a kernel mapping will explode if handed a guest private page.

> The upcoming TDX work to unmap the guest memory region from the current process
> page table can easily be extended to SNP to cover the current limitations.

That represents an ABI change though. If KVM allows userspace to create SNP guests
without any guarantees that userspace cannot coerce the kernel into accessing guest
private memory, then we are stuck supporting that behavior even if KVM later gains
the ability to provide such guarantees through new APIs.

If allowing this behavior were only a matter of the system admin opting into a
dangerous configuration, I would probably be ok merging SNP with it buried behind
EXPERT or something scarier. But this impacts KVM's ABI as well as kernel internals,
e.g. the hooks in kvm_vcpu_map() and friends are unnecessary if KVM can differentiate
between shared and private gfns in its memslots, as gfn_to_pfn() will either fail or
point at memory that is guaranteed to be in the shared state.
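
For illustration only, here is a rough userspace toy model of that last point.
Everything in it (toy_memslot, toy_gfn_to_pfn(), the per-page is_private array,
TOY_PFN_ERROR) is made up for this sketch and is not KVM's actual memslot layout
or API. It only shows a lookup that either fails for a private gfn or returns a
pfn that is shared by construction, so the caller needs no extra check at map time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t;
typedef uint64_t pfn_t;

/* Hypothetical error value, standing in for a failed lookup. */
#define TOY_PFN_ERROR UINT64_MAX

/* Hypothetical memslot that tracks shared vs. private per page. */
struct toy_memslot {
	gfn_t base_gfn;
	uint64_t npages;
	pfn_t base_pfn;          /* backing for the shared pages */
	const bool *is_private;  /* is_private[i]: page i is guest private */
};

/*
 * Resolve a gfn to a pfn.  Fails if no slot covers the gfn or if the
 * gfn is private, so any pfn returned is known to be in the shared state.
 */
static pfn_t toy_gfn_to_pfn(const struct toy_memslot *slots, size_t nslots,
			    gfn_t gfn)
{
	for (size_t i = 0; i < nslots; i++) {
		const struct toy_memslot *s = &slots[i];

		if (gfn < s->base_gfn || gfn - s->base_gfn >= s->npages)
			continue;

		uint64_t idx = gfn - s->base_gfn;

		if (s->is_private[idx])
			return TOY_PFN_ERROR;

		return s->base_pfn + idx;
	}

	return TOY_PFN_ERROR;
}

/* Small self-check: one slot of two pages, the second one private. */
static void toy_example(void)
{
	static const bool priv[2] = { false, true };
	const struct toy_memslot slot = {
		.base_gfn = 0x100, .npages = 2,
		.base_pfn = 0x9000, .is_private = priv,
	};

	assert(toy_gfn_to_pfn(&slot, 1, 0x100) == 0x9000);
	assert(toy_gfn_to_pfn(&slot, 1, 0x101) == TOY_PFN_ERROR);
}
```

In this model a kvm_vcpu_map()-style caller would only have to handle the error
return; whether real memslots would carry this information per page, per slot,
or somewhere else entirely is left open here.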