Re: [PATCH Part2 v5 39/45] KVM: SVM: Introduce ops for the post gfn map and unmap

From: Marc Orr
Date: Wed Sep 14 2022 - 12:33:19 EST


On Wed, Sep 14, 2022 at 5:15 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Wed, Sep 14, 2022, Marc Orr wrote:
> > On Wed, Sep 14, 2022 at 9:05 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > >
> > > On Thu, Sep 08, 2022, Michael Roth wrote:
> > > > On Fri, Oct 15, 2021 at 05:16:28PM +0000, Sean Christopherson wrote:
> > > > So in the context of this interim solution, we're trying to look for a
> > > > solution that's simple enough that it can be used reliably, without
> > > > introducing too much additional complexity into KVM. There is one
> > > > approach that seems to fit that bill, that Brijesh attempted in an
> > > > earlier version of this series (I'm not sure what exactly was the
> > > > catalyst to changing the approach, as I wasn't really in the loop at
> > > > the time, but AIUI there weren't any showstoppers there, but please
> > > > correct me if I'm missing anything):
> > > >
> > > > - if the host is writing to a page that it thinks is supposed to be
> > > > shared, and the guest switches it to private, we get an RMP fault
> > > > (actually, we will get a !PRESENT fault, since as of v5 we now
> > > > remove the mapping from the directmap as part of conversion)
> > > > - in the host #PF handler, if we see that the page is marked private
> > > > in the RMP table, simply switch it back to shared
> > > > - if this was a bug on the part of the host, then the guest will see
> > >
> > > As discussed off-list, attempting to fix up RMP violations in the host #PF handler
> > > is not a viable approach. There was also extensive discussion on-list a while back:
> > >
> > > https://lore.kernel.org/all/8a244d34-2b10-4cf8-894a-1bf12b59cf92@xxxxxxxxxxxxxxxx
> >
> > I mentioned this during Mike's talk at the micro-conference: For pages
> > mapped in by the kernel can we disallow them to be converted to
> > private?
>
> In theory, yes. Do we want to do something like this? No. kmap() does something
> vaguely similar for 32-bit PAE/PSE kernels, but that's a lot of complexity and
> overhead to take on. And this issue goes far beyond a kmap(); when the kernel gup()s
> a page, the kernel expects the pfn to be available, no exceptions (pun intended).
>
> > Note, userspace accesses are already handled by UPM.
>
> I'm confused by the UPM comment. Isn't the gist of this thread about the ability
> to merge SNP _without_ UPM? Or am I out in left field?

I think that was the overall gist: yes. But it's not what I was trying
to comment on :-).

HOWEVER, thinking about this more: I was confused when I wrote out my
last reply. I had thought that the issue that Michael brought up
applied even with UPM. That is, I was thinking it was still possibly
for a guest to maliciously convert a page to private mapped in by the
kernel and assumed to be shared.

But I now realize that is not what will actually happen. To be
concrete, let's assume the GHCB page. What will happen is:
- KVM has GHCB page mapped in. GHCB is always assumed to be shared. So
far so good.
- Malicious guest converts GHCB page to private (e.g., via Page State
Change request)
- Guest exits to KVM
- KVM exits to userspace VMM
- Userspace VM allocates page in private FD.

Now, what happens here depends on how UPM works. If we allow double
allocation then our host kernel is safe. However, now we have the
"double allocation problem".

If on the other hand, we deallocate the page in the shared FD, the
host kernel can segfault. And now we actually do have essentially the
same problem Michael was describing that we have without UPM. Because
we'll end up in fault.c in the kernel context and likely panic the
host.

I hope I got this right this time. Sorry for the confusion on my last reply.

> > In pseudo-code, I'm thinking something like this:
> >
> > kmap_helper() {
> > // And all other interfaces where the kernel can map a GPA
> > // into the kernel page tables
> > mapped_into_kernel_mem_set[hpa] = true;
> > }
> >
> > kunmap_helper() {
> > // And all other interfaces where the kernel can unmap a GPA
> > // into the kernel page tables
> > mapped_into_kernel_mem_set[hpa] = false;
> >
> > // Except it's not this simple because we probably need ref counting
> > // for multiple mappings. Sigh. But you get the idea.
>
> A few issues off the top of my head:
>
> - It's not just refcounting, there would also likely need to be locking to
> guarantee sane behavior.
> - kmap() isn't allowed to fail and RMPUPDATE isn't strictly guaranteed to succeed,
> which is problematic if the kernel attempts to kmap() a page that's already
> private, especially for kmap_atomic(), which isn't allowed to sleep.
> - Not all kernel code is well behaved and bounces through kmap(); undoubtedly
> some of the 1200+ users of page_address() will be problematic.
>
> $ git grep page_address | wc -l
> 1267
> - It's not sufficient for TDX. Merging something this complicated when we know
> we still need UPM would be irresponsible from a maintenance perspective.
> - KVM would need to support two separate APIs for SNP, which I very much don't
> want to do.

Ack on merging without UPM. I wasn't trying to chime in on merging
before/after UPM. See my other comment above. Sorry for the confusion.
Ack on other concerns about "enlightening kmap" as well. I agree.