Re: [PATCH v2 1/9] KVM: x86: Add AMD SEV specific Hypercall3
From: Steve Rutherford
Date: Thu Jan 07 2021 - 19:56:09 EST
Supporting merging of consecutive entries (or not) is less important
to get right since it doesn't change any of the APIs. If someone runs
into performance issues, they can loop back and fix this then. I'm
slightly concerned with the behavior for overlapping regions. I also
have slight concerns with how we handle re-encrypting small chunks of
larger unencrypted regions. I don't think we've seen these in
practice, but nothing precludes them afaik.
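
To make the overlapping and partial re-encryption cases concrete, here is a rough userspace sketch of the semantics I would expect. It is not the patch's code: the names (struct region_list, mark_unencrypted, mark_encrypted, MAX_REGIONS) are made up for illustration, and it assumes a sorted array of disjoint [start, end) guest frame ranges rather than whatever list type the series ends up using.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_REGIONS 64	/* arbitrary cap, only for this sketch */

/* One unencrypted range, [start, end), in guest frame numbers. */
struct region {
	uint64_t start;
	uint64_t end;
};

/* Sorted, disjoint, non-adjacent ranges (hypothetical structure). */
struct region_list {
	struct region r[MAX_REGIONS];
	size_t n;
};

/* Mark [start, end) unencrypted, absorbing overlapping or adjacent entries. */
static int mark_unencrypted(struct region_list *l, uint64_t start, uint64_t end)
{
	size_t i = 0, j;

	if (start >= end)
		return -1;

	/* Skip entries that end strictly before the new range begins. */
	while (i < l->n && l->r[i].end < start)
		i++;

	/* Entries [i, j) touch or overlap the new range; widen it to cover them. */
	for (j = i; j < l->n && l->r[j].start <= end; j++) {
		if (l->r[j].start < start)
			start = l->r[j].start;
		if (l->r[j].end > end)
			end = l->r[j].end;
	}

	/* A pure insertion needs one free slot. */
	if (j == i && l->n == MAX_REGIONS)
		return -1;

	/* Replace entries [i, j) with the single merged entry. */
	memmove(&l->r[i + 1], &l->r[j], (l->n - j) * sizeof(l->r[0]));
	l->r[i].start = start;
	l->r[i].end = end;
	l->n = l->n - (j - i) + 1;
	return 0;
}

/* Mark [start, end) encrypted again, trimming or splitting entries it hits. */
static int mark_encrypted(struct region_list *l, uint64_t start, uint64_t end)
{
	struct region out[MAX_REGIONS + 1];
	size_t i, n = 0;

	if (start >= end)
		return -1;

	for (i = 0; i < l->n; i++) {
		struct region cur = l->r[i];

		if (cur.end <= start || cur.start >= end) {
			out[n++] = cur;		/* untouched */
			continue;
		}
		if (cur.start < start)		/* keep the left remainder */
			out[n++] = (struct region){ cur.start, start };
		if (cur.end > end)		/* keep the right remainder */
			out[n++] = (struct region){ end, cur.end };
	}

	/* Splitting one entry in two can overflow a full list. */
	if (n > MAX_REGIONS)
		return -1;

	memcpy(l->r, out, n * sizeof(out[0]));
	l->n = n;
	return 0;
}
```

Note that re-encrypting a small chunk in the middle of a larger unencrypted region grows the list by one entry, so even the "shrinking" operation can need an allocation in a real implementation.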
On Thu, Jan 7, 2021 at 11:23 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Thu, Jan 07, 2021, Ashish Kalra wrote:
> > On Thu, Jan 07, 2021 at 09:26:25AM -0800, Sean Christopherson wrote:
> > > On Thu, Jan 07, 2021, Ashish Kalra wrote:
> > > > Hello Steve,
> > > >
> > > > On Wed, Jan 06, 2021 at 05:01:33PM -0800, Steve Rutherford wrote:
> > > > > Avoiding an rbtree for such a small (but unstable) list seems correct.
> > > > >
> > > > > For the unencrypted region list strategy, the only questions that I
> > > > > have are fairly secondary.
> > > > > - How should the kernel upper bound the size of the list in the face
> > > > > of malicious guests, but still support large guests? (Something
> > > > > similar to the size provided in the bitmap API would work).
> > > >
> > > > I am thinking of another scenario, where a malicious guest can make
> > > > infinite/repetitive hypercalls and DoS attack the host.
> > > >
> > > > But probably this is a more generic issue, this can be done by any guest
> > > > and under any hypervisor, I don't know what kind of mitigations exist
> > > > for such a scenario?
> > > >
> > > > Potentially, the guest memory donation model can handle such an attack,
> > > > because in this model, the hypervisor will expect only one hypercall,
> > > > any repetitive hypercalls can make the hypervisor disable the guest?
> > >
> > > KVM doesn't need to explicitly bound its tracking structures, it just needs to
> > > use GFP_KERNEL_ACCOUNT when allocating kernel memory for the structures so that
> > > the memory will be accounted to the task/process/VM. Shadow MMU pages are the
> > > only exception that comes to mind; they're still accounted properly, but KVM
> > > also explicitly limits them for a variety of reasons.
> > >
> > > The size of the list will naturally be bounded by the size of the guest; and
> > > assuming KVM proactively merges adjacent regions, that upper bound is probably
> > > reasonably low compared to other allocations, e.g. the aforementioned MMU pages.
> > >
> > > And, using a list means a malicious guest will get automatically throttled as
> > > the latency of walking the list (to merge/delete existing entries) will increase
> > > with the size of the list.
> >
> > Just to add here, potentially there won't be any proactive
> > merging/deletion of existing entries, as the only static entries will be
> > initial guest MMIO regions, which are contiguous guest PA ranges but not
> > necessarily adjacent.
>
> My point was that, if the guest is malicious, eventually there will be adjacent
> entries, e.g. the worst case scenario is that the encrypted status changes on
> every 4k page. Anyways, not really all that important, I'm mostly thinking out
> loud :-)
Agreed. Tagging this with GFP_KERNEL_ACCOUNT means we don't need to
upper bound the number of pages. I now don't think there is any
unusual DoS potential here. Perhaps, if the guest tries really hard to
make a massive list, it could trigger a softlockup on the host. Not sure
how important that is to fix.
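
For a rough sense of scale, the worst case Sean describes (encrypted status flipping on every 4k page) can be worked out with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the function names are mine, and the per-entry size is a parameter because I am not assuming anything about the real structure layout.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* 4k pages, as in the worst case above */

/*
 * If every other 4k page is unencrypted, no two entries are adjacent, so
 * nothing merges and each unencrypted page needs its own list entry.
 */
static uint64_t worst_case_entries(uint64_t guest_bytes)
{
	uint64_t pages = guest_bytes / SKETCH_PAGE_SIZE;

	return (pages + 1) / 2;		/* round up for an odd page count */
}

/* Memory charged to the VM for the list; entry_bytes is a caller's guess. */
static uint64_t worst_case_list_bytes(uint64_t guest_bytes, uint64_t entry_bytes)
{
	return worst_case_entries(guest_bytes) * entry_bytes;
}
```

For a 4 GiB guest that is 524288 entries, and with a hypothetical 32-byte entry about 16 MiB of accounted memory. The memory is bounded and charged to the VM, but a linear walk over that many entries on each hypercall is where the softlockup concern would come from.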