Re: [RFC PATCH 6/8] KVM: x86: Implement kvm_arch_{, pre_}vcpu_map_memory()

From: Isaku Yamahata
Date: Tue Mar 19 2024 - 12:29:20 EST


On Wed, Mar 06, 2024 at 05:51:51PM -0800,
Isaku Yamahata <isaku.yamahata@xxxxxxxxxxxxxxx> wrote:

> On Wed, Mar 06, 2024 at 04:36:25PM -0800,
> David Matlack <dmatlack@xxxxxxxxxx> wrote:
>
> > On Wed, Mar 6, 2024 at 4:31 PM David Matlack <dmatlack@xxxxxxxxxx> wrote:
> > >
> > > On 2024-03-01 09:28 AM, isaku.yamahata@xxxxxxxxx wrote:
> > > >
> > > > + if (IS_ALIGNED(mapping->base_gfn, KVM_PAGES_PER_HPAGE(PG_LEVEL_1G)) &&
> > > > + mapping->nr_pages >= KVM_PAGES_PER_HPAGE(PG_LEVEL_1G))
> > > > + max_level = PG_LEVEL_1G;
> > > > + else if (IS_ALIGNED(mapping->base_gfn, KVM_PAGES_PER_HPAGE(PG_LEVEL_2M)) &&
> > > > + mapping->nr_pages >= KVM_PAGES_PER_HPAGE(PG_LEVEL_2M))
> > > > + max_level = PG_LEVEL_2M;
> > > > + else
> > > > + max_level = PG_LEVEL_4K;
> > >
> > > Is there a requirement that KVM must not map memory outside of the
> > > requested region?
> >
> > And if so, what if the requested region is already mapped with a larger page?
>
> Yes. We'd like to map exact gpa range for SNP or TDX case. We don't want to map
> zero at around range. For SNP or TDX, we map page to GPA, it's one time
> operation. It updates measurement.
>
> Say, we'd like to populate GPA1 and GPA2 with initial guest memory image. And
> they are within same 2M range. Map GPA1 first. If GPA2 is also mapped with zero
> with 2M page, the following mapping of GPA2 fails. Even if mapping of GPA2
> succeeds, measurement may be updated when mapping GPA1.
>
> It's user space VMM responsibility to map GPA range only once at most for SNP or
> TDX. Is this too strict requirement for default VM use case to mitigate KVM
> page fault at guest boot up? If so, what about a flag like EXACT_MAPPING or
> something?

Here is what I'm thinking. What do you think?

- Allow mapping larger than requested with the gmem_max_level hook:
This depends on the following patch. [1]
The gmem_max_level hook lets the vendor backend determine the max mapping level.
By default (for a default VM or a sw-protected VM), it allows mappings up to
KVM_MAX_HUGEPAGE_LEVEL. TDX allows only 4KB mappings.

[1] https://lore.kernel.org/kvm/20231230172351.574091-31-michael.roth@xxxxxxx/
[PATCH v11 30/35] KVM: x86: Add gmem hook for determining max NPT mapping level
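
To make the difference concrete, here is a rough sketch, not the actual
patch. exact_max_level() mirrors the check quoted above, where the level is
limited by the requested range. proposed_max_level() ignores the request size
and asks only a vendor hook. The enum vm_type, the helper names, and the hook
signature are made up for illustration; the real gmem_max_level hook in [1]
has a different signature.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the kernel definitions; the values mirror x86. */
enum pg_level { PG_LEVEL_4K = 1, PG_LEVEL_2M = 2, PG_LEVEL_1G = 3 };
#define KVM_MAX_HUGEPAGE_LEVEL PG_LEVEL_1G

/* Hypothetical VM types, only for this sketch. */
enum vm_type { VM_DEFAULT, VM_SW_PROTECTED, VM_TDX };

/* Number of 4K pages covered by one page at the given level. */
static uint64_t pages_per_hpage(enum pg_level level)
{
	return 1ULL << ((level - 1) * 9);
}

static bool gfn_aligned(uint64_t gfn, enum pg_level level)
{
	return (gfn & (pages_per_hpage(level) - 1)) == 0;
}

/* RFC behavior: never map beyond the requested [base_gfn, base_gfn + nr_pages). */
static enum pg_level exact_max_level(uint64_t base_gfn, uint64_t nr_pages)
{
	if (gfn_aligned(base_gfn, PG_LEVEL_1G) &&
	    nr_pages >= pages_per_hpage(PG_LEVEL_1G))
		return PG_LEVEL_1G;
	if (gfn_aligned(base_gfn, PG_LEVEL_2M) &&
	    nr_pages >= pages_per_hpage(PG_LEVEL_2M))
		return PG_LEVEL_2M;
	return PG_LEVEL_4K;
}

/* Proposed behavior: the request size does not matter, only the vendor hook. */
static enum pg_level proposed_max_level(enum vm_type type)
{
	/* Hypothetical hook body: TDX is limited to 4K, others are not. */
	return type == VM_TDX ? PG_LEVEL_4K : KVM_MAX_HUGEPAGE_LEVEL;
}
```

With this, a one-page request on a default VM is capped at 4K by
exact_max_level() but may end up in a 2M or 1G mapping under
proposed_max_level(), depending on what backs the GPA.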

- Pure mapping without CoCo operation:
As Sean suggested at [2], make KVM_MAP_MEMORY a pure mapping operation without
any CoCo operation. In the case of TDX, the API doesn't issue TDX-specific
operations like TDH.MEM.PAGE.ADD() and TDH.MR.EXTEND(). We need a separate
TDX-specific API for those.

[2] https://lore.kernel.org/kvm/Ze-XW-EbT9vXaagC@xxxxxxxxxx/

- KVM_MAP_MEMORY on an already mapped area, potentially with a large page:
It succeeds; it is not an error. It doesn't care whether the GPA is backed by a
large page or not. Because the use case is pre-population before the guest
runs, it doesn't matter whether the given GPA was already mapped, or at what
page level it is mapped.

Or do you want an error like -EEXIST?
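
The two options for an already mapped GPA can be modeled with a toy page
table. This is only a sketch to compare the semantics. struct toy_mmu,
toy_map_memory(), and the strict flag are hypothetical and are not KVM code
or a proposed uAPI.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_GFNS 1024	/* toy guest size, hypothetical */

/* One flag per 4K page; a large-page mapping simply sets many flags. */
struct toy_mmu {
	bool mapped[NR_GFNS];
};

/*
 * Map [base_gfn, base_gfn + nr_pages).
 * strict == false: already mapped pages are skipped and the call succeeds.
 * strict == true:  the first already mapped page fails the call with -EEXIST.
 */
static int toy_map_memory(struct toy_mmu *mmu, uint64_t base_gfn,
			  uint64_t nr_pages, bool strict)
{
	if (base_gfn >= NR_GFNS || nr_pages > NR_GFNS - base_gfn)
		return -EINVAL;

	for (uint64_t gfn = base_gfn; gfn < base_gfn + nr_pages; gfn++) {
		if (mmu->mapped[gfn]) {
			if (strict)
				return -EEXIST;
			continue;
		}
		mmu->mapped[gfn] = true;
	}
	return 0;
}
```

In the non-strict mode, repeating a request or overlapping an earlier,
larger mapping is harmless, which fits pre-population. The strict mode makes
user space track what it has already mapped.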

--
Isaku Yamahata <isaku.yamahata@xxxxxxxxx>