Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes

From: Sean Christopherson
Date: Fri May 19 2023 - 15:58:28 EST


On Fri, May 19, 2023, Nicolas Saenz Julienne wrote:
> Hi Sean,
>
> On Fri May 19, 2023 at 6:23 PM UTC, Sean Christopherson wrote:
> > On Fri, May 19, 2023, Nicolas Saenz Julienne wrote:
> > > Hi,
> > >
> > > On Fri Dec 2, 2022 at 6:13 AM UTC, Chao Peng wrote:
> > >
> > > [...]
> > > > +The user sets the per-page memory attributes to a guest memory range indicated
> > > > +by address/size, and in return KVM adjusts address and size to reflect the
> > > > +actual pages of the memory range have been successfully set to the attributes.
> > > > +If the call returns 0, "address" is updated to the last successful address + 1
> > > > +and "size" is updated to the remaining address size that has not been set
> > > > +successfully. The user should check the return value as well as the size to
> > > > +decide if the operation succeeded for the whole range or not. The user may want
> > > > +to retry the operation with the returned address/size if the previous range was
> > > > +partially successful.
> > > > +
> > > > +Both address and size should be page aligned and the supported attributes can be
> > > > +retrieved with KVM_GET_SUPPORTED_MEMORY_ATTRIBUTES.
> > > > +
> > > > +The "flags" field may be used for future extensions and should be set to 0s.
> > >
> > > We have been looking into adding support for the Hyper-V VSM extensions
> > > which Windows uses to implement Credential Guard. This interface seems
> > > like a good fit for one of its underlying features. I just wanted to
> > > share a bit about it, and see if we can expand it to fit this use-case.
> > > Note that this was already briefly discussed between Sean and Alex some
> > > time ago[1].
> > >
> > > VSM introduces isolated guest execution contexts called Virtual Trust
> > > Levels (VTL) [2]. Each VTL has its own memory access protections,
> > > virtual processors states, interrupt controllers and overlay pages. VTLs
> > > are hierarchical and might enforce memory protections on less privileged
> > > VTLs. Memory protections are enforced on a per-GPA granularity.
> > >
> > > The list of possible protections is:
> > > - No access -- This needs a new memory attribute, I think.
> >
> > No, if KVM provides three bits for READ, WRITE, and EXECUTE, then userspace can
> > get all the possible combinations. E.g. this is RWX=000b
>
> That's not what the current implementation does, when attributes is
> equal 0 it clears the entries from the xarray:
>
> static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm,
> 					   struct kvm_memory_attributes *attrs)
> {
>
> 	entry = attrs->attributes ? xa_mk_value(attrs->attributes) : NULL;
> 	[...]
> 	for (i = start; i < end; i++)
> 		if (xa_err(xa_store(&kvm->mem_attr_array, i, entry,
> 				    GFP_KERNEL_ACCOUNT)))
> 			break;
> }
>
> From Documentation/core-api/xarray.rst:
>
> "There is no difference between an entry that has never
> been stored to, one that has been erased and one that has most recently
> had ``NULL`` stored to it."
>
> The way I understood the series, there needs to be a differentiation
> between no attributes (regular page fault) and no-access.

Ah, I see what you're saying. There are multiple ways to solve things without a
"no access" flag while still maintaining an empty xarray for the default case.
E.g. invert the flags to be DENY flags[*], have an internal-only "entry valid" flag,
etc.

[*] I vaguely recall suggesting a "deny" approach somewhere, but I may just be
making things up to make it look like I thought deeply about this ;-)
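
To make the two options concrete, here is a rough sketch. It is plain
userspace C rather than kernel code, and every name and the bit layout are
made up for illustration; nothing here comes from the series. A slot value of
0 stands in for "no xarray entry", and "default" is modeled as full access,
which is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only (not from the series). */
#define ATTR_R      (1u << 0)
#define ATTR_W      (1u << 1)
#define ATTR_X      (1u << 2)
#define ATTR_RWX    (ATTR_R | ATTR_W | ATTR_X)
#define ATTR_VALID  (1u << 3)	/* internal-only, never exposed to userspace */

/*
 * Stand-in for an xarray slot: 0 models "no entry", because the xarray
 * cannot tell a stored NULL from an entry that was never stored.
 */
typedef uint32_t slot_t;

/* Option 1: userspace passes DENY bits, so "deny nothing" encodes as 0. */
static slot_t deny_encode(uint32_t deny)
{
	return deny & ATTR_RWX;
}

static uint32_t deny_allowed(slot_t slot)
{
	return ATTR_RWX & ~slot;
}

/* Option 2: userspace passes ALLOW bits, stored entries get a valid tag. */
static slot_t allow_encode(uint32_t allow)
{
	return (allow & ATTR_RWX) | ATTR_VALID;
}

static uint32_t allow_allowed(slot_t slot)
{
	/* Empty slot: nothing was ever set, use the default (assumed RWX). */
	if (!(slot & ATTR_VALID))
		return ATTR_RWX;
	return slot & ATTR_RWX;
}

/* Both encodings keep the default at 0 and make "no access" non-zero. */
static void attr_selfcheck(void)
{
	assert(deny_encode(0) == 0);
	assert(deny_encode(ATTR_RWX) != 0);
	assert(allow_encode(0) != 0);
	assert(allow_allowed(0) == ATTR_RWX);
}
```

Either way, the default case never needs an entry, and "no access" is stored
as a non-zero value that a lookup can tell apart from an empty slot.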