Re: [PATCH v6 11/14] KVM: x86: Introduce KVM_SET_PAGE_ENC_BITMAP ioctl
From: Steve Rutherford
Date: Fri Apr 10 2020 - 14:08:54 EST
On Thu, Apr 9, 2020 at 6:23 PM Ashish Kalra <ashish.kalra@xxxxxxx> wrote:
>
> Hello Steve,
>
> On Thu, Apr 09, 2020 at 05:06:21PM -0700, Steve Rutherford wrote:
> > On Tue, Apr 7, 2020 at 6:49 PM Ashish Kalra <ashish.kalra@xxxxxxx> wrote:
> > >
> > > Hello Steve,
> > >
> > > On Tue, Apr 07, 2020 at 05:26:33PM -0700, Steve Rutherford wrote:
> > > > On Sun, Mar 29, 2020 at 11:23 PM Ashish Kalra <Ashish.Kalra@xxxxxxx> wrote:
> > > > >
> > > > > From: Brijesh Singh <Brijesh.Singh@xxxxxxx>
> > > > >
> > > > > The ioctl can be used to set page encryption bitmap for an
> > > > > incoming guest.
> > > > >
> > > > > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > > > > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > > > > Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
> > > > > Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > > > > Cc: "Radim Krčmář" <rkrcmar@xxxxxxxxxx>
> > > > > Cc: Joerg Roedel <joro@xxxxxxxxxx>
> > > > > Cc: Borislav Petkov <bp@xxxxxxx>
> > > > > Cc: Tom Lendacky <thomas.lendacky@xxxxxxx>
> > > > > Cc: x86@xxxxxxxxxx
> > > > > Cc: kvm@xxxxxxxxxxxxxxx
> > > > > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > > > > Signed-off-by: Brijesh Singh <brijesh.singh@xxxxxxx>
> > > > > Signed-off-by: Ashish Kalra <ashish.kalra@xxxxxxx>
> > > > > ---
> > > > > Documentation/virt/kvm/api.rst | 22 +++++++++++++++++
> > > > > arch/x86/include/asm/kvm_host.h | 2 ++
> > > > > arch/x86/kvm/svm.c | 42 +++++++++++++++++++++++++++++++++
> > > > > arch/x86/kvm/x86.c | 12 ++++++++++
> > > > > include/uapi/linux/kvm.h | 1 +
> > > > > 5 files changed, 79 insertions(+)
> > > > >
> > > > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > > > > index 8ad800ebb54f..4d1004a154f6 100644
> > > > > --- a/Documentation/virt/kvm/api.rst
> > > > > +++ b/Documentation/virt/kvm/api.rst
> > > > > @@ -4675,6 +4675,28 @@ or shared. The bitmap can be used during the guest migration, if the page
> > > > > is private then userspace need to use SEV migration commands to transmit
> > > > > the page.
> > > > >
> > > > > +4.126 KVM_SET_PAGE_ENC_BITMAP (vm ioctl)
> > > > > +---------------------------------------
> > > > > +
> > > > > +:Capability: basic
> > > > > +:Architectures: x86
> > > > > +:Type: vm ioctl
> > > > > +:Parameters: struct kvm_page_enc_bitmap (in/out)
> > > > > +:Returns: 0 on success, -1 on error
> > > > > +
> > > > > +/* for KVM_SET_PAGE_ENC_BITMAP */
> > > > > +struct kvm_page_enc_bitmap {
> > > > > + __u64 start_gfn;
> > > > > + __u64 num_pages;
> > > > > + union {
> > > > > + void __user *enc_bitmap; /* one bit per page */
> > > > > + __u64 padding2;
> > > > > + };
> > > > > +};
> > > > > +
> > > > > +During the guest live migration the outgoing guest exports its page encryption
> > > > > +bitmap, the KVM_SET_PAGE_ENC_BITMAP can be used to build the page encryption
> > > > > +bitmap for an incoming guest.
> > > > >
> > > > > 5. The kvm_run structure
> > > > > ========================
> > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > > > index 27e43e3ec9d8..d30f770aaaea 100644
> > > > > --- a/arch/x86/include/asm/kvm_host.h
> > > > > +++ b/arch/x86/include/asm/kvm_host.h
> > > > > @@ -1271,6 +1271,8 @@ struct kvm_x86_ops {
> > > > > unsigned long sz, unsigned long mode);
> > > > > int (*get_page_enc_bitmap)(struct kvm *kvm,
> > > > > struct kvm_page_enc_bitmap *bmap);
> > > > > + int (*set_page_enc_bitmap)(struct kvm *kvm,
> > > > > + struct kvm_page_enc_bitmap *bmap);
> > > > > };
> > > > >
> > > > > struct kvm_arch_async_pf {
> > > > > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> > > > > index bae783cd396a..313343a43045 100644
> > > > > --- a/arch/x86/kvm/svm.c
> > > > > +++ b/arch/x86/kvm/svm.c
> > > > > @@ -7756,6 +7756,47 @@ static int svm_get_page_enc_bitmap(struct kvm *kvm,
> > > > > return ret;
> > > > > }
> > > > >
> > > > > +static int svm_set_page_enc_bitmap(struct kvm *kvm,
> > > > > + struct kvm_page_enc_bitmap *bmap)
> > > > > +{
> > > > > + struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> > > > > + unsigned long gfn_start, gfn_end;
> > > > > + unsigned long *bitmap;
> > > > > + unsigned long sz, i;
> > > > > + int ret;
> > > > > +
> > > > > + if (!sev_guest(kvm))
> > > > > + return -ENOTTY;
> > > > > +
> > > > > + gfn_start = bmap->start_gfn;
> > > > > + gfn_end = gfn_start + bmap->num_pages;
> > > > > +
> > > > > + sz = ALIGN(bmap->num_pages, BITS_PER_LONG) / 8;
> > > > > + bitmap = kmalloc(sz, GFP_KERNEL);
> > > > > + if (!bitmap)
> > > > > + return -ENOMEM;
> > > > > +
> > > > > + ret = -EFAULT;
> > > > > + if (copy_from_user(bitmap, bmap->enc_bitmap, sz))
> > > > > + goto out;
> > > > > +
> > > > > + mutex_lock(&kvm->lock);
> > > > > + ret = sev_resize_page_enc_bitmap(kvm, gfn_end);
> > > > I realize now that usermode could use this for initializing the
> > > > minimum size of the enc bitmap, which probably solves my issue from
> > > > the other thread.
> > > > > + if (ret)
> > > > > + goto unlock;
> > > > > +
> > > > > + i = gfn_start;
> > > > > + for_each_clear_bit_from(i, bitmap, (gfn_end - gfn_start))
> > > > > + clear_bit(i + gfn_start, sev->page_enc_bmap);
> > > > This API seems a bit strange, since it can only clear bits. I would
> > > > expect "set" to force the values to match the values passed down,
> > > > instead of only ensuring that cleared bits in the input are also
> > > > cleared in the kernel.
> > > >
> > >
> > > The sev_resize_page_enc_bitmap() will allocate a new bitmap and
> > > set it to all 0xFF's, therefore, the code here simply clears the bits
> > > in the bitmap as per the cleared bits in the input.
> >
> > If I'm not mistaken, resize only reinitializes the newly extended part
> > of the buffer, and copies the old values for the rest.
> > With the API you proposed you could probably reimplement a normal set
> > call by calling get, then reset, and then set, but this feels
> > cumbersome.
> >
>
> As I mentioned earlier, the set api is basically meant for the incoming
> VM, the resize will initialize the incoming VM's bitmap to all 0xFF's
> and as there won't be any bitmap allocated initially on the incoming VM,
> therefore, the bitmap copy will not do anything and the clear_bit later
> will clear the incoming VM's bits as per the input.

The documentation does not make that clear. A typical set call
in the KVM API lets you go to any state, not just a subset of states.
Yes, this works in the common case of migrating a VM to a particular
target, once. But I find the behavior of the current API surprising, and
I prefer APIs that are unsurprising. Had I not read the
code, it would have been very easy for me to assume it worked like a
normal set call. You could rename the ioctl to something like
"CLEAR_BITS", but a set-based API is more common.
Thanks,
Steve