Re: [PATCH v2 3/9] KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl

From: Ashish Kalra
Date: Mon Dec 07 2020 - 17:01:31 EST


Hello Dov,

On Sun, Dec 06, 2020 at 01:02:47PM +0200, Dov Murik wrote:
>
>
> On 01/12/2020 2:47, Ashish Kalra wrote:
> > From: Brijesh Singh <brijesh.singh@xxxxxxx>
> >
> > The ioctl can be used to retrieve page encryption bitmap for a given
> > gfn range.
> >
> > Return the correct bitmap as per the number of pages being requested
> > by the user. Ensure that we only copy bmap->num_pages bytes in the
> > userspace buffer, if bmap->num_pages is not byte aligned we read
> > the trailing bits from the userspace and copy those bits as is.
>
> I think you meant to say "Ensure that we only copy bmap->num_pages *bits* in
> the userspace buffer". But maybe I've missed something.
>

Yes, that is correct. It should read bmap->num_pages *bits* instead of
*bytes*; I will fix the comments.

>
> >
> > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
> > Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > Cc: "Radim Krčmář" <rkrcmar@xxxxxxxxxx>
> > Cc: Joerg Roedel <joro@xxxxxxxxxx>
> > Cc: Borislav Petkov <bp@xxxxxxx>
> > Cc: Tom Lendacky <thomas.lendacky@xxxxxxx>
> > Cc: x86@xxxxxxxxxx
> > Cc: kvm@xxxxxxxxxxxxxxx
> > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > Reviewed-by: Venu Busireddy <venu.busireddy@xxxxxxxxxx>
> > Signed-off-by: Brijesh Singh <brijesh.singh@xxxxxxx>
> > Signed-off-by: Ashish Kalra <ashish.kalra@xxxxxxx>
> > ---
> > Documentation/virt/kvm/api.rst | 27 +++++++++++++
> > arch/x86/include/asm/kvm_host.h | 2 +
> > arch/x86/kvm/svm/sev.c | 70 +++++++++++++++++++++++++++++++++
> > arch/x86/kvm/svm/svm.c | 1 +
> > arch/x86/kvm/svm/svm.h | 1 +
> > arch/x86/kvm/x86.c | 12 ++++++
> > include/uapi/linux/kvm.h | 12 ++++++
> > 7 files changed, 125 insertions(+)
> >
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index 70254eaa5229..ae410f4332ab 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -4671,6 +4671,33 @@ This ioctl resets VCPU registers and control structures according to
> > the clear cpu reset definition in the POP. However, the cpu is not put
> > into ESA mode. This reset is a superset of the initial reset.
> >
> > +4.125 KVM_GET_PAGE_ENC_BITMAP (vm ioctl)
> > +---------------------------------------
> > +
> > +:Capability: basic
> > +:Architectures: x86
> > +:Type: vm ioctl
> > +:Parameters: struct kvm_page_enc_bitmap (in/out)
> > +:Returns: 0 on success, -1 on error
> > +
> > +/* for KVM_GET_PAGE_ENC_BITMAP */
> > +struct kvm_page_enc_bitmap {
> > + __u64 start_gfn;
> > + __u64 num_pages;
> > + union {
> > + void __user *enc_bitmap; /* one bit per page */
> > + __u64 padding2;
> > + };
> > +};
> > +
> > +The encrypted VMs have the concept of private and shared pages. The private
> > +pages are encrypted with the guest-specific key, while the shared pages may
> > +be encrypted with the hypervisor key. The KVM_GET_PAGE_ENC_BITMAP can
> > +be used to get the bitmap indicating whether the guest page is private
> > +or shared. The bitmap can be used during the guest migration. If the page
> > +is private then the userspace need to use SEV migration commands to transmit
> > +the page.
> > +
> >
> > 4.125 KVM_S390_PV_COMMAND
> > -------------------------
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index d035dc983a7a..8c2e40199ecb 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1284,6 +1284,8 @@ struct kvm_x86_ops {
> > void (*msr_filter_changed)(struct kvm_vcpu *vcpu);
> > int (*page_enc_status_hc)(struct kvm *kvm, unsigned long gpa,
> > unsigned long sz, unsigned long mode);
> > + int (*get_page_enc_bitmap)(struct kvm *kvm,
> > + struct kvm_page_enc_bitmap *bmap);
> > };
> >
> > struct kvm_x86_nested_ops {
> > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > index 6b8bc1297f9c..a6586dd29767 100644
> > --- a/arch/x86/kvm/svm/sev.c
> > +++ b/arch/x86/kvm/svm/sev.c
> > @@ -1014,6 +1014,76 @@ int svm_page_enc_status_hc(struct kvm *kvm, unsigned long gpa,
> > return 0;
> > }
> >
> > +int svm_get_page_enc_bitmap(struct kvm *kvm,
> > + struct kvm_page_enc_bitmap *bmap)
> > +{
> > + struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> > + unsigned long gfn_start, gfn_end;
> > + unsigned long sz, i, sz_bytes;
> > + unsigned long *bitmap;
> > + int ret, n;
> > +
> > + if (!sev_guest(kvm))
> > + return -ENOTTY;
> > +
> > + gfn_start = bmap->start_gfn;
> > + gfn_end = gfn_start + bmap->num_pages;
> > +
> > + sz = ALIGN(bmap->num_pages, BITS_PER_LONG) / BITS_PER_BYTE;
> > + bitmap = kmalloc(sz, GFP_KERNEL);
>
> Maybe use bitmap_alloc which accepts size in bits (and corresponding
> bitmap_free)?
>

I will look at this.

>
> > + if (!bitmap)
> > + return -ENOMEM;
> > +
> > + /* by default all pages are marked encrypted */
> > + memset(bitmap, 0xff, sz);
>
> Maybe use bitmap_fill to clarify the intent?
>

Again, I will look at this.
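
For illustration only, here is a userspace sketch (not kernel code) of the
two sizes the patch juggles: the allocation, which is rounded up to whole
longs, and the number of bytes that actually carry requested bits. The
helper names and the SKETCH_BITS_PER_LONG macro are made up for this
sketch; in the kernel, bitmap_alloc() and bitmap_fill() would take the
size in bits and hide the first computation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed to hold nbits when storage is rounded up to whole longs. */
static size_t alloc_bytes_for_bits(unsigned long nbits)
{
    size_t longs;

    /* Guard against wrap-around in the round-up below. */
    assert(nbits <= ULONG_MAX - SKETCH_BITS_PER_LONG);

    longs = (nbits + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG;
    return longs * sizeof(unsigned long);
}

/* Bytes that carry at least one of the nbits requested bits. */
static size_t copy_bytes_for_bits(unsigned long nbits)
{
    assert(nbits <= ULONG_MAX - CHAR_BIT);

    return (nbits + CHAR_BIT - 1) / CHAR_BIT;
}
```

With 11 pages, the allocation is one long, but only two bytes hold bits
the caller asked for, which is why the copy size is computed separately
from the allocation size.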
>
> > +
> > + mutex_lock(&kvm->lock);
> > + if (sev->page_enc_bmap) {
> > + i = gfn_start;
> > + for_each_clear_bit_from(i, sev->page_enc_bmap,
> > + min(sev->page_enc_bmap_size, gfn_end))
> > + clear_bit(i - gfn_start, bitmap);
> > + }
> > + mutex_unlock(&kvm->lock);
> > +
> > + ret = -EFAULT;
> > +
> > + n = bmap->num_pages % BITS_PER_BYTE;
> > + sz_bytes = ALIGN(bmap->num_pages, BITS_PER_BYTE) / BITS_PER_BYTE;
>
> Maybe clearer:
>
> sz_bytes = BITS_TO_BYTES(bmap->num_pages);
>
>
>
> > +
> > + /*
> > + * Return the correct bitmap as per the number of pages being
> > + * requested by the user. Ensure that we only copy bmap->num_pages
> > + * bytes in the userspace buffer, if bmap->num_pages is not byte
> > + * aligned we read the trailing bits from the userspace and copy
> > + * those bits as is.
> > + */
>
> (see my comment on the commit message above.)
>
Yes, as I mentioned above, this needs to be bmap->num_pages *bits* and
not *bytes*.
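
To make the intended behaviour concrete, here is a small userspace model
(a sketch, not the kernel code) of the tail handling: only num_pages bits
are taken from the computed bitmap, and the remaining high bits of the
last byte are preserved from the caller's buffer. The function name
copy_enc_bits is hypothetical, and memcpy() stands in for
copy_from_user()/copy_to_user().

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/*
 * ubuf:  caller's buffer, stands in for bmap->enc_bitmap.
 * kbits: freshly computed bitmap, one bit per page.
 * Both must hold at least (num_pages + 7) / 8 bytes.
 */
static void copy_enc_bits(unsigned char *ubuf, unsigned char *kbits,
                          unsigned long num_pages)
{
    unsigned long n = num_pages % CHAR_BIT;   /* valid bits in the last byte */
    unsigned long nbytes = (num_pages + CHAR_BIT - 1) / CHAR_BIT;

    assert(ubuf && kbits);

    if (n) {
        unsigned long offset = num_pages / CHAR_BIT;
        unsigned char mask = (unsigned char)((1u << n) - 1); /* low n bits */

        /* Low n bits come from the computed bitmap, the rest from the caller. */
        kbits[offset] = (unsigned char)((kbits[offset] & mask) |
                                        (ubuf[offset] & ~mask));
    }

    memcpy(ubuf, kbits, nbytes);   /* stands in for copy_to_user() */
}
```

For 11 pages, the low 3 bits of the second byte are written and the upper
5 bits keep whatever the caller already had there.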

>
> > +
> > + if (n) {
> > + unsigned char *bitmap_kernel = (unsigned char *)bitmap;
> > + unsigned char bitmap_user;
> > + unsigned long offset, mask;
> > +
> > + offset = bmap->num_pages / BITS_PER_BYTE;
> > + if (copy_from_user(&bitmap_user, bmap->enc_bitmap + offset,
> > + sizeof(unsigned char)))
> > + goto out;
> > +
> > + mask = GENMASK(n - 1, 0);
> > + bitmap_user &= ~mask;
> > + bitmap_kernel[offset] &= mask;
> > + bitmap_kernel[offset] |= bitmap_user;
> > + }
> > +
> > + if (copy_to_user(bmap->enc_bitmap, bitmap, sz_bytes))
> > + goto out;
> > +
> > + ret = 0;
> > +out:
> > + kfree(bitmap);
> > + return ret;
> > +}
> > +
> > int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
> > {
> > struct kvm_sev_cmd sev_cmd;
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 7122ea5f7c47..bff89cab3ed0 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -4314,6 +4314,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
> > .msr_filter_changed = svm_msr_filter_changed,
> >
> > .page_enc_status_hc = svm_page_enc_status_hc,
> > + .get_page_enc_bitmap = svm_get_page_enc_bitmap,
> > };
> >
> > static struct kvm_x86_init_ops svm_init_ops __initdata = {
> > diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> > index 0103a23ca174..4ce73f1034b9 100644
> > --- a/arch/x86/kvm/svm/svm.h
> > +++ b/arch/x86/kvm/svm/svm.h
> > @@ -413,6 +413,7 @@ int nested_svm_exit_special(struct vcpu_svm *svm);
> > void sync_nested_vmcb_control(struct vcpu_svm *svm);
> > int svm_page_enc_status_hc(struct kvm *kvm, unsigned long gpa,
> > unsigned long npages, unsigned long enc);
> > +int svm_get_page_enc_bitmap(struct kvm *kvm, struct kvm_page_enc_bitmap *bmap);
> >
> > extern struct kvm_x86_nested_ops svm_nested_ops;
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 3afc78f18f69..d3cb95a4dd55 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -5695,6 +5695,18 @@ long kvm_arch_vm_ioctl(struct file *filp,
> > case KVM_X86_SET_MSR_FILTER:
> > r = kvm_vm_ioctl_set_msr_filter(kvm, argp);
> > break;
> > + case KVM_GET_PAGE_ENC_BITMAP: {
> > + struct kvm_page_enc_bitmap bitmap;
> > +
> > + r = -EFAULT;
> > + if (copy_from_user(&bitmap, argp, sizeof(bitmap)))
> > + goto out;
> > +
> > + r = -ENOTTY;
> > + if (kvm_x86_ops.get_page_enc_bitmap)
> > + r = kvm_x86_ops.get_page_enc_bitmap(kvm, &bitmap);
> > + break;
> > + }
> > default:
> > r = -ENOTTY;
> > }
> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > index 886802b8ffba..d0b9171bdb03 100644
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -532,6 +532,16 @@ struct kvm_dirty_log {
> > };
> > };
> >
> > +/* for KVM_GET_PAGE_ENC_BITMAP */
> > +struct kvm_page_enc_bitmap {
> > + __u64 start_gfn;
> > + __u64 num_pages;
> > + union {
> > + void __user *enc_bitmap; /* one bit per page */
> > + __u64 padding2;
> > + };
> > +};
> > +
> > /* for KVM_CLEAR_DIRTY_LOG */
> > struct kvm_clear_dirty_log {
> > __u32 slot;
> > @@ -1563,6 +1573,8 @@ struct kvm_pv_cmd {
> > /* Available with KVM_CAP_DIRTY_LOG_RING */
> > #define KVM_RESET_DIRTY_RINGS _IO(KVMIO, 0xc7)
> >
> > +#define KVM_GET_PAGE_ENC_BITMAP _IOW(KVMIO, 0xc6, struct kvm_page_enc_bitmap)
>
> I see that kvm/next already defines ioctls numbered 0xc6 and 0xc7. Wouldn't
> these new ioctls (KVM_GET_PAGE_ENC_BITMAP, KVM_SET_PAGE_ENC_BITMAP) collide?
>

Yes, but they will be fixed in the next version of the patch-set that I am
going to post.

Thanks for your feedback.
Ashish

>
> > +
> > /* Secure Encrypted Virtualization command */
> > enum sev_cmd_id {
> > /* Guest initialization commands */
> >
>
> -Dov