RE: [RFC 03/11] Drivers: hv: vmbus: Introduce types of GPADL

From: Michael Kelley
Date: Wed Jul 22 2020 - 19:25:24 EST


From: Boqun Feng <boqun.feng@xxxxxxxxx> Sent: Monday, July 20, 2020 6:41 PM
>
> This patch introduces two types of GPADL: HV_GPADL_{BUFFER, RING}. The
> types of GPADL are purely a guest-side concept, IOW the hypervisor
> treats them the same.
>
> The reason for introducing the types of GPADL is to support guests whose
> page size is not 4k (the page size of the Hyper-V hypervisor). In these
> guests, both the headers and the data parts of the ringbuffers need to
> be aligned to PAGE_SIZE, because 1) some of the ringbuffers will be
> mapped into userspace and 2) we use a "double mapping" mechanism to
> support fast wrap-around, and "double mapping" relies on ringbuffers
> being page-aligned. However, the Hyper-V hypervisor only uses 4k
> (HV_HYP_PAGE_SIZE) headers. Our solution is to always make the
> headers of ringbuffers take one guest page, and when the GPADL is
> established between the guest and hypervisor, only the first 4k of the
> header is used. To handle this special case, we need the types of GPADL
> to distinguish the different guest memory layouts used for a GPADL.
>
> A type enum is introduced along with several general interfaces to
> describe the differences between a normal buffer GPADL and a ringbuffer
> GPADL.
>
> Signed-off-by: Boqun Feng <boqun.feng@xxxxxxxxx>
> ---
> drivers/hv/channel.c | 140 +++++++++++++++++++++++++++++++++++------
> include/linux/hyperv.h | 44 ++++++++++++-
> 2 files changed, 164 insertions(+), 20 deletions(-)
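
One minor comment on the description: with a 64K PAGE_SIZE, as I
understand it, each ring buffer header still occupies a full 64K guest
page (so that the data part stays page-aligned for the double mapping),
but only the first 4K of that page is put into the GPADL. So for a ring
made of one header page plus four data pages, the guest allocates
5 * 64K bytes while the GPADL describes 1 + 4 * 16 = 65 HV_HYP_PAGEs.
If that's right, spelling out a concrete example like this in the
commit message would make the scheme easier to follow.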

[snip]

>
>
> @@ -437,7 +528,17 @@ static int __vmbus_open(struct vmbus_channel *newchannel,
> open_msg->openid = newchannel->offermsg.child_relid;
> open_msg->child_relid = newchannel->offermsg.child_relid;
> open_msg->ringbuffer_gpadlhandle = newchannel->ringbuffer_gpadlhandle;
> - open_msg->downstream_ringbuffer_pageoffset = newchannel->ringbuffer_send_offset;
> + /*
> + * The unit of ->downstream_ringbuffer_pageoffset is HV_HYP_PAGE and
> + * the unit of ->ringbuffer_send_offset is PAGE, so here we first
> + * convert it to bytes and then to HV_HYP_PAGE units. Also
> + * ->ringbuffer_send_offset is the offset in the guest, while
> + * ->downstream_ringbuffer_pageoffset is the offset in the gpadl (i.e.
> + * in the hypervisor), so a (PAGE_SIZE - HV_HYP_PAGE_SIZE) gap needs to
> + * be skipped.
> + */
> + open_msg->downstream_ringbuffer_pageoffset =
> + ((newchannel->ringbuffer_send_offset << PAGE_SHIFT) - (PAGE_SIZE - HV_HYP_PAGE_SIZE)) >> HV_HYP_PAGE_SHIFT;

I couldn't find that the "downstream_ringbuffer_pageoffset" field
is used anywhere. Can it just be deleted entirely instead of having
this really messy calculation?
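
If the field does need to stay, a worked example of the unit conversion
would help future readers. My understanding of the math, as a sketch
with hypothetical numbers (PAGE_SIZE = 64K, HV_HYP_PAGE_SIZE = 4K, and
a made-up helper name, not proposed code):

	static u32 example_downstream_pageoffset(u32 send_offset)
	{
		/* Offset of the downstream ring in the guest mapping, in bytes */
		u64 bytes = (u64)send_offset << PAGE_SHIFT;

		/* The gpadl only contains the first 4K of the 64K header page */
		bytes -= PAGE_SIZE - HV_HYP_PAGE_SIZE;

		/* Convert from bytes to HV_HYP_PAGE units */
		return bytes >> HV_HYP_PAGE_SHIFT;
	}

With send_offset = 1 this gives (65536 - 61440) >> 12 = 1, i.e. the
downstream ring starts at gpadl page index 1, right after the single
4K of header that was put into the gpadl.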

> open_msg->target_vp = newchannel->target_vp;
>
> if (userdatalen)
> @@ -497,6 +598,7 @@ static int __vmbus_open(struct vmbus_channel *newchannel,
> return err;
> }
>
> +

Spurious add of a blank line?

> /*
> * vmbus_connect_ring - Open the channel but reuse ring buffer
> */
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index 692c89ccf5df..663f0a016237 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -29,6 +29,48 @@
>
> #pragma pack(push, 1)
>
> +/*
> + * Types of GPADL, which decide how the GPADL header is created.
> + *
> + * There is not much difference between BUFFER and RING if PAGE_SIZE is the
> + * same as HV_HYP_PAGE_SIZE.
> + *
> + * If PAGE_SIZE is bigger than HV_HYP_PAGE_SIZE, the headers of ring buffers
> + * will be PAGE_SIZE, however, only the first HV_HYP_PAGE will be put
> + * into the gpadl; therefore the number of HV_HYP_PAGEs and the indexes of
> + * each HV_HYP_PAGE will differ between the types of GPADL. For example,
> + * if PAGE_SIZE is 64K:
> + *
> + * BUFFER:
> + *
> + * gva: |-- 64k --|-- 64k --| ... |
> + * gpa: | 4k | 4k | ... | 4k | 4k | 4k | ... | 4k |
> + * index: 0 1 2 15 16 17 18 .. 31 32 ...
> + * | | ... | | | ... | ...
> + * v V V V V V
> + * gpadl: | 4k | 4k | ... | 4k | 4k | 4k | ... | 4k | ... |
> + * index: 0 1 2 ... 15 16 17 18 .. 31 32 ...
> + *
> + * RING:
> + *
> + * | header | data | header | data |
> + * gva: |-- 64k --|-- 64k --| ... |-- 64k --|-- 64k --| ... |
> + * gpa: | 4k | .. | 4k | 4k | ... | 4k | ... | 4k | .. | 4k | .. | ... |
> + * index: 0 1 16 17 18 31 ... n n+1 n+16 ... 2n
> + * | / / / | / /
> + * | / / / | / /
> + * | / / ... / ... | / ... /
> + * | / / / | / /
> + * | / / / | / /
> + * V V V V V V v
> + * gpadl: | 4k | 4k | ... | ... | 4k | 4k | ... |
> + * index: 0 1 2 ... 16 ... n-15 n-14 n-13 ... 2n-30
> + */
> +enum hv_gpadl_type {
> + HV_GPADL_BUFFER,
> + HV_GPADL_RING
> +};
> +
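
The diagrams read fine to me. Just to confirm my understanding of the
RING mapping in code form (a rough sketch; the helper name is made up
and it only covers a single ring of one header page plus data pages):

	/*
	 * Map a guest HV_HYP_PAGE index within one ring (header page
	 * first, then data pages) to its index in the gpadl.  Only the
	 * first 4K sub-page of the 64K header page is present in the
	 * gpadl; the remaining 15 sub-pages are skipped.
	 */
	static u32 example_ring_gpa_to_gpadl_idx(u32 i)
	{
		u32 per_page = PAGE_SIZE / HV_HYP_PAGE_SIZE;	/* 16 for 64K pages */

		if (i < per_page)		/* somewhere in the header page */
			return 0;
		return i - (per_page - 1);	/* data sub-pages shift down by 15 */
	}

which matches the diagram: guest index 16 lands at gpadl index 1, and
guest index 31 at gpadl index 16. For a BUFFER gpadl the indexes are
identical on both sides.
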
> /* Single-page buffer */
> struct hv_page_buffer {
> u32 len;
> @@ -111,7 +153,7 @@ struct hv_ring_buffer {
> } feature_bits;
>
> /* Pad it to PAGE_SIZE so that data starts on page boundary */
> - u8 reserved2[4028];
> + u8 reserved2[PAGE_SIZE - 68];
>
> /*
> * Ring data starts here + RingDataStartOffset
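
Not directly about this patch, but since the "68" here is the size of
the fields preceding reserved2[], it might be worth a compile-time
check (if one doesn't already exist) so the padding doesn't silently
break when a field is added, e.g. something along the lines of:

	/* Sketch only: keep the ring buffer header exactly one guest page */
	static_assert(sizeof(struct hv_ring_buffer) == PAGE_SIZE);

or the equivalent BUILD_BUG_ON() in the ring buffer setup code.
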
> --
> 2.27.0