RE: [RFC 11/11] scsi: storvsc: Support PAGE_SIZE larger than 4K

From: Michael Kelley
Date: Wed Jul 22 2020 - 20:13:16 EST


From: Boqun Feng <boqun.feng@xxxxxxxxx> Sent: Monday, July 20, 2020 6:42 PM
>
> Hyper-V always uses a 4k page size (HV_HYP_PAGE_SIZE), so when
> communicating with Hyper-V, a guest should always use HV_HYP_PAGE_SIZE
> as the unit for page-related data. For storvsc, that data is the
> vmbus_packet_mpb_array. And since scsi_cmnd uses an sglist of pages (in
> units of PAGE_SIZE), we need to convert the pages in the sglist of
> scsi_cmnd into Hyper-V pages in vmbus_packet_mpb_array.
>
> This patch does the conversion by dividing pages in the sglist into
> Hyper-V pages; the offset and indexes in vmbus_packet_mpb_array are
> recalculated accordingly.
>
> Signed-off-by: Boqun Feng <boqun.feng@xxxxxxxxx>
> ---
> drivers/scsi/storvsc_drv.c | 27 +++++++++++++++++++++------
> 1 file changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> index fb41636519ee..c54d25f279bc 100644
> --- a/drivers/scsi/storvsc_drv.c
> +++ b/drivers/scsi/storvsc_drv.c
> @@ -1561,7 +1561,7 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd)
> struct hv_host_device *host_dev = shost_priv(host);
> struct hv_device *dev = host_dev->dev;
> struct storvsc_cmd_request *cmd_request = scsi_cmd_priv(scmnd);
> - int i;
> + int i, j, k;
> struct scatterlist *sgl;
> unsigned int sg_count = 0;
> struct vmscsi_request *vm_srb;
> @@ -1569,6 +1569,8 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd)
> struct vmbus_packet_mpb_array *payload;
> u32 payload_sz;
> u32 length;
> + int subpage_idx = 0;
> + unsigned int hvpg_count = 0;
>
> if (vmstor_proto_version <= VMSTOR_PROTO_VERSION_WIN8) {
> /*
> @@ -1643,23 +1645,36 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd)
> payload_sz = sizeof(cmd_request->mpb);
>
> if (sg_count) {
> - if (sg_count > MAX_PAGE_BUFFER_COUNT) {
> + hvpg_count = sg_count * (PAGE_SIZE / HV_HYP_PAGE_SIZE);

The above calculation doesn't take into account the offset in the first
sglist entry or the overall length of the transfer, so the value of
hvpg_count could be quite a bit bigger than it needs to be. For example,
with a 64K page size and an 8 Kbyte transfer that starts at offset 60K
in the first page, hvpg_count will be 32 when it really only needs to
be 2.
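
Something like the following (just an untested sketch, reusing the
HV_HYP_PAGE_* definitions this patch already relies on) would size the
pfn_array based on what the transfer actually touches:

	/*
	 * Sketch only: count the Hyper-V pages spanned by the transfer,
	 * from the offset within the first Hyper-V page of the first
	 * sglist entry through the end of the 'length' bytes.
	 */
	hvpg_count = DIV_ROUND_UP((sgl->offset & ~HV_HYP_PAGE_MASK) + length,
				  HV_HYP_PAGE_SIZE);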

The nested loops below that populate the pfn_array take the offset into
account when starting, so that's good. But they will potentially leave
allocated entries unused. Furthermore, the nested loops could terminate
early once enough Hyper-V size pages have been mapped to PFNs to cover
the length of the transfer, even if not all of the last guest size page
has been mapped to PFNs. Like the offset at the beginning of the first
guest size page in the sglist, there's potentially an unused portion at
the end of the last guest size page in the sglist.
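
For example (again only a rough, untested sketch, assuming hvpg_count
has been computed as the exact number of Hyper-V pages needed, as
suggested above), both loops could also bound on k:

	for (i = 0, k = 0; i < sg_count && k < hvpg_count; i++) {
		/* Stop as soon as the transfer length is covered. */
		for (j = subpage_idx;
		     j < (PAGE_SIZE / HV_HYP_PAGE_SIZE) && k < hvpg_count;
		     j++)
			payload->range.pfn_array[k++] =
				page_to_hvpfn(sg_page(cur_sgl)) + j;
		cur_sgl = sg_next(cur_sgl);
		subpage_idx = 0;
	}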

> + if (hvpg_count > MAX_PAGE_BUFFER_COUNT) {
>
> - payload_sz = (sg_count * sizeof(u64) +
> + payload_sz = (hvpg_count * sizeof(u64) +
> sizeof(struct vmbus_packet_mpb_array));
> payload = kzalloc(payload_sz, GFP_ATOMIC);
> if (!payload)
> return SCSI_MLQUEUE_DEVICE_BUSY;
> }
>
> + /*
> + * sgl is a list of PAGEs, and payload->range.pfn_array
> + * expects page numbers in units of HV_HYP_PAGE_SIZE (the
> + * page size that Hyper-V uses), so here we need to divide PAGEs
> + * into HV_HYP_PAGEs in case PAGE_SIZE > HV_HYP_PAGE_SIZE.
> + */
> payload->range.len = length;
> - payload->range.offset = sgl[0].offset;
> + payload->range.offset = sgl[0].offset & ~HV_HYP_PAGE_MASK;
> + subpage_idx = sgl[0].offset >> HV_HYP_PAGE_SHIFT;
>
> cur_sgl = sgl;
> + k = 0;
> for (i = 0; i < sg_count; i++) {
> - payload->range.pfn_array[i] =
> - page_to_pfn(sg_page((cur_sgl)));
> + for (j = subpage_idx; j < (PAGE_SIZE / HV_HYP_PAGE_SIZE); j++) {

In the case where PAGE_SIZE == HV_HYP_PAGE_SIZE, would it help the compiler
eliminate the loop if local variable j is declared as unsigned? In that case
the loop test in the for statement will always be false after the first
iteration.
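
In other words, something like this (sketch only):

	/*
	 * With j unsigned, the inner loop body can run at most once when
	 * PAGE_SIZE == HV_HYP_PAGE_SIZE, which the compiler may be able
	 * to exploit.
	 */
	unsigned int j;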

> + payload->range.pfn_array[k] =
> + page_to_hvpfn(sg_page((cur_sgl))) + j;
> + k++;
> + }
> cur_sgl = sg_next(cur_sgl);
> + subpage_idx = 0;
> }
> }
>
> --
> 2.27.0