RE: [PATCH] scsi: storvsc: Prefer returning channel with the same CPU as on the I/O issuing CPU

From: Long Li

Date: Tue Oct 07 2025 - 20:56:00 EST




> -----Original Message-----
> From: Michael Kelley <mhklinux@xxxxxxxxxxx>
> Sent: Tuesday, October 7, 2025 8:42 AM
> To: longli@xxxxxxxxxxxxxxxxxxx; KY Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang
> Zhang <haiyangz@xxxxxxxxxxxxx>; Wei Liu <wei.liu@xxxxxxxxxx>; Dexuan Cui
> <decui@xxxxxxxxxxxxx>; James E.J. Bottomley
> <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>; Martin K. Petersen
> <martin.petersen@xxxxxxxxxx>; James Bottomley <JBottomley@xxxxxxxx>;
> linux-hyperv@xxxxxxxxxxxxxxx; linux-scsi@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx
> Cc: Long Li <longli@xxxxxxxxxxxxx>
> Subject: [EXTERNAL] RE: [PATCH] scsi: storvsc: Prefer returning channel with the
> same CPU as on the I/O issuing CPU
>
> From: longli@xxxxxxxxxxxxxxxxxxx <longli@xxxxxxxxxxxxxxxxxxx> Sent:
> Wednesday, October 1, 2025 10:06 PM
> >
> > When selecting an outgoing channel for I/O, storvsc tries to select a
> > channel with a returning CPU that is not the same as issuing CPU. This
> > worked well in the past, however it doesn't work well when the Hyper-V
> > exposes a large number of channels (up to the number of all CPUs). Use
> > a different CPU for returning channel is not efficient on Hyper-V.
> >
> > Change this behavior by preferring to the channel with the same CPU as
> > the current I/O issuing CPU whenever possible.
> >
> > Tests have shown improvements in newer Hyper-V/Azure environment, and
> > no regression with older Hyper-V/Azure environments.
> >
> > Tested-by: Raheel Abdul Faizy <rabdulfaizy@xxxxxxxxxxxxx>
> > Signed-off-by: Long Li <longli@xxxxxxxxxxxxx>
> > ---
> > drivers/scsi/storvsc_drv.c | 96
> > ++++++++++++++++++--------------------
> > 1 file changed, 45 insertions(+), 51 deletions(-)
> >
> > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> > index d9e59204a9c3..092939791ea0 100644
> > --- a/drivers/scsi/storvsc_drv.c
> > +++ b/drivers/scsi/storvsc_drv.c
> > @@ -1406,14 +1406,19 @@ static struct vmbus_channel *get_og_chn(struct
> storvsc_device *stor_device,
> > }
> >
> > /*
> > - * Our channel array is sparsley populated and we
> > + * Our channel array could be sparsley populated and we
> > * initiated I/O on a processor/hw-q that does not
> > * currently have a designated channel. Fix this.
> > * The strategy is simple:
> > - * I. Ensure NUMA locality
> > - * II. Distribute evenly (best effort)
> > + * I. Prefer the channel associated with the current CPU
> > + * II. Ensure NUMA locality
> > + * III. Distribute evenly (best effort)
> > */
> >
> > + /* Prefer the channel on the I/O issuing processor/hw-q */
> > + if (cpumask_test_cpu(q_num, &stor_device->alloced_cpus))
> > + return stor_device->stor_chns[q_num];
> > +
>
> Hmmm. When get_og_chn() is called, we know that stor_device-
> >stor_chns[q_num] is NULL since storvsc_do_io() has already handled the non-
> NULL case. And the checks are all done with stor_device->lock held, so the
> stor_chns array can't change.
> Hence the above code will return NULL, which will cause a NULL reference when
> storvsc_do_io() sends out the VMBus packet.
>
> My recollection is that get_og_chan() is called when there is no channel that
> interrupts the current CPU (that's what it means for stor_device-
> >stor_chns[<current CPU>] to be NULL). So the algorithm must pick a channel
> that interrupts some other CPU, preferably a CPU in the current NUMA node.
> Adding code to prefer the channel associated with the current CPU doesn't make
> sense in get_og_chn(), as get_og_chn() is only called when it is already known
> that there is no such channel.

The initial values for stor_chns[] and alloced_cpus are set in storvsc_channel_init() (for primary channel) and handle_sc_creation() (for subchannels).

As a result, the check for cpumask_test_cpu(q_num, &stor_device->alloced_cpus) will guarantee we are getting a channel. If the check fails, the code follows the old behavior to find a channel.

This check is needed because storvsc supports change_target_cpu_callback() callback via vmbus.

Thanks,

Long

>
> Or is there a case that I'm missing? Regardless, the above code seems
> problematic because it would return NULL.
>
> Michael