Re: [PATCH v2] vdpa/mlx5: Allow CVQ size changes

From: Eugenio Perez Martin
Date: Tue Feb 27 2024 - 05:50:41 EST


On Mon, Feb 19, 2024 at 2:09 AM Lei Yang <leiyang@xxxxxxxxxx> wrote:
>
> QE tested V2 of this patch. QEMU no longer prints the error message
> "qemu-system-x86_64: Insufficient written data (0)" after enabling and
> disabling multiqueue multiple times inside the guest. The tests pass
> both with "x-svq=on" and without it.
>
> Tested-by: Lei Yang <leiyang@xxxxxxxxxx>
>
> On Fri, Feb 16, 2024 at 10:25 PM Jonah Palmer <jonah.palmer@xxxxxxxxxx> wrote:
> >
> > The MLX driver was not updating its control virtqueue size at set_vq_num
> > and instead always initialized to MLX5_CVQ_MAX_ENT (16) at
> > setup_cvq_vring.
> >
> > Qemu would try to set the size to 64 by default, however, because the
> > CVQ size always was initialized to 16, an error would be thrown when
> > sending >16 control messages (as used-ring entry 17 is initialized to 0).
> > For example, starting a guest with x-svq=on and then executing the
> > following command would produce the error below:
> >
> > # for i in {1..20}; do ifconfig eth0 hw ether XX:xx:XX:xx:XX:XX; done
> >
> > qemu-system-x86_64: Insufficient written data (0)
> > [ 435.331223] virtio_net virtio0: Failed to set mac address by vq command.
> > SIOCSIFHWADDR: Invalid argument
> >

Also,

Fixes: 5262912ef3cf ("vdpa/mlx5: Add support for control VQ and MAC setting")

> > Acked-by: Dragos Tatulea <dtatulea@xxxxxxxxxx>
> > Acked-by: Eugenio Pérez <eperezma@xxxxxxxxxx>
> > Signed-off-by: Jonah Palmer <jonah.palmer@xxxxxxxxxx>
> > ---
> > drivers/vdpa/mlx5/net/mlx5_vnet.c | 13 +++++++++----
> > 1 file changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > index 778821bab7d9..ecfc16151d61 100644
> > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > @@ -151,8 +151,6 @@ static void teardown_driver(struct mlx5_vdpa_net *ndev);
> >
> > static bool mlx5_vdpa_debug;
> >
> > -#define MLX5_CVQ_MAX_ENT 16
> > -
> > #define MLX5_LOG_VIO_FLAG(_feature) \
> > 	do { \
> > 		if (features & BIT_ULL(_feature)) \
> > @@ -2276,9 +2274,16 @@ static void mlx5_vdpa_set_vq_num(struct vdpa_device *vdev, u16 idx, u32 num)
> > 	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> > 	struct mlx5_vdpa_virtqueue *mvq;
> >
> > -	if (!is_index_valid(mvdev, idx) || is_ctrl_vq_idx(mvdev, idx))
> > +	if (!is_index_valid(mvdev, idx))
> > 		return;
> >
> > +	if (is_ctrl_vq_idx(mvdev, idx)) {
> > +		struct mlx5_control_vq *cvq = &mvdev->cvq;
> > +
> > +		cvq->vring.vring.num = num;
> > +		return;
> > +	}
> > +
> > 	mvq = &ndev->vqs[idx];
> > 	mvq->num_ent = num;
> > }
> > @@ -2963,7 +2968,7 @@ static int setup_cvq_vring(struct mlx5_vdpa_dev *mvdev)
> > 	u16 idx = cvq->vring.last_avail_idx;
> >
> > 	err = vringh_init_iotlb(&cvq->vring, mvdev->actual_features,
> > -				MLX5_CVQ_MAX_ENT, false,
> > +				cvq->vring.vring.num, false,
> > 				(struct vring_desc *)(uintptr_t)cvq->desc_addr,
> > 				(struct vring_avail *)(uintptr_t)cvq->driver_addr,
> > 				(struct vring_used *)(uintptr_t)cvq->device_addr);
> > --
> > 2.39.3
> >
>
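
For anyone who wants to see the failure mode from the commit message in
isolation, here is a small userspace toy model, not driver code. It
assumes the device side wraps its used index with one ring size while
the driver side wraps with another. The struct, the function name
first_failed_cmd() and the "one status byte" length are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 64 /* illustrative: size the driver side asked for */

/* Simplified used-ring element; only the written length matters here. */
struct used_elem {
	uint32_t id;
	uint32_t len;
};

/*
 * Simulate `msgs` control commands. The device wraps its used index with
 * `dev_num`, the driver with `drv_num`. Returns the 1-based number of the
 * first command whose used entry the driver reads with len == 0, or 0 if
 * every command completes cleanly.
 */
static unsigned int first_failed_cmd(unsigned int dev_num,
				     unsigned int drv_num,
				     unsigned int msgs)
{
	struct used_elem ring[RING_SLOTS];
	unsigned int i;

	assert(dev_num && dev_num <= RING_SLOTS);
	assert(drv_num && drv_num <= RING_SLOTS);
	memset(ring, 0, sizeof(ring));

	for (i = 0; i < msgs; i++) {
		/* Device publishes the completion where it thinks "next" is. */
		ring[i % dev_num].id = i;
		ring[i % dev_num].len = 1; /* one status byte (illustrative) */

		/* Driver consumes from where it thinks "next" is. */
		if (ring[i % drv_num].len == 0)
			return i + 1;

		/* Simulation only: real drivers track the used index instead. */
		ring[i % drv_num].len = 0;
	}
	return 0;
}
```

With a device size of 16 and a driver size of 64, the 17th command is
the first one that reads a zero length, which matches the "Insufficient
written data (0)" symptom. With both sides at 64 no command fails.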

There is another related issue in both mlx5 and vdpa_sim, although I
think it does not cause any direct bug. Both return a hardcoded 256
from .get_vq_num_max, yet both accept bigger sizes in set_vq_num.

QEMU just never calls .get_vq_num_max, so it does not forward this
maximum to the guest.

To be aligned with the VirtIO standard, it should return the actual
maximum, which I think is bounded only by the uint16_t maximum in the
packed case and by half of that in the split case, because of the
requirement that the size be a power of 2. That is a very big value,
however, so I think the right solution is to allow specifying this
maximum in the vdpa command line tool.
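
As a rough sketch of what validating against a configurable maximum
could look like: the helper names below are hypothetical and the cap is
passed in as a parameter, because the exact ceiling should be checked
against the spec rather than taken from my memory. The only rule encoded
is the power-of-2 requirement for split rings; I am assuming packed
rings have no such requirement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if `num` is a usable queue size under `max`. */
static bool vq_num_is_valid(uint32_t num, uint32_t max, bool packed)
{
	if (num == 0 || num > max)
		return false;
	/* Split rings need a power-of-two size (assumed not so for packed). */
	if (!packed && (num & (num - 1)) != 0)
		return false;
	return true;
}

/* Hypothetical helper: largest valid queue size not exceeding `max`. */
static uint32_t vq_num_clamp(uint32_t max, bool packed)
{
	uint32_t n = 1;

	if (max == 0)
		return 0;
	if (packed)
		return max;

	/* Round down to the nearest power of two without overflowing. */
	while (n <= max / 2)
		n <<= 1;

	assert(vq_num_is_valid(n, max, false));
	return n;
}
```

Clamping a 16-bit cap this way gives the "half" mentioned above for the
split case, and a user-supplied maximum from a management tool could be
run through the same check before it is handed to the device.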

Moreover, the VirtIO standard allows the device to set a different
maximum queue size per virtqueue, something the vdpa ops do not allow:
.get_vq_num_max is a per-device op, so it cannot tell queues apart.
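
To make the per-device versus per-queue distinction concrete, here is a
toy ops table. None of this is the real vdpa API: struct toy_ops, the
get_vq_num_max_idx callback and the 64/256 values are hypothetical and
only illustrate how an optional per-queue callback could fall back to
the per-device one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_dev;

/* Hypothetical ops table; not the kernel's vdpa_config_ops. */
struct toy_ops {
	/* Per-device maximum: one value shared by every queue. */
	uint16_t (*get_vq_num_max)(struct toy_dev *dev);
	/* Hypothetical per-queue variant; may be NULL. */
	uint16_t (*get_vq_num_max_idx)(struct toy_dev *dev, uint16_t idx);
};

struct toy_dev {
	const struct toy_ops *ops;
	uint16_t cvq_idx; /* index of the control virtqueue */
};

/* Prefer the per-queue answer, fall back to the per-device one. */
static uint16_t toy_vq_num_max(struct toy_dev *dev, uint16_t idx)
{
	assert(dev && dev->ops && dev->ops->get_vq_num_max);

	if (dev->ops->get_vq_num_max_idx)
		return dev->ops->get_vq_num_max_idx(dev, idx);
	return dev->ops->get_vq_num_max(dev);
}

/* Example device: data queues allow 256, the control queue allows 64. */
static uint16_t ex_max(struct toy_dev *dev)
{
	(void)dev;
	return 256;
}

static uint16_t ex_max_idx(struct toy_dev *dev, uint16_t idx)
{
	return idx == dev->cvq_idx ? 64 : 256;
}
```

A device that only fills in the per-device callback keeps today's
behaviour, while one that also fills in the per-queue callback can
report a smaller control queue.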

Having said that, maybe it is not worth all the trouble, as it has not
been reported to cause any issue?

Thanks!