Re: [PATCH 2/2] vdpa: mlx5: synchronize driver status with CVQ

From: Jason Wang
Date: Mon Mar 21 2022 - 03:36:37 EST


On Mon, Mar 21, 2022 at 3:24 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
>
> On Mon, Mar 21, 2022 at 02:04:29PM +0800, Jason Wang wrote:
> > Currently, CVQ doesn't have any synchronization with the driver
> > status. Then CVQ emulation code run in the middle of:
> >
> > 1) device reset
> > 2) device status changed
> > 3) map updating
> >
> > The will lead several unexpected issue like trying to execute CVQ
> > command after the driver has been teared down.
> >
> > Fixing this by using reslock to synchronize CVQ emulation code with
> > the driver status changing:
> >
> > - protect the whole device reset, status changing and map updating
> > with reslock
> > - protect the CVQ handler with the reslock and check
> > VIRTIO_CONFIG_S_DRIVER_OK in the CVQ handler
> >
> > This will guarantee that:
> >
> > 1) CVQ handler won't work if VIRTIO_CONFIG_S_DRIVER_OK is not set
> > 2) CVQ handler will see a consistent state of the driver instead of
> > the partial one when it is running in the middle of the
> > teardown_driver() or setup_driver().
> >
> > Cc: 5262912ef3cfc ("vdpa/mlx5: Add support for control VQ and MAC setting")
> > Signed-off-by: Jason Wang <jasowang@xxxxxxxxxx>
> > ---
> > drivers/vdpa/mlx5/net/mlx5_vnet.c | 42 +++++++++++++++++++++++--------
> > 1 file changed, 31 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > index d5a6fb3f9c41..524240f55c1c 100644
> > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > @@ -1618,11 +1618,17 @@ static void mlx5_cvq_kick_handler(struct work_struct *work)
> > mvdev = wqent->mvdev;
> > ndev = to_mlx5_vdpa_ndev(mvdev);
> > cvq = &mvdev->cvq;
> > +
> > + mutex_lock(&ndev->reslock);
> > +
> > + if (!(mvdev->status & VIRTIO_CONFIG_S_DRIVER_OK))
> > + goto done;
> > +
> > if (!(ndev->mvdev.actual_features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ)))
> > - return;
> > + goto done;
> >
> > if (!cvq->ready)
> > - return;
> > + goto done;
> >
> > while (true) {
> > err = vringh_getdesc_iotlb(&cvq->vring, &cvq->riov, &cvq->wiov, &cvq->head,
> > @@ -1663,6 +1669,9 @@ static void mlx5_cvq_kick_handler(struct work_struct *work)
> > break;
> > }
> > }
> > +
> > +done:
> > + mutex_unlock(&ndev->reslock);
> > }
> >
> > static void mlx5_vdpa_kick_vq(struct vdpa_device *vdev, u16 idx)
> > @@ -2125,6 +2134,8 @@ static int mlx5_vdpa_change_map(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb
> > struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> > int err;
> >
> > + mutex_lock(&ndev->reslock);
> > +
> > suspend_vqs(ndev);
> > err = save_channels_info(ndev);
> > if (err)
> > @@ -2137,18 +2148,20 @@ static int mlx5_vdpa_change_map(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb
> > goto err_mr;
> >
> > if (!(mvdev->status & VIRTIO_CONFIG_S_DRIVER_OK))
> > - return 0;
> > + goto err_mr;
> >
> > restore_channels_info(ndev);
> > err = setup_driver(mvdev);
> > if (err)
> > goto err_setup;
> >
> > + mutex_unlock(&ndev->reslock);
> > return 0;
> >
> > err_setup:
> > mlx5_vdpa_destroy_mr(mvdev);
> > err_mr:
> > + mutex_unlock(&ndev->reslock);
> > return err;
> > }
> >
> > @@ -2157,7 +2170,8 @@ static int setup_driver(struct mlx5_vdpa_dev *mvdev)
> > struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> > int err;
> >
> > - mutex_lock(&ndev->reslock);
> > + WARN_ON(!mutex_is_locked(&ndev->reslock));
> > +
> > if (ndev->setup) {
> > mlx5_vdpa_warn(mvdev, "setup driver called for already setup driver\n");
> > err = 0;
>
>
> Maybe also add a comment near function header explaining this must be
> called with lock held.

Will do.

Thanks

>