Re: [PATCH] virtio-net: fix race between ndo_open() and virtio_device_ready()

From: Michael S. Tsirkin
Date: Fri Jun 17 2022 - 06:13:09 EST


On Fri, Jun 17, 2022 at 03:29:49PM +0800, Jason Wang wrote:
> We used to call virtio_device_ready() after netdev registration. This
> causes a race between ndo_open() and virtio_device_ready(): if
> ndo_open() is called before virtio_device_ready(), the driver may
> start to use the device before DRIVER_OK is set, which violates the
> spec.
>
> Fix this by switching to register_netdevice() and protecting
> virtio_device_ready() with rtnl_lock(), to make sure ndo_open() can
> only be called after virtio_device_ready().
>
> Fixes: 4baf1e33d0842 ("virtio_net: enable VQs early")
> Signed-off-by: Jason Wang <jasowang@xxxxxxxxxx>
> ---
> drivers/net/virtio_net.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index db05b5e930be..8a5810bcb839 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3655,14 +3655,20 @@ static int virtnet_probe(struct virtio_device *vdev)
>  	if (vi->has_rss || vi->has_rss_hash_report)
>  		virtnet_init_default_rss(vi);
>
> -	err = register_netdev(dev);
> +	/* serialize netdev register + virtio_device_ready() with ndo_open() */
> +	rtnl_lock();
> +
> +	err = register_netdevice(dev);
>  	if (err) {
>  		pr_debug("virtio_net: registering device failed\n");
> +		rtnl_unlock();
>  		goto free_failover;
>  	}
>
>  	virtio_device_ready(vdev);
>
> +	rtnl_unlock();
> +
>  	err = virtnet_cpu_notif_add(vi);
>  	if (err) {
>  		pr_debug("virtio_net: registering cpu notifier failed\n");


Looks good but then don't we have the same issue when removing the
device?

Actually, I looked at virtnet_remove() and I see:

	unregister_netdev(vi->dev);

	net_failover_destroy(vi->failover);

	remove_vq_common(vi); <- this will reset the device

Is there a window here?


Really, I think what we had originally was a better idea:
instead of dropping interrupts, they were delayed, and
when the driver is ready to accept them it just enables them.
We just need to make sure the driver does not wait for
interrupts before enabling them.

And I suspect we need to make this opt-in on a per-driver
basis.
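
To make the "delayed, not dropped" idea above concrete, here is a
rough userspace model. It is only a sketch under stated assumptions:
struct model_vq and the model_*() functions are hypothetical names
invented for illustration, not kernel or virtio core APIs, and the
model latches a single pending interrupt rather than counting them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_vq {
	bool irq_enabled;	/* driver has said it can take callbacks */
	bool irq_pending;	/* an interrupt arrived while disabled */
	unsigned int handled;	/* number of callbacks actually run */
};

/* Stand-in for the driver's callback: it only counts invocations. */
void model_vq_callback(struct model_vq *vq)
{
	vq->handled++;
}

/* Device side: raise an interrupt. While disabled it is latched, not lost. */
void model_raise_irq(struct model_vq *vq)
{
	assert(vq != NULL);
	if (!vq->irq_enabled) {
		vq->irq_pending = true;
		return;
	}
	model_vq_callback(vq);
}

/* Driver side: enable callbacks and replay the latched interrupt, if any. */
void model_enable_irq(struct model_vq *vq)
{
	assert(vq != NULL);
	vq->irq_enabled = true;
	if (vq->irq_pending) {
		vq->irq_pending = false;
		model_vq_callback(vq);
	}
}
```

In this model, interrupts raised before model_enable_irq() are
coalesced into one callback that runs at enable time. A driver that
blocked waiting for a callback before calling model_enable_irq()
would never make progress, which is the constraint mentioned above.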



> --
> 2.25.1