Re: [PATCH v3] virtio-net: Fix probe of virtio-net on kvmtool

From: Michael S. Tsirkin
Date: Wed Mar 01 2023 - 09:45:12 EST


On Wed, Mar 01, 2023 at 01:59:52PM +0000, Rob Bradford via B4 Relay wrote:
> From: Rob Bradford <rbradford@xxxxxxxxxxxx>
>
> Since the following commit virtio-net on kvmtool has printed a warning
> during the probe:
>
> commit dbcf24d153884439dad30484a0e3f02350692e4c
> Author: Jason Wang <jasowang@xxxxxxxxxx>
> Date: Tue Aug 17 16:06:59 2021 +0800
>
> virtio-net: use NETIF_F_GRO_HW instead of NETIF_F_LRO
>
> [ 1.865992] net eth0: Fail to set guest offload.
> [ 1.872491] virtio_net virtio2 eth0: set_features() failed (-22); wanted 0x0000000000134829, left 0x0080000000134829
>
> This is because during the probing the underlying netdev device has
> identified that the netdev features on the device has changed and
> attempts to update the virtio-net offloads through the virtio-net
> control queue. kvmtool however does not have a control queue that supports
> offload changing (VIRTIO_NET_F_CTRL_GUEST_OFFLOADS is not advertised)
>
> The netdev features have changed due to validation checks in
> netdev_fix_features():
>
> if (!(features & NETIF_F_RXCSUM)) {
> /* NETIF_F_GRO_HW implies doing RXCSUM since every packet
> * successfully merged by hardware must also have the
> * checksum verified by hardware. If the user does not
> * want to enable RXCSUM, logically, we should disable GRO_HW.
> */
> if (features & NETIF_F_GRO_HW) {
> netdev_dbg(dev, "Dropping NETIF_F_GRO_HW since no RXCSUM feature.\n");
> features &= ~NETIF_F_GRO_HW;
> }
> }
>
> Since kvmtool does not advertise the VIRTIO_NET_F_GUEST_CSUM feature the
> NETIF_F_RXCSUM bit is not present and so the NETIF_F_GRO_HW bit is
> cleared. This results in the netdev features changing, which triggers
> the attempt to reprogram the virtio-net offloads which then fails.
>
> This commit prevents that set of netdev features from changing by
> preemptively applying the same validation and only setting
> NETIF_F_GRO_HW if NETIF_F_RXCSUM is set because the device supports both
> VIRTIO_NET_F_GUEST_CSUM and VIRTIO_NET_F_GUEST_TSO{4,6}
>
> Signed-off-by: Rob Bradford <rbradford@xxxxxxxxxxxx>
> ---
> Changes in v3:
> - Identified root-cause of feature bit changing and updated conditions
> check
> - Link to v2: https://lore.kernel.org/r/20230223-virtio-net-kvmtool-v2-1-8ec93511e67f@xxxxxxxxxxxx
>
> Changes in v2:
> - Use parentheses to group logical OR of features
> - Link to v1:
> https://lore.kernel.org/r/20230223-virtio-net-kvmtool-v1-1-fc23d29b9d7a@xxxxxxxxxxxx
> ---
> drivers/net/virtio_net.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 61e33e4dd0cd..2e7705142ca5 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3778,11 +3778,13 @@ static int virtnet_probe(struct virtio_device *vdev)
> dev->features |= dev->hw_features & NETIF_F_ALL_TSO;
> /* (!csum && gso) case will be fixed by register_netdev() */
> }
> - if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_CSUM))
> + if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_CSUM)) {
> dev->features |= NETIF_F_RXCSUM;
> - if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) ||
> - virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO6))
> - dev->features |= NETIF_F_GRO_HW;
> + /* This dependency is enforced by netdev_fix_features */
> + if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) ||
> + virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO6))
> + dev->features |= NETIF_F_GRO_HW;
> + }
> if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS))
> dev->hw_features |= NETIF_F_GRO_HW;
>

I see. It is annoying that we are duplicating the logic from
netdev_fix_features here though :(
Maybe we should call netdev_update_features, in the callback check
the flags and decide what to set and what to clear?
Or export netdev_fix_features to modules?



Also re-reading Documentation/networking/netdev-features.rst -

1. netdev->hw_features set contains features whose state may possibly
be changed (enabled or disabled) for a particular device by user's
request. This set should be initialized in ndo_init callback and not
changed later.

2. netdev->features set contains features which are currently enabled
for a device. This should be changed only by network core or in
error paths of ndo_set_features callback.


is it then wrong that virtio sets NETIF_F_RXCSUM and NETIF_F_GRO_HW in
dev->features and not in dev->hw_features? We set it there because
without ctrl guest offload these can not be changed.
I suspect this is just a minor documentation bug yes? Maybe devices
where features can't be cleared are uncommon.

Also:
if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS))
dev->hw_features |= NETIF_F_GRO_HW;

but should we not set NETIF_F_RXCSUM there too?



> ---
> base-commit: c39cea6f38eefe356d64d0bc1e1f2267e282cdd3
> change-id: 20230223-virtio-net-kvmtool-87f37515be22
>
> Best regards,
> --
> Rob Bradford <rbradford@xxxxxxxxxxxx>