Re: [PATCH v3] virtio-net: Fix probe of virtio-net on kvmtool

From: Jason Wang
Date: Thu Mar 02 2023 - 03:11:26 EST


On Wed, Mar 1, 2023 at 10:44 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
>
> On Wed, Mar 01, 2023 at 01:59:52PM +0000, Rob Bradford via B4 Relay wrote:
> > From: Rob Bradford <rbradford@xxxxxxxxxxxx>
> >
> > Since the following commit virtio-net on kvmtool has printed a warning
> > during the probe:
> >
> > commit dbcf24d153884439dad30484a0e3f02350692e4c
> > Author: Jason Wang <jasowang@xxxxxxxxxx>
> > Date: Tue Aug 17 16:06:59 2021 +0800
> >
> > virtio-net: use NETIF_F_GRO_HW instead of NETIF_F_LRO
> >
> > [ 1.865992] net eth0: Fail to set guest offload.
> > [ 1.872491] virtio_net virtio2 eth0: set_features() failed (-22); wanted 0x0000000000134829, left 0x0080000000134829
> >
> > This is because during the probing the underlying netdev device has
> > identified that the netdev features on the device has changed and
> > attempts to update the virtio-net offloads through the virtio-net
> > control queue. kvmtool however does not have a control queue that supports
> > offload changing (VIRTIO_NET_F_CTRL_GUEST_OFFLOADS is not advertised)
> >
> > The netdev features have changed due to validation checks in
> > netdev_fix_features():
> >
> > if (!(features & NETIF_F_RXCSUM)) {
> > /* NETIF_F_GRO_HW implies doing RXCSUM since every packet
> > * successfully merged by hardware must also have the
> > * checksum verified by hardware. If the user does not
> > * want to enable RXCSUM, logically, we should disable GRO_HW.
> > */
> > if (features & NETIF_F_GRO_HW) {
> > netdev_dbg(dev, "Dropping NETIF_F_GRO_HW since no RXCSUM feature.\n");
> > features &= ~NETIF_F_GRO_HW;
> > }
> > }
> >
> > Since kvmtool does not advertise the VIRTIO_NET_F_GUEST_CSUM feature the
> > NETIF_F_RXCSUM bit is not present and so the NETIF_F_GRO_HW bit is
> > cleared. This results in the netdev features changing, which triggers
> > the attempt to reprogram the virtio-net offloads which then fails.
> >
> > This commit prevents that set of netdev features from changing by
> > preemptively applying the same validation and only setting
> > NETIF_F_GRO_HW if NETIF_F_RXCSUM is set because the device supports both
> > VIRTIO_NET_F_GUEST_CSUM and VIRTIO_NET_F_GUEST_TSO{4,6}
> >
> > Signed-off-by: Rob Bradford <rbradford@xxxxxxxxxxxx>
> > ---
> > Changes in v3:
> > - Identified root-cause of feature bit changing and updated conditions
> > check
> > - Link to v2: https://lore.kernel.org/r/20230223-virtio-net-kvmtool-v2-1-8ec93511e67f@xxxxxxxxxxxx
> >
> > Changes in v2:
> > - Use parentheses to group logical OR of features
> > - Link to v1:
> > https://lore.kernel.org/r/20230223-virtio-net-kvmtool-v1-1-fc23d29b9d7a@xxxxxxxxxxxx
> > ---
> > drivers/net/virtio_net.c | 10 ++++++----
> > 1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 61e33e4dd0cd..2e7705142ca5 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -3778,11 +3778,13 @@ static int virtnet_probe(struct virtio_device *vdev)
> > dev->features |= dev->hw_features & NETIF_F_ALL_TSO;
> > /* (!csum && gso) case will be fixed by register_netdev() */
> > }
> > - if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_CSUM))
> > + if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_CSUM)) {
> > dev->features |= NETIF_F_RXCSUM;
> > - if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) ||
> > - virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO6))
> > - dev->features |= NETIF_F_GRO_HW;
> > + /* This dependency is enforced by netdev_fix_features */
> > + if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) ||
> > + virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO6))
> > + dev->features |= NETIF_F_GRO_HW;
> > + }
> > if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS))
> > dev->hw_features |= NETIF_F_GRO_HW;

Should we move this also under the check of RXCSUM, otherwise we may
end up the following case:

when CSUM is not negotiated but GUEST_OFFLOADS, can still try to
enable-or-disable guest offloads? Or do we need to fail the probe in
the case via virtnet_validate_features()?

> >
>
> I see. It is annoying that we are duplicating the logic from
> netdev_fix_features here though :(
> Maybe we should call netdev_update_features, in the callback check
> the flags and decide what to set and what to clear?
> Or export netdev_fix_features to modules?

There's a ndo_fix_features() that might be used here.

>
>
>
> Also re-reading Documentation/networking/netdev-features.rst -
>
> 1. netdev->hw_features set contains features whose state may possibly
> be changed (enabled or disabled) for a particular device by user's
> request. This set should be initialized in ndo_init callback and not
> changed later.
>
> 2. netdev->features set contains features which are currently enabled
> for a device. This should be changed only by network core or in
> error paths of ndo_set_features callback.
>
>
> is it then wrong that virtio sets NETIF_F_RXCSUM and NETIF_F_GRO_HW in
> dev->features and not in dev->hw_features?

Looks not the core can try to enable and disable features according to
the diff between features and hw_features

static inline netdev_features_t netdev_get_wanted_features(
struct net_device *dev)
{
return (dev->features & ~dev->hw_features) | dev->wanted_features;
}

Thanks

> We set it there because
> without ctrl guest offload these can not be changed.
> I suspect this is just a minor documentation bug yes? Maybe devices
> where features can't be cleared are uncommon.
>
> Also:
> if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS))
> dev->hw_features |= NETIF_F_GRO_HW;
>
> but should we not set NETIF_F_RXCSUM there too?
>
>
>
> > ---
> > base-commit: c39cea6f38eefe356d64d0bc1e1f2267e282cdd3
> > change-id: 20230223-virtio-net-kvmtool-87f37515be22
> >
> > Best regards,
> > --
> > Rob Bradford <rbradford@xxxxxxxxxxxx>
>