Re: [virtio-dev] Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero

From: Cornelia Huck
Date: Tue Feb 23 2021 - 05:07:21 EST


On Tue, 23 Feb 2021 17:46:20 +0800
Jason Wang <jasowang@xxxxxxxxxx> wrote:

> On 2021/2/23 下午5:25, Michael S. Tsirkin wrote:
> > On Mon, Feb 22, 2021 at 09:09:28AM -0800, Si-Wei Liu wrote:
> >>
> >> On 2/21/2021 8:14 PM, Jason Wang wrote:
> >>> On 2021/2/19 7:54 下午, Si-Wei Liu wrote:
> >>>> Commit 452639a64ad8 ("vdpa: make sure set_features is invoked
> >>>> for legacy") made an exception for legacy guests to reset
> >>>> features to 0, when config space is accessed before features
> >>>> are set. We should relieve the verify_min_features() check
> >>>> and allow features reset to 0 for this case.
> >>>>
> >>>> It's worth noting that not just legacy guests could access
> >>>> config space before features are set. For instance, when
> >>>> feature VIRTIO_NET_F_MTU is advertised some modern driver
> >>>> will try to access and validate the MTU present in the config
> >>>> space before virtio features are set.
> >>>
> >>> This looks like a spec violation:
> >>>
> >>> "
> >>>
> >>> The following driver-read-only field, mtu only exists if
> >>> VIRTIO_NET_F_MTU is set. This field specifies the maximum MTU for the
> >>> driver to use.
> >>> "
> >>>
> >>> Do we really want to workaround this?
> >> Isn't the commit 452639a64ad8 itself is a workaround for legacy guest?
> >>
> >> I think the point is, since there's legacy guest we'd have to support, this
> >> host side workaround is unavoidable. Although I agree the violating driver
> >> should be fixed (yes, it's in today's upstream kernel which exists for a
> >> while now).
> > Oh you are right:
> >
> >
> > static int virtnet_validate(struct virtio_device *vdev)
> > {
> > if (!vdev->config->get) {
> > dev_err(&vdev->dev, "%s failure: config access disabled\n",
> > __func__);
> > return -EINVAL;
> > }
> >
> > if (!virtnet_validate_features(vdev))
> > return -EINVAL;
> >
> > if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) {
> > int mtu = virtio_cread16(vdev,
> > offsetof(struct virtio_net_config,
> > mtu));
> > if (mtu < MIN_MTU)
> > __virtio_clear_bit(vdev, VIRTIO_NET_F_MTU);
>
>
> I wonder why not simply fail here?

I think both failing or not accepting the feature can be argued to make
sense: "the device presented us with a mtu size that does not make
sense" would point to failing, "we cannot work with the mtu size that
the device presented us" would point to not negotiating the feature.

>
>
> > }
> >
> > return 0;
> > }
> >
> > And the spec says:
> >
> >
> > The driver MUST follow this sequence to initialize a device:
> > 1. Reset the device.
> > 2. Set the ACKNOWLEDGE status bit: the guest OS has noticed the device.
> > 3. Set the DRIVER status bit: the guest OS knows how to drive the device.
> > 4. Read device feature bits, and write the subset of feature bits understood by the OS and driver to the
> > device. During this step the driver MAY read (but MUST NOT write) the device-specific configuration
> > fields to check that it can support the device before accepting it.
> > 5. Set the FEATURES_OK status bit. The driver MUST NOT accept new feature bits after this step.
> > 6. Re-read device status to ensure the FEATURES_OK bit is still set: otherwise, the device does not
> > support our subset of features and the device is unusable.
> > 7. Perform device-specific setup, including discovery of virtqueues for the device, optional per-bus setup,
> > reading and possibly writing the device’s virtio configuration space, and population of virtqueues.
> > 8. Set the DRIVER_OK status bit. At this point the device is “live”.
> >
> >
> > Item 4 on the list explicitly allows reading config space before
> > FEATURES_OK.
> >
> > I conclude that VIRTIO_NET_F_MTU is set means "set in device features".
>
>
> So this probably need some clarification. "is set" is used many times in
> the spec that has different implications.

Before FEATURES_OK is set by the driver, I guess it means "the device
has offered the feature"; during normal usage, it means "the feature
has been negotiated". (This is a bit fuzzy for legacy mode.)

Should we add a wording clarification to the spec?