RE: [PATCH v4 0/4] Implement vdpasim stop operation

From: Parav Pandit
Date: Wed Jun 01 2022 - 15:32:09 EST




> From: Eugenio Perez Martin <eperezma@xxxxxxxxxx>
> Sent: Wednesday, June 1, 2022 5:50 AM
>
> On Tue, May 31, 2022 at 10:19 PM Parav Pandit <parav@xxxxxxxxxx> wrote:
> >
> >
> > > From: Jason Wang <jasowang@xxxxxxxxxx>
> > > Sent: Sunday, May 29, 2022 11:39 PM
> > >
> > > > On Fri, May 27, 2022 at 6:56 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> > > >
> > > > On Thu, May 26, 2022 at 12:54:32PM +0000, Parav Pandit wrote:
> > > > >
> > > > >
> > > > > > From: Eugenio Pérez <eperezma@xxxxxxxxxx>
> > > > > > Sent: Thursday, May 26, 2022 8:44 AM
> > > > >
> > > > > > > Implement stop operation for vdpa_sim devices, so vhost-vdpa will offer
> > > > > > > that backend feature and userspace can effectively stop the device.
> > > > > > >
> > > > > > > This is a must before get virtqueue indexes (base) for live migration,
> > > > > > > since the device could modify them after userland gets them. There are
> > > > > > > individual ways to perform that action for some devices
> > > > > > > (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there was no
> > > > > > > way to perform it for any vhost device (and, in particular, vhost-vdpa).
> > > > > > >
> > > > > > > After the return of ioctl with stop != 0, the device MUST finish any
> > > > > > > pending operations like in flight requests. It must also preserve all
> > > > > > > the necessary state (the virtqueue vring base plus the possible device
> > > > > > > specific states) that is required for restoring in the future. The
> > > > > > > device must not change its configuration after that point.
> > > > > > >
> > > > > > > After the return of ioctl with stop == 0, the device can continue
> > > > > > > processing buffers as long as typical conditions are met (vq is enabled,
> > > > > > > DRIVER_OK status bit is enabled, etc).
> > > > >
> > > > > > Just to be clear, we are adding vdpa level new ioctl() that doesn't map to any mechanism in the virtio spec.
> > > > > >
> > > > > > Why can't we use this ioctl() to indicate driver to start/stop the device instead of driving it through the driver_ok?
> > > > > This is in the context of other discussion we had in the LM series.
> > > >
> > > > If there's something in the spec that does this then let's use that.
> > >
> > > Actually, we try to propose a independent feature here:
> > >
> > > > https://lists.oasis-open.org/archives/virtio-dev/202111/msg00020.html
> > >
> > This will stop the device for all the operations.
> > Once the device is stopped, its state cannot be queried further as device won't respond.
> > It has limited use case.
> > What we need is to stop non admin queue related portion of the device.
> >
>
> Still don't follow this, sorry.
Once a device is stopped, its state etc. cannot be queried.
If you want to stop and still allow certain operations, a better spec definition is needed that says:

stop A, B, C, but allow D, E, F, G.
A = stop CVQs and save their state somewhere
B = stop data VQs and save their state somewhere
C = stop generic config interrupt

D = query state of multiple VQs
E = query device statistics and other elements/objects in future
F = setup/config/restore certain fields
G = resume the device
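
To make the split above concrete, here is a rough sketch in plain C.
Everything in it is made up for illustration (the sketch_* names, the
bit layout, the single saved vq base); it is not vdpa, vhost, or virtio
spec code. It only models the idea that stopping A/B/C blocks CVQ and
data VQ processing while D, E, F and G stay available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bits: which parts of the device are stopped (A, B, C). */
enum sketch_part {
    SKETCH_PART_CVQ        = 1u << 0, /* A: control VQs */
    SKETCH_PART_DATA_VQ    = 1u << 1, /* B: data VQs */
    SKETCH_PART_CONFIG_IRQ = 1u << 2, /* C: generic config interrupt */
};

/* Hypothetical operations a caller may request from the device. */
enum sketch_op {
    SKETCH_OP_QUERY_VQ_STATE, /* D */
    SKETCH_OP_QUERY_STATS,    /* E */
    SKETCH_OP_RESTORE_FIELDS, /* F */
    SKETCH_OP_RESUME,         /* G */
    SKETCH_OP_PROCESS_CTRL,   /* CVQ commands, blocked while A is stopped */
    SKETCH_OP_PROCESS_DATA,   /* datapath, blocked while B is stopped */
};

struct sketch_dev {
    uint32_t stopped;        /* mask of enum sketch_part */
    uint16_t live_vq_base;   /* index the running device keeps advancing */
    uint16_t saved_vq_base;  /* snapshot taken when data VQs are stopped */
};

/* Stop the requested parts; snapshot the vq base the first time B stops. */
void sketch_stop(struct sketch_dev *dev, uint32_t parts)
{
    assert(dev != NULL);
    if ((parts & SKETCH_PART_DATA_VQ) &&
        !(dev->stopped & SKETCH_PART_DATA_VQ))
        dev->saved_vq_base = dev->live_vq_base;
    dev->stopped |= parts;
}

/* G: resume every stopped part. */
void sketch_resume(struct sketch_dev *dev)
{
    assert(dev != NULL);
    dev->stopped = 0;
}

/* D..G are always allowed; CVQ and data processing depend on the mask. */
bool sketch_op_allowed(const struct sketch_dev *dev, enum sketch_op op)
{
    assert(dev != NULL);
    switch (op) {
    case SKETCH_OP_PROCESS_CTRL:
        return !(dev->stopped & SKETCH_PART_CVQ);
    case SKETCH_OP_PROCESS_DATA:
        return !(dev->stopped & SKETCH_PART_DATA_VQ);
    case SKETCH_OP_QUERY_VQ_STATE:
    case SKETCH_OP_QUERY_STATS:
    case SKETCH_OP_RESTORE_FIELDS:
    case SKETCH_OP_RESUME:
        return true;
    }
    return false;
}
```

With this model, a caller would stop A, B and C with one sketch_stop()
call, still be able to query the saved vq state, and later call
sketch_resume(). How such a split would actually be expressed in the
spec is the open question.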

>
> Adding the admin vq to the mix, this would stop a device of a device group,
> but not the whole virtqueue group. If the admin VQ is offered by the PF
> (since it's not exposed to the guest), it will continue accepting requests as
> normal. If it's exposed in the VF, I think the best bet is to shadow it, since
> guest and host requests could conflict.
>
> Since this is offered through vdpa, the device backend driver can route it to
> whatever method works better for the hardware. For example, to send an
> admin vq command to the PF. That's why it's important to keep the feature
> as self-contained and orthogonal to others as possible.
>

I replied in the other thread; let's continue there.