Re: [PATCH 1/2] virtio: allow to detach a buffer from the virtqueue

From: Greg Kurz
Date: Sat Jan 20 2018 - 04:58:48 EST


On Fri, 19 Jan 2018 21:49:38 +0200
"Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:

> On Wed, Dec 20, 2017 at 06:46:42PM +0100, Greg Kurz wrote:
> > It is possible for a device to stop using buffers without pushing them
> > back to the driver. This is the case for example with the 9p virtio
> > device: if the driver flushes an in-flight request, the 9p specification
> > [*] requires the server "to purge the pending response".
> > The reply to the flush request indicates that the 9p server has stopped
> > using the buffers of the flushed in-flight request. But since the server
> > doesn't push back the associated buffers, they don't go back to the free
> > list. The virtqueue eventually ends up with a single slot to handle all
> > the dialog with the device, i.e. all I/O gets serialized.
> >
> > This patch hence gives the possibility for device specific code to
> > explicitly detach a given buffer from the used ring and put it back
> > to the free list.
> >
> > [*] http://man.cat-v.org/plan_9/5/flush
> >
> > Signed-off-by: Greg Kurz <groug@xxxxxxxx>
>
> It would be better to just change the server to mark all flushed
> requests as used. Why isn't that an option?
>

It is just because I started to look at this from a 9p client code
perspective. It supports several transports other than virtio, and
has a set of transport hooks. One of these hooks is called when the
server is supposed to have cancelled a request. My first thought
was to rely on this 'cancelled' hook, as is done for the RDMA
and fd-based transports.

But now I realize I should be able to come up with something simpler
in the virtio case if I do as you suggest. It would require a change
in the client though, as the current code assumes that anything pushed by
the other end contains a 9p message, which would be wrong if QEMU does
something like virtqueue_push(vq, elem, 0).
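
As a rough sketch of the kind of check this implies on the client side,
assuming the device returns a flushed buffer with a used length of 0.
The names classify_completion() and the COMPLETION_* values are
hypothetical and are not taken from net/9p; the only fact relied on is
that a 9p message starts with a 7-byte header (size[4] type[1] tag[2]):

```c
/* Hypothetical sketch, not the actual 9p virtio transport code. */

/* A 9p message header is size[4] type[1] tag[2], i.e. 7 bytes. */
#define NINEP_HDR_SIZE 7u

enum completion_kind {
	COMPLETION_FLUSHED,	/* buffer returned with nothing written to it */
	COMPLETION_REPLY,	/* long enough to hold at least a 9p header */
	COMPLETION_BOGUS,	/* non-empty but too short to be a 9p message */
};

/*
 * Classify a buffer handed back by the device, based only on the
 * number of bytes the device reports having written (used_len).
 */
static enum completion_kind classify_completion(unsigned int used_len)
{
	if (used_len == 0)
		return COMPLETION_FLUSHED;
	if (used_len < NINEP_HDR_SIZE)
		return COMPLETION_BOGUS;
	return COMPLETION_REPLY;
}
```

With something like this, the request callback would only try to parse
a 9p message in the COMPLETION_REPLY case and would simply recycle the
buffer in the COMPLETION_FLUSHED case.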

I'll give it a try.

Thanks!

> > ---
> > drivers/virtio/virtio_ring.c | 28 ++++++++++++++++++++++++++++
> > include/linux/virtio.h | 1 +
> > 2 files changed, 29 insertions(+)
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index eb30f3e09a47..886e9d054de3 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -936,6 +936,34 @@ void *virtqueue_detach_unused_buf(struct virtqueue *_vq)
> > }
> > EXPORT_SYMBOL_GPL(virtqueue_detach_unused_buf);
> >
> > +/**
> > + * virtqueue_detach_buf - detach specific buffer
> > + * @vq: the struct virtqueue we're talking about.
> > + *
> > + * Returns NULL or the "data" token handed to virtqueue_add_*().
> > + * This should only be used if the driver really knows the buffer
> > + * isn't needed anymore by the device.
> > + */
> > +void *virtqueue_detach_buf(struct virtqueue *_vq, void *buf)
> > +{
> > +	struct vring_virtqueue *vq = to_vvq(_vq);
> > +	unsigned int i;
> > +
> > +	START_USE(vq);
> > +
> > +	for (i = 0; i < vq->vring.num; i++) {
> > +		if (vq->desc_state[i].data != buf)
> > +			continue;
> > +		detach_buf(vq, i, NULL);
> > +		END_USE(vq);
> > +		return buf;
> > +	}
> > +
> > +	END_USE(vq);
> > +	return NULL;
> > +}
> > +EXPORT_SYMBOL_GPL(virtqueue_detach_buf);
> > +
> > irqreturn_t vring_interrupt(int irq, void *_vq)
> > {
> > 	struct vring_virtqueue *vq = to_vvq(_vq);
>
> Unfortunately the used index gets out of sync with the available index
> then.
>
> E.g. you are breaking the invariant that used == avail means ring empty.
>
> Any chance to preserve this invariant?
>
>
>
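
To illustrate the invariant mentioned above, here is a minimal model of
the index bookkeeping. It is only a sketch under simplifying
assumptions: struct ring_model and its helpers are hypothetical and do
not mirror the layout of drivers/virtio/virtio_ring.c. It shows that
reclaiming a buffer without the device bumping its used index leaves
used != avail even though nothing is in flight anymore:

```c
#include <stdint.h>

/* Hypothetical, simplified model of a virtqueue's index bookkeeping. */
struct ring_model {
	uint16_t avail_idx;	/* bumped by the driver per exposed buffer */
	uint16_t used_idx;	/* bumped by the device per returned buffer */
	unsigned int in_flight;	/* buffers the driver has not reclaimed yet */
};

/* Driver exposes one buffer to the device. */
static void driver_add(struct ring_model *r)
{
	r->avail_idx++;
	r->in_flight++;
}

/* Device returns one buffer through the used ring, driver reclaims it. */
static void device_complete(struct ring_model *r)
{
	r->used_idx++;
	r->in_flight--;
}

/* Driver reclaims one buffer on its own: the used index is not touched. */
static void driver_detach_out_of_band(struct ring_model *r)
{
	r->in_flight--;
}

/* The invariant under discussion: used == avail means the ring is empty. */
static int ring_looks_empty(const struct ring_model *r)
{
	return r->used_idx == r->avail_idx;
}
```

After two driver_add() calls, one device_complete() and one
driver_detach_out_of_band(), in_flight is 0 but ring_looks_empty()
still returns 0, which is the mismatch described above.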
> > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > index 988c7355bc22..850158518ce5 100644
> > --- a/include/linux/virtio.h
> > +++ b/include/linux/virtio.h
> > @@ -80,6 +80,7 @@ bool virtqueue_poll(struct virtqueue *vq, unsigned);
> > bool virtqueue_enable_cb_delayed(struct virtqueue *vq);
> >
> > void *virtqueue_detach_unused_buf(struct virtqueue *vq);
> > +void *virtqueue_detach_buf(struct virtqueue *vq, void *buf);
> >
> > unsigned int virtqueue_get_vring_size(struct virtqueue *vq);
> >