Re: [PATCH 14/18] virtio: add api for delayed callbacks

From: Rusty Russell
Date: Fri May 20 2011 - 03:53:13 EST

On Thu, 19 May 2011 10:24:12 +0300, "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
> On Mon, May 16, 2011 at 04:43:21PM +0930, Rusty Russell wrote:
> > On Sun, 15 May 2011 15:48:18 +0300, "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
> > > On Mon, May 09, 2011 at 03:27:33PM +0930, Rusty Russell wrote:
> > > > On Wed, 4 May 2011 23:52:33 +0300, "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
> > > > > Add an API that tells the other side that callbacks
> > > > > should be delayed until a lot of work has been done.
> > > > > Implement using the new used_event feature.
> > > >
> > > > Since you're going to add a capacity query anyway, why not add the
> > > > threshold argument here?
> > >
> > > I thought that if we keep the API kind of generic
> > > there might be more of a chance that future transports
> > > will be able to implement it. For example, with an
> > > old host we can't commit to a specific index.
> >
> > No, it's always a hint anyway: you can be notified before the threshold
> > is reached. But best make it explicit I think.
> >
> > Cheers,
> > Rusty.
> I tried doing that and remembered the real reason I went for this API:
> capacity is limited by descriptor table space, not
> used ring space: each entry in the used ring frees up multiple entries
> in the descriptor ring. Thus the ring can't provide
> callback after capacity is N: capacity is only available
> after we get bufs.

I think this indicates a problem, but I haven't reviewed your patches
except very cursorily. I am slack...

> We could try and make the API pass in the number of freed bufs, however:
> - this is not really what virtio-net cares about (it cares about
> capacity)

Yes, max sg elements and max requests are separate, though in the
current virtio implementation remaining sg <= remaining request slots.

That's why we currently feed back remaining descriptors to the driver,
not the number of request slots.

This implies that the thresholds should refer to descriptor numbers
(ie. wake me when there are this many descriptors freed), not the used
ring at all. Which means we're barking up the wrong tree...

I think this needs more thought.

> - if the driver passes a number > number of outstanding bufs, it will
> never get a callback. So to stay correct the driver will need to
> track number of outstanding requests. The simpler API avoids that.

I think the driver should simply say "wake me when you have this many
descriptors free". And set it during initialization, rather than every
time. The virtio_ring code should handle it from there.

Perhaps that can be done with the current technique, where the
virtio_ring makes an educated guess on when sufficient capacity will be

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at