Re: [PATCH net-next v2 3/5] virtio_ring: add packed ring support

From: Jason Wang
Date: Fri Nov 09 2018 - 05:05:38 EST



On 2018/11/9 12:00 PM, Michael S. Tsirkin wrote:
On Fri, Nov 09, 2018 at 10:30:50AM +0800, Jason Wang wrote:
On 2018/11/8 11:56 PM, Michael S. Tsirkin wrote:
On Thu, Nov 08, 2018 at 07:51:48PM +0800, Tiwei Bie wrote:
On Thu, Nov 08, 2018 at 04:18:25PM +0800, Jason Wang wrote:
On 2018/11/8 9:38 AM, Tiwei Bie wrote:
+
+	if (vq->vq.num_free < descs_used) {
+		pr_debug("Can't add buf len %i - avail = %i\n",
+			 descs_used, vq->vq.num_free);
+		/* FIXME: for historical reasons, we force a notify here if
+		 * there are outgoing parts to the buffer. Presumably the
+		 * host should service the ring ASAP. */
I don't think we have a reason to do this for packed ring.
No historical baggage there, right?
Based on the original commit log, it seems that the notify here
is just an "optimization". But I don't quite understand what
"the heuristics which KVM uses" refers to. If it's safe to drop
this in packed ring, I'd like to do it.
According to the commit log, it seems to be a workaround for the lguest
networking backend.
Do you know why removing this notify in Tx will break "the
heuristics which KVM uses"? Or what does "the heuristics
which KVM uses" refer to?
Yes. QEMU has a mode where it disables notifications and processes TX
ring periodically from a timer. It's off by default but used to be on
by default a long time ago. If ring becomes full this causes traffic
stalls.

Do you mean tx-timer? If yes, we can still enable it for packed ring
Yes we can but I doubt anyone does.

and the
timer will eventually fire and we can go.
on tx ring full we probably don't want to wait for timer.
But I think we can just prevent qemu from using tx timer
with virtio 1.


Yes, we can.

Thanks



As a work-around Rusty put in this hack to kick on ring full
even with notifications disabled.

From the commit log it looks more like a performance workaround than a
bug fix.
it's a quality of implementation issue, yes.

It's easy enough to make sure QEMU
does not combine devices with packed ring support with the timer hack.
And I am guessing it's safe enough to also block that option completely
e.g. when virtio 1.0 is enabled.

I agree.

Thanks
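
For illustration only, the policy agreed on above could be sketched roughly as
follows. This is not QEMU code: has_feature() and tx_timer_allowed() are
hypothetical helpers, and the block_for_all_v1 switch is an assumption that
models the stricter "block it whenever virtio 1.0 is enabled" variant. Only
the feature bit numbers (VIRTIO_F_VERSION_1 = 32, VIRTIO_F_RING_PACKED = 34)
come from the virtio specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as defined by the virtio specification. */
#define VIRTIO_F_VERSION_1   32
#define VIRTIO_F_RING_PACKED 34

/* Hypothetical helper: test one bit in a 64-bit feature mask. */
static bool has_feature(uint64_t features, unsigned int bit)
{
	return (features >> bit) & 1;
}

/*
 * Hypothetical policy check: may the device use the TX timer mode?
 * Never with packed ring; optionally not with any virtio 1.0 device.
 */
static bool tx_timer_allowed(uint64_t features, bool block_for_all_v1)
{
	if (has_feature(features, VIRTIO_F_RING_PACKED))
		return false;
	if (block_for_all_v1 && has_feature(features, VIRTIO_F_VERSION_1))
		return false;
	return true;
}
```

With such a check in place, a driver for a packed ring device would never meet
a host that relies on the kick-on-ring-full hack, which is what makes dropping
the forced notify safe.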


I agree to drop it, we should not have such burden.

But we should note that, with this removed, the comparison between packed and
split rings is kind of unfair. Considering the recent removal of lguest support,
maybe we can drop this for split ring as well?

Thanks


commit 44653eae1407f79dff6f52fcf594ae84cb165ec4
Author: Rusty Russell <rusty@xxxxxxxxxxxxxxx>
Date: Fri Jul 25 12:06:04 2008 -0500

    virtio: don't always force a notification when ring is full

    We force notification when the ring is full, even if the host has
    indicated it doesn't want to know. This seemed like a good idea at
    the time: if we fill the transmit ring, we should tell the host
    immediately.

    Unfortunately this logic also applies to the receiving ring, which is
    refilled constantly. We should introduce real notification thresholds
    to replace this logic. Meanwhile, removing the logic altogether breaks
    the heuristics which KVM uses, so we use a hack: only notify if there are
    outgoing parts of the new buffer.

    Here are the number of exits with lguest's crappy network implementation:
    Before:
        network xmit 7859051 recv 236420
    After:
        network xmit 7858610 recv 118136

    Signed-off-by: Rusty Russell <rusty@xxxxxxxxxxxxxxx>

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 72bf8bc09014..21d9a62767af 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -87,8 +87,11 @@ static int vring_add_buf(struct virtqueue *_vq,
 	if (vq->num_free < out + in) {
 		pr_debug("Can't add buf len %i - avail = %i\n",
 			 out + in, vq->num_free);
-		/* We notify *even if* VRING_USED_F_NO_NOTIFY is set here. */
-		vq->notify(&vq->vq);
+		/* FIXME: for historical reasons, we force a notify here if
+		 * there are outgoing parts to the buffer. Presumably the
+		 * host should service the ring ASAP. */
+		if (out)
+			vq->notify(&vq->vq);
 		END_USE(vq);
 		return -ENOSPC;
 	}
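
To make the effect of the commit above concrete, here is a minimal userspace
sketch of the ring-full path before and after the change. It is a simplified
model, not the kernel code: struct ring_model, add_buf_old() and add_buf_new()
are hypothetical names, and the notification is reduced to a counter.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the virtqueue state touched on the full path. */
struct ring_model {
	unsigned int num_free;     /* free descriptors left in the ring */
	unsigned int notify_count; /* forced notifications sent so far */
};

/* Before the commit: a full ring always forces a notification. */
static bool add_buf_old(struct ring_model *r, unsigned int out, unsigned int in)
{
	if (r->num_free < out + in) {
		r->notify_count++;
		return false; /* corresponds to -ENOSPC */
	}
	r->num_free -= out + in;
	return true;
}

/* After the commit: only notify if the buffer has outgoing parts. */
static bool add_buf_new(struct ring_model *r, unsigned int out, unsigned int in)
{
	if (r->num_free < out + in) {
		if (out)
			r->notify_count++;
		return false; /* corresponds to -ENOSPC */
	}
	r->num_free -= out + in;
	return true;
}
```

In this model a receive-only buffer (out == 0) that does not fit no longer
triggers a notification, while a transmit buffer still does. Dropping the
forced notify for packed ring would amount to removing the "if (out)" branch
as well.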