Re: [PATCH v2] vsock/virtio: Remove queued_replies pushback logic

From: Stefano Garzarella
Date: Wed Apr 02 2025 - 09:27:33 EST


On Wed, Apr 02, 2025 at 10:26:05AM +0100, Simon Horman wrote:
> On Tue, Apr 01, 2025 at 08:13:49PM +0000, Alexander Graf wrote:
> > Ever since the introduction of the virtio vsock driver, it has included
> > pushback logic that blocks it from taking any new RX packets until the
> > TX queue backlog becomes shallower than the virtqueue size.
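
For context, the pushback check that this patch removes sits on the RX path
and looks roughly like the sketch below (paraphrased from the pre-patch
driver, not quoted verbatim):

	/* Is there still room to queue replies for received packets? */
	static bool virtio_transport_more_replies(struct virtio_vsock *vsock)
	{
		struct virtqueue *rx_vq = vsock->vqs[VSOCK_VQ_RX];

		/* RX processing pauses once the number of queued reply
		 * packets reaches the RX virtqueue size.
		 */
		return atomic_read(&vsock->queued_replies) <
		       virtqueue_get_vring_size(rx_vq);
	}

	/* ... and in the RX work function, roughly: */
	if (!virtio_transport_more_replies(vsock)) {
		/* Stop RX until the device drains the pending replies. */
		goto out;
	}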

> > This logic works fine when you connect a user space application on the
> > hypervisor with a virtio-vsock target, because the guest will stop
> > receiving data until the host has pulled all outstanding data from the VM.

> > With Nitro Enclaves, however, we connect 2 VMs directly via vsock:

> >   Parent       Enclave
> >
> >     RX -------- TX
> >     TX -------- RX

> > This means we now have 2 virtio-vsock backends that both have the pushback
> > logic. If the parent's TX queue runs full at the same time as the
> > Enclave's, both virtio-vsock drivers fall into the pushback path and
> > no longer accept RX traffic. However, that RX traffic is TX traffic on
> > the other side, which blocks that driver from making any forward
> > progress. We're now in a deadlock.

> > To resolve this, let's remove that pushback logic altogether and rely on
> > higher levels (like credits) to ensure we do not consume unbounded
> > memory.
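
The credit mechanism referred to above already bounds how much data a sender
may have in flight toward a peer. A simplified sketch of the sender-side
accounting (modelled on virtio_transport_common.c, simplified rather than
verbatim):

	/* Take at most as much credit as the peer's advertised buffer space
	 * still allows; the caller must not send more than the returned value.
	 */
	u32 virtio_transport_get_credit(struct virtio_vsock_sock *vvs, u32 credit)
	{
		u32 ret;

		spin_lock_bh(&vvs->tx_lock);
		ret = vvs->peer_buf_alloc - (vvs->tx_cnt - vvs->last_fwd_cnt);
		if (ret > credit)
			ret = credit;
		vvs->tx_cnt += ret;
		spin_unlock_bh(&vvs->tx_lock);

		return ret;
	}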

> > RX and TX queues share the same work queue. To prevent starvation of TX
> > by an RX flood and vice versa now that the pushback logic is gone, let's
> > deliberately reschedule RX and TX work after a fixed threshold (256) of
> > processed packets.

> > Fixes: 0ea9e1d3a9e3 ("VSOCK: Introduce virtio_transport.ko")
> > Signed-off-by: Alexander Graf <graf@xxxxxxxxxx>
> > ---
> >  net/vmw_vsock/virtio_transport.c | 70 +++++++++-----------------------
> >  1 file changed, 19 insertions(+), 51 deletions(-)

> > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c

> ...

> > @@ -158,7 +162,7 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> >  		container_of(work, struct virtio_vsock, send_pkt_work);
> >  	struct virtqueue *vq;
> >  	bool added = false;
> > -	bool restart_rx = false;
> > +	int pkts = 0;
> >
> >  	mutex_lock(&vsock->tx_lock);
> >
> > @@ -172,6 +176,12 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> >  		bool reply;
> >  		int ret;
> >
> > +		if (++pkts > VSOCK_MAX_PKTS_PER_WORK) {
> > +			/* Allow other works on the same queue to run */
> > +			queue_work(virtio_vsock_workqueue, work);
> > +			break;
> > +		}
> > +
> >  		skb = virtio_vsock_skb_dequeue(&vsock->send_pkt_queue);
> >  		if (!skb)
> > 			break;

> Hi Alexander,
>
> The next non-blank line of code looks like this:
>
> 	reply = virtio_vsock_skb_reply(skb);
>
> But with this patch, reply is assigned but otherwise unused.

Thanks for the report!

> So perhaps the line above, and the declaration of reply, can be removed?

@Alex: yes, please remove it.

Apart from that, the rest LGTM!

I've been running some tests for a while and everything seems okay.

I guess we can do something similar in vhost-vsock as well, where we already have "vhost weight" support. IIUC it was added later by commit e79b431fb901 ("vhost: vsock: add weight support"), but we never removed the "queued_replies" machinery, which IMO is pretty much useless after that commit.
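
For readers less familiar with vhost: the "weight" mechanism mentioned here caps how much work a single handler invocation may do before requeueing itself, which is what makes the extra queued_replies throttling look redundant. Roughly (paraphrased from the vhost core, not quoted verbatim):

	/* Returns true, and requeues the poll work, once this invocation has
	 * handled enough packets/bytes, so other works get a chance to run.
	 */
	bool vhost_exceeds_weight(struct vhost_virtqueue *vq, int pkts, int total_len)
	{
		struct vhost_dev *dev = vq->dev;

		if (unlikely(total_len >= dev->byte_weight) ||
		    unlikely(pkts >= dev->weight)) {
			vhost_poll_queue(&vq->poll);
			return true;
		}

		return false;
	}

	/* vhost-vsock's TX/RX loops then end with something like: */
	} while (likely(!vhost_exceeds_weight(vq, ++pkts, total_len)));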

I'm not asking you to do that in this series; if you don't have time, I can do it separately ;-)

Thanks,
Stefano


> Flagged by W=1 builds.
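
(For reference: W=1 kernel builds enable extra diagnostics such as -Wunused-but-set-variable, so the leftover variable would presumably be reported along these lines; the snippet below is only an illustration and uses a placeholder helper:)

	static void example(void)
	{
		bool reply;

		reply = some_condition();	/* placeholder assignment */
		/* 'reply' is never read again, so gcc warns:
		 *   warning: variable 'reply' set but not used
		 *   [-Wunused-but-set-variable]
		 */
	}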

> > @@ -184,17 +194,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> >  			break;
> >  		}
> >
> > -		if (reply) {
> > -			struct virtqueue *rx_vq = vsock->vqs[VSOCK_VQ_RX];
> > -			int val;
> > -
> > -			val = atomic_dec_return(&vsock->queued_replies);
> > -
> > -			/* Do we now have resources to resume rx processing? */
> > -			if (val + 1 == virtqueue_get_vring_size(rx_vq))
> > -				restart_rx = true;
> > -		}
> > -
> >  		added = true;
> >  	}
> >
> > @@ -203,9 +202,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> >
> >  out:
> >  	mutex_unlock(&vsock->tx_lock);
> > -
> > -	if (restart_rx)
> > -		queue_work(virtio_vsock_workqueue, &vsock->rx_work);
> >  }
> >
> >  /* Caller need to hold RCU for vsock.

> ...