Re: [PATCH] virtio_ring: Fix the stale index in available ring

From: Gavin Shan
Date: Wed Mar 20 2024 - 01:24:42 EST


On 3/20/24 10:49, Michael S. Tsirkin wrote:>
I think you are wasting the time with these tests. Even if it helps what
does this tell us? Try setting a flag as I suggested elsewhere.
Then check it in vhost.
Or here's another idea - possibly easier. Copy the high bits from index
into ring itself. Then vhost can check that head is synchronized with
index.

Warning: completely untested, not even compiled. But should give you
the idea. If this works btw we should consider making this official in
the spec.


static inline int vhost_get_avail_flags(struct vhost_virtqueue *vq,
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 6f7e5010a673..79456706d0bd 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -685,7 +685,8 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
/* Put entry in available array (but don't update avail->idx until they
* do sync). */
avail = vq->split.avail_idx_shadow & (vq->split.vring.num - 1);
- vq->split.vring.avail->ring[avail] = cpu_to_virtio16(_vq->vdev, head);
+ u16 headwithflag = head | (q->split.avail_idx_shadow & ~(vq->split.vring.num - 1));
+ vq->split.vring.avail->ring[avail] = cpu_to_virtio16(_vq->vdev, headwithflag);
/* Descriptors and available array need to be set before we expose the
* new available array entries. */

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 045f666b4f12..bd8f7c763caa 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1299,8 +1299,15 @@ static inline int vhost_get_avail_idx(struct vhost_virtqueue *vq,
static inline int vhost_get_avail_head(struct vhost_virtqueue *vq,
__virtio16 *head, int idx)
{
- return vhost_get_avail(vq, *head,
+ unsigned i = idx;
+ unsigned flag = i & ~(vq->num - 1);
+ unsigned val = vhost_get_avail(vq, *head,
&vq->avail->ring[idx & (vq->num - 1)]);
+ unsigned valflag = val & ~(vq->num - 1);
+
+ WARN_ON(valflag != flag);
+
+ return val & (vq->num - 1);
}

Thanks, Michael. The code is already self-explanatory. Since vq->num is 256, I just
squeezed the last_avail_idx to the high byte. Unfortunately, I'm unable to hit
the WARN_ON(). Does it mean the low byte is stale (or corrupted) while the high
byte is still correct and valid?

avail = vq->split.avail_idx_shadow & (vq->split.vring.num - 1);
vq->split.vring.avail->ring[avail] =
cpu_to_virtio16(_vq->vdev, head | (avail << 8));


head = vhost16_to_cpu(vq, ring_head);
WARN_ON((head >> 8) != (vq->last_avail_idx % vq->num));
head = head & 0xff;

One question: Does QEMU has any chance writing data to the available queue when
vhost is enabled? My previous understanding is no, the queue is totally owned by
vhost instead of QEMU.

Before this patch was posted, I had debugging code to record last 16 transactions
to the available and used queue from guest and host side. It did reveal the wrong
head was fetched from the available queue.

[ 11.785745] ================ virtqueue_get_buf_ctx_split ================
[ 11.786238] virtio_net virtio0: output.0:id 74 is not a head!
[ 11.786655] head to be released: 036 077
[ 11.786952]
[ 11.786952] avail_idx:
[ 11.787234] 000 63985 <--
[ 11.787237] 001 63986
[ 11.787444] 002 63987
[ 11.787632] 003 63988
[ 11.787821] 004 63989
[ 11.788006] 005 63990
[ 11.788194] 006 63991
[ 11.788381] 007 63992
[ 11.788567] 008 63993
[ 11.788772] 009 63994
[ 11.788957] 010 63995
[ 11.789141] 011 63996
[ 11.789327] 012 63997
[ 11.789515] 013 63998
[ 11.789701] 014 63999
[ 11.789886] 015 64000
[ 11.790068]
[ 11.790068] avail_head:
[ 11.790529] 000 075 <--
[ 11.790718] 001 036
[ 11.790890] 002 077
[ 11.791061] 003 129
[ 11.791231] 004 072
[ 11.791400] 005 130
[ 11.791574] 006 015
[ 11.791748] 007 074
[ 11.791918] 008 130
[ 11.792094] 009 130
[ 11.792263] 010 074
[ 11.792437] 011 015
[ 11.792617] 012 072
[ 11.792788] 013 129
[ 11.792961] 014 077 // The last two heads from guest to host: 077, 036
[ 11.793134] 015 036

[root@nvidia-grace-hopper-05 qemu.main]# cat /proc/vhost

avail_idx
000 63998
001 64000
002 63954 <---
003 63955
004 63956
005 63974
006 63981
007 63984
008 63986
009 63987
010 63988
011 63989
012 63992
013 63993
014 63995
015 63997

avail_head
000 074
001 015
002 072
003 129
004 074 // The last two heads seen by vhost is: 074, 036
005 036
006 075 <---
007 036
008 077
009 129
010 072
011 130
012 015
013 074
014 130
015 130

used_idx
000 64000
001 63882 <---
002 63889
003 63891
004 63898
005 63936
006 63942
007 63946
008 63949
009 63953
010 63957
011 63981
012 63990
013 63992
014 63993
015 63999

used_head
000 072
001 129
002 074 // The last two heads published to guest is: 074, 036
003 036
004 075 <---
005 036
006 077
007 129
008 072
009 130
010 015
011 074
012 130
013 130
014 074
015 015

Thanks,
Gavin