On Tue, Mar 11, 2025 at 9:18 PM Zhongqiu Han <quic_zhonhan@xxxxxxxxxxx> wrote:
Syzkaller reports a data-race when accessing the event_triggered field of
vring_virtqueue in virtqueue_disable_cb / virtqueue_enable_cb_delayed.
Here is the simplified stack when the issue occurred:
==================================================================
BUG: KCSAN: data-race in virtqueue_disable_cb / virtqueue_enable_cb_delayed
write to 0xffff8881025bc452 of 1 bytes by task 3288 on cpu 0:
virtqueue_enable_cb_delayed+0x42/0x3c0 drivers/virtio/virtio_ring.c:2653
start_xmit+0x230/0x1310 drivers/net/virtio_net.c:3264
__netdev_start_xmit include/linux/netdevice.h:5151 [inline]
netdev_start_xmit include/linux/netdevice.h:5160 [inline]
xmit_one net/core/dev.c:3800 [inline]
dev_hard_start_xmit+0x119/0x3f0 net/core/dev.c:3816
sch_direct_xmit+0x1a9/0x580 net/sched/sch_generic.c:343
__dev_xmit_skb net/core/dev.c:4039 [inline]
__dev_queue_xmit+0xf6a/0x2090 net/core/dev.c:4615
read to 0xffff8881025bc452 of 1 bytes by interrupt on cpu 1:
virtqueue_disable_cb_split drivers/virtio/virtio_ring.c:880 [inline]
virtqueue_disable_cb+0x92/0x180 drivers/virtio/virtio_ring.c:2566
skb_xmit_done+0x5f/0x140 drivers/net/virtio_net.c:777
vring_interrupt+0x161/0x190 drivers/virtio/virtio_ring.c:2715
__handle_irq_event_percpu+0x95/0x490 kernel/irq/handle.c:158
handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
handle_irq_event+0x64/0xf0 kernel/irq/handle.c:210
handle_edge_irq+0x16d/0x5b0 kernel/irq/chip.c:831
generic_handle_irq_desc include/linux/irqdesc.h:173 [inline]
handle_irq arch/x86/kernel/irq.c:249 [inline]
value changed: 0x01 -> 0x00
==================================================================
After an interrupt is triggered, event_triggered can be set to true in the
func vring_interrupt(). Then virtqueue_disable_cb_split() will read it as
true and stop further work of disabling cbs. During this time, if another
virtqueue processing sets same event_triggered to false in func
virtqueue_enable_cb_delayed(), a race condition will occur, potentially
leading to further vq data inconsistency because both
virtqueue_disable_cb_split() and virtqueue_enable_cb_delayed() can
continue read/write multiple field members of vring_virtqueue.
Fix this by using smp_load_acquire() and smp_store_release().
Additionally, virtqueue_disable_cb_packed() may be called in the same
stack as virtqueue_disable_cb_split() while vq->packed_ring is true in
func virtqueue_disable_cb(), so event_triggered should also be protected
in it.
Reported-by: syzbot+efe683d57990864b8c8e@xxxxxxxxxxxxxxxxxxxxxxxxx
Closes: https://lore.kernel.org/all/67c7761a.050a0220.15b4b9.0018.GAE@xxxxxxxxxx/
Signed-off-by: Zhongqiu Han <quic_zhonhan@xxxxxxxxxxx>
Do we have performance numbers for this change?
Btw event_triggered is just a hint, using barriers seems to be an overkill.
What's more the current implementation is buggy:
1) event_triggered should be only called when event idx is used
2) the assumption of device won't raise the interrupt is not ture,
this is especially obvious in the case of packed ring, when the
wrap_counter warps twice, we could still get an interrupt from the
device. This means when the virtqueue size is 256 we will get 1
unnecessary notification every 512 packets etc.
So I wonder just a data_race() hint would be more than sufficient.
Thanks
---
drivers/virtio/virtio_ring.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index fdd2d2b07b5a..b8ff82730618 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -875,9 +875,10 @@ static void virtqueue_disable_cb_split(struct virtqueue *_vq)
/*
* If device triggered an event already it won't trigger one again:
- * no need to disable.
+ * no need to disable. smp_load_acquire pairs with smp_store_release()
+ * in virtqueue_enable_cb_delayed()
*/
- if (vq->event_triggered)
+ if (smp_load_acquire(&vq->event_triggered))
return;
if (vq->event)
@@ -1802,9 +1803,10 @@ static void virtqueue_disable_cb_packed(struct virtqueue *_vq)
/*
* If device triggered an event already it won't trigger one again:
- * no need to disable.
+ * no need to disable. smp_load_acquire pairs with smp_store_release()
+ * in virtqueue_enable_cb_delayed()
*/
- if (vq->event_triggered)
+ if (smp_load_acquire(&vq->event_triggered))
return;
vq->packed.vring.driver->flags =
@@ -2650,7 +2652,8 @@ bool virtqueue_enable_cb_delayed(struct virtqueue *_vq)
struct vring_virtqueue *vq = to_vvq(_vq);
if (vq->event_triggered)
- vq->event_triggered = false;
+ /* Pairs with smp_load_acquire in virtqueue_disable_cb_split/packed() */
+ smp_store_release(&vq->event_triggered, false);
return vq->packed_ring ? virtqueue_enable_cb_delayed_packed(_vq) :
virtqueue_enable_cb_delayed_split(_vq);
--
2.25.1