Re: [PATCH 3/4] virtio_ring: introduce a per virtqueue waitqueue

From: Jason Wang
Date: Wed Dec 28 2022 - 01:35:23 EST

In reply to: Michael S. Tsirkin: "Re: [PATCH 3/4] virtio_ring: introduce a per virtqueue waitqueue"
Next in thread: Jason Wang: "Re: [PATCH 3/4] virtio_ring: introduce a per virtqueue waitqueue"

On 2022/12/27 17:38, Michael S. Tsirkin wrote:

On Tue, Dec 27, 2022 at 05:12:58PM +0800, Jason Wang wrote:

On 2022/12/27 15:33, Michael S. Tsirkin wrote:

On Tue, Dec 27, 2022 at 12:30:35PM +0800, Jason Wang wrote:

But device is still going and will later use the buffers.

Same for timeout really.

Avoiding infinite wait/poll is one of the goals, another is to sleep.
If we think the timeout is hard, we can start from the wait.

Thanks

If the goal is to avoid disrupting traffic while CVQ is in use,
that sounds more reasonable. E.g. someone is turning on promisc,
a spike in CPU usage might be unwelcome.

Yes, this would be more obvious if UP is used.

things we should be careful to address then:
1- debugging. Currently it's easy to see a warning if CPU is stuck
in a loop for a while, and we also get a backtrace.
E.g. with this - how do we know who has the RTNL?
We need to integrate with kernel/watchdog.c for good results
and to make sure policy is consistent.

That's fine, will consider this.

2- overhead. In a very common scenario when device is in hypervisor,
programming timers etc has a very high overhead. At bootup
lots of CVQ commands are run, and slowing boot down is not nice.
Let's poll for a bit before waiting?

Then we go back to the question of choosing a good timeout for poll. And
poll seems problematic in the case of UP, scheduler might not have the
chance to run.

Poll just a bit :) Seriously I don't know, but at least check once
after kick.

I think that is what the current code does: the condition is checked before trying to sleep in wait_event().
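
To illustrate the "check once after kick, then sleep" behaviour, here is a minimal userspace sketch using pthreads. It is only an analogue of the check-then-sleep pattern, not the kernel's wait_event() and not code from this patch; struct cvq_wait, cvq_complete() and cvq_wait_done() are hypothetical names made up for the example.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the completion state of one CVQ command. */
struct cvq_wait {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;		/* set once the "device" has used the buffer */
	unsigned int sleeps;	/* how many times the waiter actually blocked */
};

static void cvq_wait_init(struct cvq_wait *w)
{
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->cond, NULL);
	w->done = false;
	w->sleeps = 0;
}

/* "Device" side: mark the command complete and wake any waiter. */
static void cvq_complete(struct cvq_wait *w)
{
	pthread_mutex_lock(&w->lock);
	w->done = true;
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);
}

/*
 * Driver side: the condition is tested before every sleep, so a command
 * that completed right after the kick never blocks at all.
 */
static void cvq_wait_done(struct cvq_wait *w)
{
	pthread_mutex_lock(&w->lock);
	while (!w->done) {
		w->sleeps++;
		pthread_cond_wait(&w->cond, &w->lock);
	}
	pthread_mutex_unlock(&w->lock);
}
```

If cvq_complete() runs before cvq_wait_done(), the waiter returns without sleeping, which is the cheap path for a fast device.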

3- surprise removal. need to wake up thread in some way. what about
other cases of device breakage - is there a chance this
introduces new bugs around that? at least enumerate them please.

The current code does:

1) check for vq->broken
2) wakeup during BAD_RING()

So we won't end up with a process that is never woken up, which should be fine.
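
The two points above (test a broken flag, wake the waiter when the ring is marked bad) can be sketched in userspace as follows. This is an assumption-laden pthread analogue, not the virtio_ring code: struct vq_waiter, vq_mark_used(), vq_mark_broken() and vq_wait() are hypothetical names, and -EIO is just a placeholder error value.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a virtqueue with a waiter attached. */
struct vq_waiter {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool used;	/* the device returned the buffer */
	bool broken;	/* the queue was marked broken (e.g. surprise removal) */
};

static void vq_waiter_init(struct vq_waiter *w)
{
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->cond, NULL);
	w->used = false;
	w->broken = false;
}

/* Normal completion: record it and wake the waiter. */
static void vq_mark_used(struct vq_waiter *w)
{
	pthread_mutex_lock(&w->lock);
	w->used = true;
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);
}

/* Breakage path: set the flag AND wake the waiter, so nobody sleeps forever. */
static void vq_mark_broken(struct vq_waiter *w)
{
	pthread_mutex_lock(&w->lock);
	w->broken = true;
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);
}

/* Returns 0 on completion, -EIO if the queue broke before the buffer was used. */
static int vq_wait(struct vq_waiter *w)
{
	int ret;

	pthread_mutex_lock(&w->lock);
	while (!w->used && !w->broken)
		pthread_cond_wait(&w->cond, &w->lock);
	ret = w->used ? 0 : -EIO;
	pthread_mutex_unlock(&w->lock);
	return ret;
}
```

The wait condition covers both flags, so either a completion or a breakage ends the sleep; the caller tells them apart by the return value.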

Thanks

BTW BAD_RING on removal will trigger dev_err. Not sure that is a good
idea - can cause crashes if kernel panics on error.

Yes, it's better to use __virtqueue_break() instead.

But considering that we will start with the wait first, I will limit the changes to virtio-net without touching the virtio core.

Thanks
