Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop

From: Jason Wang
Date: Wed Sep 07 2022 - 22:22:01 EST



在 2022/9/7 15:46, Michael S. Tsirkin 写道:
On Wed, Sep 07, 2022 at 09:07:20AM +0200, Paolo Abeni wrote:
On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <pabeni@xxxxxxxxxx> wrote:
On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
Adding cond_resched() to the command waiting loop for a better
co-operation with the scheduler. This allows to give CPU a breath to
run other task(workqueue) instead of busy looping when preemption is
not allowed.

What's more important. This is a must for some vDPA parent to work
since control virtqueue is emulated via a workqueue for those parents.

Fixes: bda324fd037a ("vdpasim: control virtqueue support")
That's a weird commit to fix. so it fixes the simulator?
Yes, since the simulator is using a workqueue to handle control virtueue.
Uhmm... touching a driver for a simulator's sake looks a little weird.
Simulator is not the only one that is using a workqueue (but should be
the first).

I can see that the mlx5 vDPA driver is using a workqueue as well (see
mlx5_vdpa_kick_vq()).

And in the case of VDUSE, it needs to wait for the response from the
userspace, this means cond_resched() is probably a must for the case
like UP.

Additionally, if the bug is vdpasim, I think it's better to try to
solve it there, if possible.

Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
neither needs a process context, so perhaps you could rework it to run
the work_fn() directly from vdpasim_kick_vq(), at least for the control
virtqueue?
It's possible (but require some rework on the simulator core). But
considering we have other similar use cases, it looks better to solve
it in the virtio-net driver.
I see.

Additionally, this may have better behaviour when using for the buggy
hardware (e.g the control virtqueue takes too long to respond). We may
consider switching to use interrupt/sleep in the future (but not
suitable for -net).
Agreed. Possibly a timeout could be useful, too.

Cheers,

Paolo

Hmm timeouts are kind of arbitrary.
regular drivers basically derive them from hardware
behaviour but with a generic driver like virtio it's harder.
I guess we could add timeout as a config field, have
device make a promise to the driver.

Making the wait interruptible seems more reasonable.


Yes, but I think we still need this patch for -net and -stable.

Thanks