Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop

From: Paolo Abeni
Date: Wed Sep 07 2022 - 03:07:32 EST


On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
> On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <pabeni@xxxxxxxxxx> wrote:
> >
> > On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> > > On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> > > >
> > > > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > > > Adding cond_resched() to the command waiting loop for a better
> > > > > co-operation with the scheduler. This allows to give CPU a breath to
> > > > > run other task(workqueue) instead of busy looping when preemption is
> > > > > not allowed.
> > > > >
> > > > > What's more important. This is a must for some vDPA parent to work
> > > > > since control virtqueue is emulated via a workqueue for those parents.
> > > > >
> > > > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > > >
> > > > That's a weird commit to fix. so it fixes the simulator?
> > >
> > > Yes, since the simulator is using a workqueue to handle control virtueue.
> >
> > Uhmm... touching a driver for a simulator's sake looks a little weird.
>
> Simulator is not the only one that is using a workqueue (but should be
> the first).
>
> I can see that the mlx5 vDPA driver is using a workqueue as well (see
> mlx5_vdpa_kick_vq()).
>
> And in the case of VDUSE, it needs to wait for the response from the
> userspace, this means cond_resched() is probably a must for the case
> like UP.
>
> >
> > Additionally, if the bug is vdpasim, I think it's better to try to
> > solve it there, if possible.
> >
> > Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
> > neither needs a process context, so perhaps you could rework it to run
> > the work_fn() directly from vdpasim_kick_vq(), at least for the control
> > virtqueue?
>
> It's possible (but require some rework on the simulator core). But
> considering we have other similar use cases, it looks better to solve
> it in the virtio-net driver.

I see.

> Additionally, this may have better behaviour when using for the buggy
> hardware (e.g the control virtqueue takes too long to respond). We may
> consider switching to use interrupt/sleep in the future (but not
> suitable for -net).

Agreed. Possibly a timeout could be useful, too.

Cheers,

Paolo