Re: [PATCH] virtio_pci: Wait for legacy device to be reset

From: Michael S. Tsirkin
Date: Tue Apr 11 2023 - 02:45:11 EST


On Tue, Apr 11, 2023 at 02:39:34PM +0800, Jason Wang wrote:
> On Tue, Apr 11, 2023 at 2:36 PM Angus Chen <angus.chen@xxxxxxxxxxxxxxx> wrote:
> >
> > Hi.
> >
> > > -----Original Message-----
> > > From: Jason Wang <jasowang@xxxxxxxxxx>
> > > Sent: Tuesday, April 11, 2023 1:24 PM
> > > To: Angus Chen <angus.chen@xxxxxxxxxxxxxxx>
> > > Cc: mst@xxxxxxxxxx; virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx;
> > > linux-kernel@xxxxxxxxxxxxxxx
> > > Subject: Re: [PATCH] virtio_pci: Wait for legacy device to be reset
> > >
> > > On Tue, Apr 11, 2023 at 9:39 AM Angus Chen <angus.chen@xxxxxxxxxxxxxxx>
> > > wrote:
> > > >
> > > > > We read the status of the device after reset, but that read does
> > > > > not guarantee that the device has been reset successfully.
> > > > > We can use a while loop to make sure of that, as the modern device does.
> > > > > The spec does not require it, but it works.
> > >
> > > The only concern is if it's too late to do this.
> > >
> > > > Btw, any reason you want to have a legacy hardware implementation? It
> > > > will be very tricky to get it to work correctly.
> > > Well, I hit this several times in a real production environment about one year ago,
> > > and I fixed it out of tree. Thousands of our virtio cards have been sold.
> > >
> > > Now we have created a new card that supports virtio 0.95, 1.0, 1.1, etc.
> > > We use host vDPA plus legacy virtio in the VM for live migration, and we found that the
> > > legacy model often reads an intermediate status value after reset and probe again.
> > > The SoC is simulated by an FPGA, which runs slower than the host, so the same bug
> > > is found more frequently when the host uses another kernel such as Ubuntu or CentOS 8.
> > >
> > > So we hope this can be fixed upstream.
>
> I think you can do mediation in your hypervisor.
>
> When trapping set_status(), the hypervisor will not return until it
> reads 0 from the hardware?
>
> Thanks

Note that for legacy guests, a 0 status write is not the only way
to reset the device; writing 0 into the queue address (pa) is another.
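
To illustrate, here is a minimal userspace sketch of the mediation idea
quoted above, extended to cover both reset paths. The two register offsets
are the legacy virtio-pci ones; struct fake_hw, hw_write(), hw_read_status()
and trap_legacy_write() are hypothetical names invented for this sketch, the
slow reset is simulated, and treating a 0 write to the queue PFN as a full
device reset is an assumption of the model, not code from any real hypervisor.

```c
#include <stdbool.h>
#include <stdint.h>

/* Legacy virtio-pci I/O register offsets. */
#define VIRTIO_PCI_QUEUE_PFN 8   /* 32-bit queue page frame number */
#define VIRTIO_PCI_STATUS    18  /* 8-bit device status */

/* Hypothetical model of slow hardware: after a reset request the status
 * register keeps its old value for a few reads before dropping to 0. */
struct fake_hw {
	uint8_t status;
	uint32_t queue_pfn;
	int reads_until_reset_done;
};

static void hw_write(struct fake_hw *hw, unsigned int off, uint32_t val)
{
	if (off == VIRTIO_PCI_STATUS) {
		if (val == 0)
			hw->reads_until_reset_done = 3; /* reset is not instant */
		else
			hw->status = (uint8_t)val;
	} else if (off == VIRTIO_PCI_QUEUE_PFN) {
		hw->queue_pfn = val;
	}
}

static uint8_t hw_read_status(struct fake_hw *hw)
{
	/* The simulated reset completes on the read that counts down to 0. */
	if (hw->reads_until_reset_done > 0 &&
	    --hw->reads_until_reset_done == 0) {
		hw->status = 0;
		hw->queue_pfn = 0;
	}
	return hw->status;
}

/* Trap handler for a guest write to the legacy I/O BAR.
 * Returns false only if a reset did not complete within max_polls reads. */
static bool trap_legacy_write(struct fake_hw *hw, unsigned int off,
			      uint32_t val, int max_polls)
{
	/* Assumption: both a 0 status write and a 0 queue PFN write
	 * are treated as a device reset. */
	bool is_reset = val == 0 &&
		(off == VIRTIO_PCI_STATUS || off == VIRTIO_PCI_QUEUE_PFN);

	if (!is_reset) {
		hw_write(hw, off, val);
		return true;
	}

	/* Forward the reset, then do not return to the guest until the
	 * hardware reports status 0 or the poll budget runs out. */
	hw_write(hw, VIRTIO_PCI_STATUS, 0);
	for (int i = 0; i < max_polls; i++) {
		if (hw_read_status(hw) == 0)
			return true;
	}
	return false;
}
```

The poll is bounded by max_polls so that a device which never clears its
status cannot hang the trap handler; a real hypervisor would pick its own
timeout policy.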



> > >
> > > Thanks
> > >
> > > >
> > > > Signed-off-by: Angus Chen <angus.chen@xxxxxxxxxxxxxxx>
> > > > ---
> > > > drivers/virtio/virtio_pci_legacy.c | 4 +++-
> > > > 1 file changed, 3 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
> > > > index 2257f1b3d8ae..f2d241563e4f 100644
> > > > --- a/drivers/virtio/virtio_pci_legacy.c
> > > > +++ b/drivers/virtio/virtio_pci_legacy.c
> > > > @@ -14,6 +14,7 @@
> > > > * Michael S. Tsirkin <mst@xxxxxxxxxx>
> > > > */
> > > >
> > > > +#include <linux/delay.h>
> > > > #include "linux/virtio_pci_legacy.h"
> > > > #include "virtio_pci_common.h"
> > > >
> > > > @@ -97,7 +98,8 @@ static void vp_reset(struct virtio_device *vdev)
> > > > >  	vp_legacy_set_status(&vp_dev->ldev, 0);
> > > > >  	/* Flush out the status write, and flush in device writes,
> > > > >  	 * including MSi-X interrupts, if any. */
> > > > > -	vp_legacy_get_status(&vp_dev->ldev);
> > > > > +	while (vp_legacy_get_status(&vp_dev->ldev))
> > > > > +		msleep(1);
> > > > >  	/* Flush pending VQ/configuration callbacks. */
> > > > >  	vp_synchronize_vectors(vdev);
> > > > >  }
> > > > --
> > > > 2.25.1
> > > >
> >