Re: [PATCH V4 8/9] virtio: harden vring IRQ

From: Halil Pasic
Date: Wed May 11 2022 - 08:49:52 EST


On Wed, 11 May 2022 17:27:44 +0800
Jason Wang <jasowang@xxxxxxxxxx> wrote:

> On Wed, May 11, 2022 at 4:44 PM Cornelia Huck <cohuck@xxxxxxxxxx> wrote:
> >
> > On Wed, May 11 2022, Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >
> > > On Tue, May 10, 2022 at 7:32 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> > >>
> > >> On Sat, May 07, 2022 at 03:19:53PM +0800, Jason Wang wrote:
> > >> > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > >> > index d8a2340f928e..23f1694cdbd5 100644
> > >> > --- a/include/linux/virtio_config.h
> > >> > +++ b/include/linux/virtio_config.h
> > >> > @@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
> > >> > unsigned status = dev->config->get_status(dev);
> > >> >
> > >> > BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > >> > +
> > >> > + /*
> > >> > + * The virtio_synchronize_cbs() makes sure vring_interrupt()
> > >> > + * will see the driver specific setup if it sees vq->broken
> > >> > + * as false.
> > >> > + */
> > >> > + virtio_synchronize_cbs(dev);
> > >>
> > >> since you mention vq->broken above, maybe add
> > >> "set vq->broken to false"
> > >
> > > Ok.
> > >
> > >>
> > >> > + __virtio_unbreak_device(dev);
> > >> > + /*
> > >> > + * The transport is expected ensure the visibility of
> > >>
> > >> to ensure
> > >
> > > Will fix.
> > >
> > >>
> > >> > + * vq->broken
> > >>
> > >> let's add: "visibility by vq callbacks"
> > >
> > > Sure.
> > >
> > >>
> > >> > before setting VIRTIO_CONFIG_S_DRIVER_OK.
> > >> > + */
> > >>
> > >>
> > >> Can I see some analysis of existing transports showing
> > >> this is actually the case for them?
> > >
> > > Yes.
> > >
> > >> And maybe add a comment near set_status to document the
> > >> requirement.
> > >
> > > For PCI and MMIO, we can quote the memory-barriers.txt or explain that
> > > wmb() is not needed before the MMIO writel().
> > > For CCW, it looks not obvious, it looks to me the IO was submitted via
> > > __ssch() which has an inline assembly. Cornelia and Hali, could you
> > > help me to understand if and how did virtio_ccw_set_status() can
> > > ensure the visibility of the previous driver setup and vq->broken
> > > here?
> >
> > I'm not sure I completely understand the question here, but let me try:
>
> It's something like the following case:
>
> CPU 0: vq->broken = false
> CPU 0: set_status(DRIVER_OK)
> CPU 1: vring_interrupt() { if (vq->broken) return IRQ_NONE; }
>
> We need to make sure the CPU 1 sees the vq->broken if the interrupt is
> raised after DRVER_OK.
>
> For PCI, we use MMIO of writel() for set_status(), a wmb() is not
> needed in this case according to memory-barriers.txt.
>
> "
> Note that, when using writel(), a prior
> wmb() is not needed to guarantee that the cache coherent memory writes
> have completed before writing to the MMIO region.
> "


IMHO the key facts here are the following:
* ssch and all other I/O instructions are serializing instructions
* all interruptions are serializing operations

For reference see
https://www.ibm.com/resources/publications/OutputPubsDetails?PubID=SA22783213
page 5-138.


Maybe we should add that to the linux documentation somewhere if
not already mentioned.

So IMHO we don't need CPU0 to do a wmb() because of the ssch.

>
> So CPU 1 will see the broken as false.

But barriers need to be paired. And in my understanding the ssch
doesn't really ensure that CPU1 is about to see the change, unless
there is a suitable barrier that pairs with the barrier implied
the ssch instruction.

Assumed vring_interrupt() is always done in hard-irq context, AFAIU,
we should be fine. Is that assumption correct?

Why are we fine:
* Either the ssch was performed before the interrupt for
vring_interrupt() got delivered on CPU1, and then we are guaranteed to
see the updated value for vq->broken,
* or the interrupt that triggered vring_interrupt() was delivered before
the ssch instruction got executed. But in this case it is fine to
ignore the notification, because this is actually the bad case
we want to guard against: we got a notification when
notifications are not allowed.

We may end up with !vq->broken and !DEVICE_OK as well, but that should
be fine because, although that notification would be a should not happen
one, I understand it would not catch us with our pants down.

Regards,
Halil


>
> >
> > virtio_ccw_set_status() uses a channel command to set the status, with
> > the interesting stuff done inside ccw_io_helper(). That function
> > - takes the subchannel lock, disabling interrupts
>
> Then it is, for x86 the operation to disable interrupt is a full
> barrier. I guess this should apply to other architecture like s390. I
> see a stnsm is used in this case but a quick google doesn't tell me if
> it's a barrier.
> If this is true. The vring_interrupt will see broken as false.
>
> > - does the ssch; this instruction will fail if there's already another
> > I/O in progress, or an interrupt is pending for the subchannel; on
> > success, it is guaranteed that we'll get an interrupt eventually
>
> I guess ssch might imply a barrier as well, otherwise we may need a
> lot of barriers before this.
>
> Thanks
>
> > - unlock the subchannel, and wait for the interupt handler to eventually
> > process the interrupt, so I guess it should see the vq->broken value?
> >
> > If the I/O fails, virtio_ccw_set_status() will revert its internal
> > status to the old value.
> >
> >
> > >
> > > Thanks
> > >
> > >>
> > >> > dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > >> > }
> >
>