Re: [PATCH v4 03/11] virtio-vdpa: Support interrupt affinity spreading mechanism

From: Yongji Xie
Date: Tue Mar 28 2023 - 00:05:31 EST


On Tue, Mar 28, 2023 at 11:44 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
>
> On Tue, Mar 28, 2023 at 11:33 AM Yongji Xie <xieyongji@xxxxxxxxxxxxx> wrote:
> >
> > On Tue, Mar 28, 2023 at 11:14 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Mar 28, 2023 at 11:03 AM Yongji Xie <xieyongji@xxxxxxxxxxxxx> wrote:
> > > >
> > > > On Fri, Mar 24, 2023 at 2:28 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, Mar 23, 2023 at 1:31 PM Xie Yongji <xieyongji@xxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > To support the interrupt affinity spreading mechanism,
> > > > > > this makes use of group_cpus_evenly() to create
> > > > > > an irq callback affinity mask for each virtqueue
> > > > > > of the vdpa device. Then we unify the set_vq_affinity
> > > > > > callback to pass the affinity to the vdpa device driver.
> > > > > >
> > > > > > Signed-off-by: Xie Yongji <xieyongji@xxxxxxxxxxxxx>
> > > > >
> > > > > Thinking hard about all the logic, I think I've found something interesting.
> > > > >
> > > > > Commit ad71473d9c437 ("virtio_blk: use virtio IRQ affinity") tries to
> > > > > pass irq_affinity to the transport-specific find_vqs(). This seems like a
> > > > > layering violation since the driver has no knowledge of
> > > > >
> > > > > 1) whether or not the callback is based on an IRQ
> > > > > 2) whether or not the device is PCI (the details are hidden by
> > > > > the transport driver)
> > > > > 3) how many vectors could be used by a device
> > > > >
> > > > > This means the driver can't actually pass a real affinity mask, so the
> > > > > commit in fact passes a zeroed irq_affinity structure as a hint, and the
> > > > > PCI layer builds a default affinity that groups CPUs evenly
> > > > > based on the number of MSI-X vectors (the core logic is
> > > > > group_cpus_evenly()). I think we should fix this by replacing the
> > > > > irq_affinity structure with
> > > > >
> > > > > 1) a boolean like auto_cb_spreading
> > > > >
> > > > > or
> > > > >
> > > > > 2) queue to cpu mapping
> > > > >
> > > >
> > > > But only the driver knows which queues are used in the control path
> > > > and therefore don't need the automatic irq affinity assignment.
> > >
> > > Is this knowledge provided by the transport driver now?
> > >
> >
> > This knowledge is provided by the device driver rather than the transport driver.
> >
> > E.g. virtio-scsi uses:
> >
> > struct irq_affinity desc = { .pre_vectors = 2 };
> > // vq0 is the control queue, vq1 is the event queue
>
> Ok, but it only works as a hint; it's not a real affinity. As replied,
> we can pass an array of booleans in this case, so the transport driver
> knows it doesn't need to use automatic affinity for the first two
> queues.
>

But we don't know whether we will need other fields of struct
irq_affinity in the future, so wouldn't passing the full structure be
better?

> >
> > > E.g. virtio-blk uses:
> > >
> > > struct irq_affinity desc = { 0, };
> > >
> > > At least we can tell the transport driver which vqs require automatic
> > > irq affinity.
> > >
> >
> > I think that is what the current implementation does.
> >
> > > > So I think the
> > > > irq_affinity structure can only be created by device drivers and
> > > > passed to the virtio-pci/virtio-vdpa driver.
> > >
> > > This might not be easy since the driver doesn't even know how many
> > > interrupts will be used by the transport driver, so it can't build the
> > > actual affinity structure.
> > >
> >
> > The actual affinity mask is built by the transport driver,
>
> For PCI yes, it talks directly to the IRQ subsystems.
>
> > device
> > driver only passes a hint on which queues don't need the automatic irq
> > affinity assignment.
>
> But not for virtio-vDPA, since the IRQ needs to be dealt with by the
> parent driver. In our case the parent is VDUSE, which doesn't need IRQs
> at all; a queue to cpu mapping is sufficient.
>

The device driver doesn't know whether it is bound to virtio-pci or
virtio-vdpa. So it should pass the full set of information needed by
the automatic irq affinity assignment instead of a subset. Then
virtio-vdpa can choose to pass a queue to cpu mapping to VDUSE, which
is what we do now (using set_vq_affinity()).
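
To illustrate the split between the hint and the actual mapping, here is a
rough user-space model, not kernel code. The names struct affinity_hint and
spread_queues() are made up for this sketch; the hint mirrors only the
pre_vectors/post_vectors fields of struct irq_affinity, and the spreading is
a plain contiguous split that, unlike group_cpus_evenly(), ignores NUMA
topology and is limited to 64 CPUs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the hint passed down by the device driver. */
struct affinity_hint {
	unsigned int pre_vectors;	/* leading queues excluded from spreading */
	unsigned int post_vectors;	/* trailing queues excluded from spreading */
};

/*
 * Model of the transport side: fill masks[i] with a CPU bitmask for
 * queue i. Excluded queues keep the all-CPUs mask; the remaining
 * queues split the CPUs into contiguous groups of near-equal size.
 */
static void spread_queues(const struct affinity_hint *hint,
			  unsigned int nvqs, unsigned int ncpus,
			  uint64_t *masks)
{
	unsigned int reserved = hint->pre_vectors + hint->post_vectors;
	unsigned int nspread, i, g, k, cpu = 0;
	uint64_t all;

	assert(ncpus >= 1 && ncpus <= 64);
	assert(reserved <= nvqs);

	all = (ncpus == 64) ? UINT64_MAX : (UINT64_C(1) << ncpus) - 1;
	nspread = nvqs - reserved;

	/* Default: no restriction. Only the spread queues are narrowed below. */
	for (i = 0; i < nvqs; i++)
		masks[i] = all;

	for (g = 0; g < nspread; g++) {
		/* The first (ncpus % nspread) groups get one extra CPU. */
		unsigned int count = ncpus / nspread + (g < ncpus % nspread);
		uint64_t m = 0;

		for (k = 0; k < count; k++)
			m |= UINT64_C(1) << cpu++;
		/* More queues than CPUs: fall back to sharing a single CPU. */
		if (!m)
			m = UINT64_C(1) << (g % ncpus);
		masks[hint->pre_vectors + g] = m;
	}
}
```

With pre_vectors = 2, six queues and eight CPUs, queues 0 and 1 keep the
all-CPUs mask and queues 2 to 5 each get two CPUs. In this model the device
driver only fills in the hint; the per-queue masks are what virtio-vdpa would
hand to the parent through set_vq_affinity().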

Thanks,
Yongji