Re: [PATCH v1] s390/virtio_ccw: don't allocate/assign airqs for non-existing queues
From: Halil Pasic
Date: Fri Apr 04 2025 - 00:36:42 EST
On Thu, 3 Apr 2025 16:28:31 +0200
David Hildenbrand <david@xxxxxxxxxx> wrote:
> > Sorry I have to have a look at that discussion. Maybe it will answer
> > some my questions.
>
> Yes, I think so.
>
> >
> >> Let's fix it without affecting existing setups for now by properly
> >> ignoring the non-existing queues, so the indicator bits will match
> >> the queue indexes.
> >
> > Just one question. My understanding is that the crux is that Linux
> > and QEMU (or the driver and the device) disagree at which index
> > reporting_vq is actually sitting. Is that right?
>
> I thought I made it clear: this is only about the airq indicator bit.
> That's where both disagree.
>
> Not the actual queue index (see above).
I did some more research including having a look at that discussion. Let
me try to sum up how did we end up here.
Before commit a229989d975e ("virtio: don't allocate vqs when names[i] =
NULL") the kernel behavior used to be in spec, but QEMU and possibly
other hypervisor were out of spec and things did not work.
Possibly because of the complexity of fixing the hypervisor(s) commit
a229989d975e ("virtio: don't allocate vqs when names[i] = NULL") opted
for changing the guest side so that it does not fit the spec but fits
the hypervisor(s). It unfortunately also broke notifiers (for the with
holes) scenario for virtio-ccw only.
Now we had another look at this, and have concluded that fixing the
hypervisor(s) and fixing the kernel, and making sure that the fixed
kernel can tolerate the old broken hypervisor(s) is way to complicated
if possible at all. So we decided to give the spec a reality check and
fix the notifier bit assignment for virtio-ccw which is broken beyond
doubt if we accept that the correct virtqueue index is the one that the
hypervisor(s) use and not the one that the spec says they should use.
With the spec fixed, the whole notion of "holes" will be something that
does not make sense any more. With that the merit of the kernel interface
virtio_find_vqs() supporting "holes" is quite questionable. Now we need
it because the drivers within the Linux kernel still think of the queues
in terms of the current spec, i.e. they try to have the "holes" as
mandated by the spec, and the duty of making it work with the broken
device implementations falls to the transports.
Under the assumption that the spec is indeed going to be fixed:
Reviewed-by: Halil Pasic <pasic@xxxxxxxxxxxxx>
Regards,
Halil