Re: [PATCH AUTOSEL 4.9 8/8] virtio_pci: don't try to use intxif pin is zero
From: Michael S. Tsirkin
Date: Wed Oct 19 2022 - 01:58:47 EST
On Wed, Oct 19, 2022 at 12:27:46AM +0000, Angus Chen wrote:
> Hi Sasha,
>
> > -----Original Message-----
> > From: Sasha Levin <sashal@xxxxxxxxxx>
> > Sent: Tuesday, October 18, 2022 8:12 AM
> > To: linux-kernel@xxxxxxxxxxxxxxx; stable@xxxxxxxxxxxxxxx
> > Cc: Angus Chen <angus.chen@xxxxxxxxxxxxxxx>; Michael S . Tsirkin
> > <mst@xxxxxxxxxx>; Sasha Levin <sashal@xxxxxxxxxx>; jasowang@xxxxxxxxxx;
> > virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
> > Subject: [PATCH AUTOSEL 4.9 8/8] virtio_pci: don't try to use intxif pin is zero
> >
> > From: Angus Chen <angus.chen@xxxxxxxxxxxxxxx>
> >
> > [ Upstream commit 71491c54eafa318fdd24a1f26a1c82b28e1ac21d ]
> >
> > The background is that we use DPUs in cloud computing; the arch is x86 with 80
> > cores. We have a lot of virtio devices, 512 or more.
> > When we probe about 200 virtio_blk devices, the probe fails and
> > the stack is printed as follows:
> >
> > [25338.485128] virtio-pci 0000:b3:00.0: virtio_pci: leaving for legacy driver
> > [25338.496174] genirq: Flags mismatch irq 0. 00000080 (virtio418) vs. 00015a00
> > (timer)
> > [25338.503822] CPU: 20 PID: 5431 Comm: kworker/20:0 Kdump: loaded Tainted:
> > G OE --------- - - 4.18.0-305.30.1.el8.x86_64
> > [25338.516403] Hardware name: Inspur NF5280M5/YZMB-00882-10E, BIOS
> > 4.1.21 08/25/2021
> > [25338.523881] Workqueue: events work_for_cpu_fn
> > [25338.528235] Call Trace:
> > [25338.530687] dump_stack+0x5c/0x80
> > [25338.534000] __setup_irq.cold.53+0x7c/0xd3
> > [25338.538098] request_threaded_irq+0xf5/0x160
> > [25338.542371] vp_find_vqs+0xc7/0x190
> > [25338.545866] init_vq+0x17c/0x2e0 [virtio_blk]
> > [25338.550223] ? ncpus_cmp_func+0x10/0x10
> > [25338.554061] virtblk_probe+0xe6/0x8a0 [virtio_blk]
> > [25338.558846] virtio_dev_probe+0x158/0x1f0
> > [25338.562861] really_probe+0x255/0x4a0
> > [25338.566524] ? __driver_attach_async_helper+0x90/0x90
> > [25338.571567] driver_probe_device+0x49/0xc0
> > [25338.575660] bus_for_each_drv+0x79/0xc0
> > [25338.579499] __device_attach+0xdc/0x160
> > [25338.583337] bus_probe_device+0x9d/0xb0
> > [25338.587167] device_add+0x418/0x780
> > [25338.590654] register_virtio_device+0x9e/0xe0
> > [25338.595011] virtio_pci_probe+0xb3/0x140
> > [25338.598941] local_pci_probe+0x41/0x90
> > [25338.602689] work_for_cpu_fn+0x16/0x20
> > [25338.606443] process_one_work+0x1a7/0x360
> > [25338.610456] ? create_worker+0x1a0/0x1a0
> > [25338.614381] worker_thread+0x1cf/0x390
> > [25338.618132] ? create_worker+0x1a0/0x1a0
> > [25338.622051] kthread+0x116/0x130
> > [25338.625283] ? kthread_flush_work_fn+0x10/0x10
> > [25338.629731] ret_from_fork+0x1f/0x40
> > [25338.633395] virtio_blk: probe of virtio418 failed with error -16
> >
> > The log:
> > "genirq: Flags mismatch irq 0. 00000080 (virtio418) vs. 00015a00 (timer)"
> > was printed because irq 0 is used exclusively by the timer. When
> > vp_find_vqs calls vp_find_vqs_msix and it fails twice (for
> > whatever reason), it calls vp_find_vqs_intx as a fallback.
> > Because vp_dev->pci_dev->irq is zero, we request irq 0 with
> > flag IRQF_SHARED and get a backtrace like the one above.
> >
> > According to the PCI spec on the "Interrupt Pin" Register (Offset 3Dh):
> > "The Interrupt Pin register is a read-only register that identifies the
> > legacy interrupt Message(s) the Function uses. Valid values are 01h, 02h,
> > 03h, and 04h that map to legacy interrupt Messages for INTA,
> > INTB, INTC, and INTD respectively. A value of 00h indicates that the
> > Function uses no legacy interrupt Message(s)."
> >
> > So if vp_dev->pci_dev->pin is zero, we should not request a legacy
> > interrupt.
> >
> > Signed-off-by: Angus Chen <angus.chen@xxxxxxxxxxxxxxx>
> > Suggested-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > Message-Id: <20220930000915.548-1-angus.chen@xxxxxxxxxxxxxxx>
> > Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
> > ---
> > drivers/virtio/virtio_pci_common.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/virtio/virtio_pci_common.c
> > b/drivers/virtio/virtio_pci_common.c
> > index 37e3ba5dadf6..d634eb926a2f 100644
> > --- a/drivers/virtio/virtio_pci_common.c
> > +++ b/drivers/virtio/virtio_pci_common.c
> > @@ -389,6 +389,9 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned
> > nvqs,
> > true, false);
> > if (!err)
> > return 0;
> > + /* Is there an interrupt pin? If not give up. */
> > + if (!(to_vp_device(vdev)->pci_dev->pin))
> > + return err;
> > /* Finally fall back to regular interrupts. */
> > return vp_try_to_find_vqs(vdev, nvqs, vqs, callbacks, names,
> > false, false);
> > --
> > 2.35.1
>
> The patch 71491c54eafa31 has been fixed by 2145ab513e3b3.
> The problem was reported by Michael Ellerman <mpe@xxxxxxxxxxxxxx> and the fix was suggested by Linus.
> If it is merged into the stable git repo, I worry about the powerpc arch.
> Thanks.
Yes, please either pick up both this and the fixup or neither, and
do the same for all other stable trees where this was autoselected.
It looks like autoselection basically picks up everything that
has a Fixes tag in it, yes?
--
MST
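
For readers following the thread, the fallback order described in the quoted
commit message can be sketched as a small user-space model. This is only an
illustration, not the kernel code: struct model_pci_dev, try_msix(),
try_intx(), model_find_vqs() and model_self_check() are hypothetical names,
and the -ENOSPC returned for a failed MSI-X attempt is an assumed value. The
-EBUSY on irq 0 mirrors the "failed with error -16" line in the quoted log.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few pci_dev fields that matter here. */
struct model_pci_dev {
	unsigned char pin;  /* Interrupt Pin register: 0 = no INTx, 1..4 = INTA..INTD */
	unsigned int irq;   /* legacy IRQ number; not meaningful when pin == 0 */
	bool msix_works;    /* whether MSI-X setup succeeds in this model */
};

enum irq_mode { MODE_NONE, MODE_MSIX_PER_VQ, MODE_MSIX_SHARED, MODE_INTX };

/* Models one MSI-X attempt; the failure code is an assumption. */
static int try_msix(const struct model_pci_dev *dev)
{
	return dev->msix_works ? 0 : -ENOSPC;
}

/* Models the legacy path: irq 0 belongs to the timer, so it is busy. */
static int try_intx(const struct model_pci_dev *dev)
{
	return dev->irq == 0 ? -EBUSY : 0;
}

/* Fallback order from the patch: MSI-X per vq, MSI-X shared, then INTx. */
static int model_find_vqs(const struct model_pci_dev *dev, enum irq_mode *mode)
{
	int err;

	*mode = MODE_NONE;

	err = try_msix(dev);            /* one vector per virtqueue */
	if (!err) {
		*mode = MODE_MSIX_PER_VQ;
		return 0;
	}

	err = try_msix(dev);            /* one vector shared by all virtqueues */
	if (!err) {
		*mode = MODE_MSIX_SHARED;
		return 0;
	}

	/* Is there an interrupt pin? If not, give up with the MSI-X error. */
	if (!dev->pin)
		return err;

	err = try_intx(dev);            /* finally fall back to legacy INTx */
	if (!err)
		*mode = MODE_INTX;
	return err;
}

/* Small self-check that can be called from a main() of your own. */
static void model_self_check(void)
{
	struct model_pci_dev no_pin = { .pin = 0, .irq = 0, .msix_works = false };
	enum irq_mode mode;

	/* With pin == 0 the model never requests irq 0. */
	assert(model_find_vqs(&no_pin, &mode) == -ENOSPC);
	assert(mode == MODE_NONE);
}
```

Without the pin check, the same device would reach try_intx() with irq 0 and
fail with -EBUSY, which is the situation the quoted backtrace shows.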