Re: [PATCH] virtio: use virtio_device_ready() in virtio_device_restore()

From: Stefano Garzarella
Date: Wed Mar 23 2022 - 04:04:27 EST


On Wed, Mar 23, 2022 at 11:10:27AM +0800, Jason Wang wrote:
On Tue, Mar 22, 2022 at 10:07 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:

On Tue, Mar 22, 2022 at 12:43:13PM +0100, Stefano Garzarella wrote:
> After waking up a suspended VM, the kernel prints the following trace
> for virtio drivers which do not directly call virtio_device_ready() in
> the .restore:
>
> PM: suspend exit
> irq 22: nobody cared (try booting with the "irqpoll" option)
> Call Trace:
> <IRQ>
> dump_stack_lvl+0x38/0x49
> dump_stack+0x10/0x12
> __report_bad_irq+0x3a/0xaf
> note_interrupt.cold+0xb/0x60
> handle_irq_event+0x71/0x80
> handle_fasteoi_irq+0x95/0x1e0
> __common_interrupt+0x6b/0x110
> common_interrupt+0x63/0xe0
> asm_common_interrupt+0x1e/0x40
> ? __do_softirq+0x75/0x2f3
> irq_exit_rcu+0x93/0xe0
> sysvec_apic_timer_interrupt+0xac/0xd0
> </IRQ>
> <TASK>
> asm_sysvec_apic_timer_interrupt+0x12/0x20
> arch_cpu_idle+0x12/0x20
> default_idle_call+0x39/0xf0
> do_idle+0x1b5/0x210
> cpu_startup_entry+0x20/0x30
> start_secondary+0xf3/0x100
> secondary_startup_64_no_verify+0xc3/0xcb
> </TASK>
> handlers:
> [<000000008f9bac49>] vp_interrupt
> [<000000008f9bac49>] vp_interrupt
> Disabling IRQ #22
>
> This happens because we don't invoke .enable_cbs callback in
> virtio_device_restore(). That callback is used by some transports
> (e.g. virtio-pci) to enable interrupts.
>
> Let's fix it, by calling virtio_device_ready() as we do in
> virtio_dev_probe(). This function calls the .enable_cbs callback and sets
> DRIVER_OK status bit.
>
> This fix also avoids setting DRIVER_OK twice for those drivers that
> call virtio_device_ready() in the .restore.
>
> Fixes: d50497eb4e55 ("virtio_config: introduce a new .enable_cbs method")
> Signed-off-by: Stefano Garzarella <sgarzare@xxxxxxxxxx>
> ---
>
> I'm not sure about the fixes tag. That one is more generic, but the
> following one I think introduced the issue.
>
> Fixes: 9e35276a5344 ("virtio_pci: harden MSI-X interrupts")

Jason what should we do about this one BTW? Just revert? We have other
issues ...

Let me post a patch to revert it and give it a rework.

Thanks for reverting it.

Should we queue this patch anyway to prevent future issues and avoid setting DRIVER_OK twice?
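
To make the "twice" point concrete, here is a minimal userspace sketch of
the idea. It is not the kernel code: every mock_* name is hypothetical, the
counters exist only for illustration, and the assert() merely models the
assumption that the ready helper should not run once DRIVER_OK is already
set. The only value taken from the real interface is the DRIVER_OK bit (4).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_S_DRIVER_OK 4u /* same value as VIRTIO_CONFIG_S_DRIVER_OK */

/* Hypothetical stand-in for a virtio device; not the kernel struct. */
struct mock_dev {
	uint8_t status;           /* device status byte */
	int enable_cbs_calls;     /* times the transport enabled callbacks */
	int driver_ok_writes;     /* times DRIVER_OK was written */
	bool restore_calls_ready; /* does the driver's .restore call ready? */
};

/* Models virtio_device_ready(): enable callbacks, then set DRIVER_OK. */
static void mock_device_ready(struct mock_dev *d)
{
	/* Modeling assumption: ready must not run with DRIVER_OK set. */
	assert(!(d->status & MOCK_S_DRIVER_OK));
	d->enable_cbs_calls++;
	d->status |= MOCK_S_DRIVER_OK;
	d->driver_ok_writes++;
}

/* Models a driver's .restore callback. */
static void mock_driver_restore(struct mock_dev *d)
{
	if (d->restore_calls_ready)
		mock_device_ready(d);
}

/* Models the tail of the core restore path with the proposed check. */
static void mock_device_restore(struct mock_dev *d)
{
	d->status = 0; /* status is rebuilt from scratch after resume */
	mock_driver_restore(d);

	/* Only mark the device ready if .restore did not already do it. */
	if (!(d->status & MOCK_S_DRIVER_OK))
		mock_device_ready(d);
}
```

In this model, whether or not the driver's .restore calls the ready helper
itself, callbacks are enabled once and DRIVER_OK is written once.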

Please let me know if I should resend it without the call trace, which
should no longer occur after the revert.

Thanks,
Stefano