Re: Linux 4.15-rc2: Regression in resume from ACPI S3
From: Thomas Gleixner
Date: Wed Dec 13 2017 - 10:58:07 EST
So I was finally able to figure out what the hell is going on:
Suspend:
- The device suspend code puts the graphics card into a power
state != PCI_D0.
- Offline non-boot CPUs
- Break interrupt affinity. Allocate new vector on CPU 0, compose and
write MSI message which ends up in:
	__pci_write_msi_msg(entry, msg)
	{
		if (dev->current_state != PCI_D0 || pci_dev_is_disconnected(dev)) {
			/* Don't touch the hardware now */
		} else {
			....
		}
		entry->msg = *msg;
	}
So because the device is not in PCI_D0 the message is not written. It's
written in the device resume path.
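To make the deferral concrete, here is a minimal userspace sketch of the behaviour described above. It is not kernel code: the types and the model_* names are hypothetical, and the power states are reduced to "D0" and "not D0". It only illustrates that the hardware copy of the message can lag behind the cached copy until the resume path replays it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these types are the kernel's. */
enum power_state { STATE_D0, STATE_D3 };

struct msg {
	uint32_t address;
	uint32_t data;
};

struct model_dev {
	enum power_state current_state;
	struct msg hw_msg;	/* what the device would actually signal */
	struct msg cached_msg;	/* software copy, in the role of entry->msg */
};

/* Mirrors the quoted logic: skip the hardware unless the device is in D0. */
static void model_write_msi_msg(struct model_dev *dev, const struct msg *m)
{
	if (dev->current_state == STATE_D0)
		dev->hw_msg = *m;
	dev->cached_msg = *m;	/* always remembered for later */
}

/* Models the resume path: power up, then replay the cached message. */
static void model_resume(struct model_dev *dev)
{
	dev->current_state = STATE_D0;
	dev->hw_msg = dev->cached_msg;
}

/* True while the hardware still holds a message that software has replaced. */
static bool model_msg_is_stale(const struct model_dev *dev)
{
	return dev->hw_msg.address != dev->cached_msg.address ||
	       dev->hw_msg.data != dev->cached_msg.data;
}
```

In this model, a write issued while the device is outside D0 leaves model_msg_is_stale() true until model_resume() runs, which is the window in which the spurious interrupt shows up.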
Resume:
[ 139.670446] ACPI: Low-level resume complete
[ 139.670541] PM: Restoring platform NVS memory
[ 139.672462] do_IRQ: 0.55 No irq handler for vector
[ 139.672475] Enabling non-boot CPUs ...
So the spurious interrupt happens early and way before the device resume
code writes the new MSI message.
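For reading the log line: as far as I can tell from the x86 do_IRQ() code of that era, the two numbers are the CPU and the vector, both in decimal, so "0.55" means vector 55 arrived on CPU 0 and found no handler installed. The sketch below is a hypothetical userspace model of that lookup, not the kernel implementation; the table layout and the model_* names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MODEL_NR_CPUS    2
#define MODEL_NR_VECTORS 256

/* Hypothetical per-CPU vector table; 0 means "no handler installed". */
static int model_vector_irq[MODEL_NR_CPUS][MODEL_NR_VECTORS];

/*
 * Look up the irq behind (cpu, vector). Returns the irq number, or -1
 * after formatting a complaint into buf when the slot is empty.
 */
static int model_do_irq(int cpu, int vector, char *buf, size_t len)
{
	int irq = model_vector_irq[cpu][vector];

	if (irq == 0) {
		snprintf(buf, len, "do_IRQ: %d.%d No irq handler for vector",
			 cpu, vector);
		return -1;
	}
	if (len)
		buf[0] = '\0';	/* nothing to report */
	return irq;
}
```

With an empty table, model_do_irq(0, 55, ...) produces the same text as the dmesg line above; once a handler is recorded in that slot the lookup succeeds.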
I checked the behaviour on 4.14. The MSI write is delayed there in the same
way, but there is no spurious interrupt. There is no interrupt coming in at
all _BEFORE_ the device is put out of PCI_D0.
And this has certainly nothing to do with the vector management changes,
but I can't figure out yet what causes that spurious interrupt to be sent.
Any ideas welcome.
Thanks,
tglx