Re: [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation, bisected

From: Borislav Petkov
Date: Fri Aug 01 2014 - 12:11:17 EST


On Fri, Aug 01, 2014 at 04:39:22PM +0200, Borislav Petkov wrote:
> I could try to disable the IOMMU and see whether it still triggers.
> That could tell us something.

Ok, let me summarize what I've been able to observe so far:

* https://lkml.kernel.org/r/1406766807-5745-1-git-send-email-jiang.liu@xxxxxxxxxxxxxxx

this is definitely needed for suspend/resume so for that patch

Acked-and-tested-by: Borislav Petkov <bp@xxxxxxx>

(I need to somehow justify a whole day of bisecting today :-P)

* Then, this
---
diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
index 82044b5d6113..efc953119ce2 100644
--- a/drivers/usb/core/hcd-pci.c
+++ b/drivers/usb/core/hcd-pci.c
@@ -380,6 +380,8 @@ void usb_hcd_pci_shutdown(struct pci_dev *dev)
 	if (test_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags) &&
 			hcd->driver->shutdown) {
 		hcd->driver->shutdown(hcd);
+		if (usb_hcd_is_primary_hcd(hcd) && hcd->irq > 0)
+			free_irq(hcd->irq, hcd);
 		pci_disable_device(dev);
 	}
 }
--

is needed to avoid triggering the remove_proc_entry() WARN_ON.

Finally, even with the hunk above, suspend works fine but I sometimes(!)
see IOMMU page faults. Yes, sometimes: on some runs it suspends without
screaming at all, and on others I get a few of those right before the
machine goes down:

[ 89.040795] pcieport 0000:00:04.0: System wakeup enabled by ACPI
[ 89.061697] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]
[ 89.071871] ACPI: Preparing to enter system sleep state S5
[ 89.072117] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query honored via cmdline
[ 89.089832] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000080 flags=0x0020]
[ 89.102239] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]
[ 89.114684] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[ 89.127162] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[ 89.139576] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[ 89.152017] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[ 89.164481] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[ 89.176994] AMD-Vi: Event logged [[ 89.177657] reboot: Power down
[ 89.185286] acpi_power_off called
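
To see at a glance which device is behind most of those faults, a small
userspace helper along the following lines could tally them per device.
This is only a sketch: parse_fault(), tally_fault(), struct fault_count
and MAX_DEVS are names I made up for illustration, and the sscanf()
format assumes the exact "IO_PAGE_FAULT device=..." layout shown in the
log above.

```c
#include <stdio.h>
#include <string.h>

#define MAX_DEVS 16	/* hypothetical limit, enough for a quick look */

struct fault_count {
	char dev[16];		/* "bus:dev.fn" as printed by AMD-Vi */
	unsigned int count;	/* number of IO_PAGE_FAULT events seen */
};

/*
 * Extract the device field from one dmesg line. Returns 1 and fills
 * 'dev' if the line carries an IO_PAGE_FAULT event, 0 otherwise
 * (including truncated lines such as the last AMD-Vi one above).
 */
static int parse_fault(const char *line, char *dev, size_t len)
{
	const char *p = strstr(line, "IO_PAGE_FAULT device=");
	char buf[16];

	if (!p)
		return 0;
	if (sscanf(p, "IO_PAGE_FAULT device=%15s", buf) != 1)
		return 0;

	snprintf(dev, len, "%s", buf);
	return 1;
}

/*
 * Bump the counter for 'dev', adding a new table entry if needed.
 * Returns the new number of used entries; devices beyond MAX_DEVS
 * are silently ignored.
 */
static int tally_fault(struct fault_count *tab, int n, const char *dev)
{
	int i;

	for (i = 0; i < n; i++) {
		if (!strcmp(tab[i].dev, dev)) {
			tab[i].count++;
			return n;
		}
	}

	if (n < MAX_DEVS) {
		snprintf(tab[n].dev, sizeof(tab[n].dev), "%s", dev);
		tab[n].count = 1;
		n++;
	}
	return n;
}
```

Feeding it the dmesg output line by line (an fgets() loop calling
parse_fault() and then tally_fault() on a hit) and printing the table
at the end would be enough to compare runs.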

Now this device 00:12.0 is that OHCI thing for which we have the
hcd-pci.c hunk applied above, AFAICT:

00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller

so it must still be some timing issue there, after disabling the device
and *before* disabling the IOMMU.

I don't have a clue how to further debug that. Joerg is on CC.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.