Re: pinctrl_cherryview: massive error messages (handle_bad_irq)

From: Mika Westerberg
Date: Wed Oct 19 2016 - 15:24:39 EST


On Wed, Oct 19, 2016 at 06:32:57PM +0000, Christian Steiner wrote:
> Hello,
>
> since Kernel 4.8 I see a lot of the following error messages after
> resuming from suspend to RAM:
>
> > irq 117, desc: ffff88017a61e600, depth: 1, count: 0, unhandled: 0
> > ->handle_irq(): ffffffff8109b613, handle_bad_irq+0x0/0x1e0
> > ->irq_data.chip(): ffffffffa0020180, chv_pinctrl_exit+0x2d84/0x12 [pinctrl_cherryview]
> > ->action(): (null)
> > IRQ_NOPROBE set
>
> The messages appear a few thousand times and mess up the log. Despite
> the messages everything seems to work fine.
> I see the messages with Kernel 4.8.2 and 4.9-rc1. The do not appear with
> Kernel 4.7.8 or earlier.

We changed the driver not to mask all IRQs when it starts. See commit
bcb48cca23ec ("pinctrl: cherryview: Do not mask all interrupts in
probe").

> I have an Acer TravelMate B116-M laptop with an Intel Pentium N3700
> processor.
>
> Here are the full log messages of a suspend to RAM and resume cycle with
> Kernel 4.7.8:
>
> > [ 22.605418] PM: Syncing filesystems ... done.
> > [ 22.613640] PM: Preparing system for sleep (mem)
> > [ 22.615108] Freezing user space processes ... (elapsed 0.001 seconds) done.
> > [ 22.616930] Double checking all user space processes after OOM killer disable... (elapsed 0.000 seconds)
> > [ 22.616956] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> > [ 22.618611] PM: Suspending system (mem)
> > [ 22.618640] Suspending console(s) (use no_console_suspend to debug)
> > [ 22.619773] sd 0:0:0:0: [sda] Synchronizing SCSI cache
> > [ 22.626173] sd 0:0:0:0: [sda] Stopping disk
> > [ 23.670076] PM: suspend of devices complete after 1051.008 msecs
> > [ 23.691058] PM: late suspend of devices complete after 20.971 msecs
> > [ 23.692104] r8169 0000:03:00.0: System wakeup enabled by ACPI
> > [ 23.693087] xhci_hcd 0000:00:14.0: System wakeup enabled by ACPI
> > [ 23.711687] PM: noirq suspend of devices complete after 20.624 msecs
> > [ 23.711726] ACPI: Preparing to enter system sleep state S3
> > [ 23.712295] ACPI : EC: EC stopped
> > [ 23.712297] PM: Saving platform NVS memory
> > [ 23.712306] Disabling non-boot CPUs ...
> > [ 23.712590] Cannot set affinity for irq 191
> > [ 23.713647] smpboot: CPU 1 is now offline
> > [ 23.752643] smpboot: CPU 2 is now offline
> > [ 23.802930] smpboot: CPU 3 is now offline
> > [ 23.831827] ACPI: Low-level resume complete
> > [ 23.831938] ACPI : EC: EC started
> > [ 23.831939] PM: Restoring platform NVS memory
> > [ 23.832600] Enabling non-boot CPUs ...
> > [ 23.882819] x86: Booting SMP configuration:
> > [ 23.882821] smpboot: Booting Node 0 Processor 1 APIC 0x2
> > [ 23.885308] cache: parent cpu1 should not be sleeping
> > [ 23.885757] CPU1 is up
> > [ 23.932612] smpboot: Booting Node 0 Processor 2 APIC 0x4
> > [ 23.935171] cache: parent cpu2 should not be sleeping
> > [ 23.935674] CPU2 is up
> > [ 23.982749] smpboot: Booting Node 0 Processor 3 APIC 0x6
> > [ 23.985324] cache: parent cpu3 should not be sleeping
> > [ 23.985840] CPU3 is up
> > [ 23.986370] ACPI: Waking up from system sleep state S3
> > [ 24.012307] xhci_hcd 0000:00:14.0: System wakeup disabled by ACPI
> > [ 24.023417] PM: noirq resume of devices complete after 20.911 msecs
> > [ 25.052523] PM: early resume of devices complete after 1029.024 msecs
> > [ 25.053082] r8169 0000:03:00.0: System wakeup disabled by ACPI
> > [ 25.053124] iwlwifi 0000:02:00.0: RF_KILL bit toggled to disable radio.
> > [ 25.054222] rtc_cmos 00:04: System wakeup disabled by ACPI
> > [ 25.055645] sd 0:0:0:0: [sda] Starting disk
> > [ 25.075031] r8169 0000:03:00.0 eth0: link down
> > [ 25.374179] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> > [ 25.377616] ata1.00: configured for UDMA/133
> > [ 26.025736] PM: resume of devices complete after 973.238 msecs
> > [ 26.026109] PM: Finishing wakeup.
> > [ 26.026111] Restarting tasks ... done.
> > [ 27.767611] r8169 0000:03:00.0 eth0: link up
>
> And this is how it looks with Kernel 4.8.2:
>
> > [ 46.928231] PM: Syncing filesystems ... done.
> > [ 46.930391] PM: Preparing system for sleep (mem)
> > [ 46.932002] Freezing user space processes ... (elapsed 0.001 seconds) done.
> > [ 46.933721] Double checking all user space processes after OOM killer disable... (elapsed 0.000 seconds)
> > [ 46.933749] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> > [ 46.935457] PM: Suspending system (mem)
> > [ 46.935486] Suspending console(s) (use no_console_suspend to debug)
> > [ 46.936492] sd 0:0:0:0: [sda] Synchronizing SCSI cache
> > [ 46.936598] sd 0:0:0:0: [sda] Stopping disk
> > [ 48.026353] PM: suspend of devices complete after 1090.448 msecs
> > [ 48.057844] PM: late suspend of devices complete after 31.480 msecs
> > [ 48.059086] r8169 0000:03:00.0: System wakeup enabled by ACPI
> > [ 48.059240] xhci_hcd 0000:00:14.0: System wakeup enabled by ACPI
> > [ 48.117234] PM: noirq suspend of devices complete after 59.386 msecs
> > [ 48.117269] ACPI: Preparing to enter system sleep state S3
> > [ 48.117836] ACPI : EC: EC stopped
> > [ 48.117837] PM: Saving platform NVS memory
> > [ 48.117846] Disabling non-boot CPUs ...
> > [ 48.118130] Cannot set affinity for irq 191
> > [ 48.119186] smpboot: CPU 1 is now offline
> > [ 48.179126] smpboot: CPU 2 is now offline
> > [ 48.229450] smpboot: CPU 3 is now offline
> > [ 48.278338] ACPI: Low-level resume complete
> > [ 48.278415] ACPI : EC: EC started
> > [ 48.278416] PM: Restoring platform NVS memory
> > [ 48.279036] irq 117, desc: ffff88017a61e600, depth: 1, count: 0, unhandled: 0
> > [ 48.279044] ->handle_irq(): ffffffff8109b613, handle_bad_irq+0x0/0x1e0
> > [ 48.279052] ->irq_data.chip(): ffffffffa0020180, chv_pinctrl_exit+0x2d84/0x12 [pinctrl_cherryview]
> > [ 48.279054] ->action(): (null)
> > [ 48.279055] IRQ_NOPROBE set
> > [ 48.279057] unexpected IRQ trap at vector 75
> >
> [...]
> >
> > [ 48.659033] irq 117, desc: ffff88017a61e600, depth: 1, count: 0, unhandled: 0
> > [ 48.659037] ->handle_irq(): ffffffff8109b613, handle_bad_irq+0x0/0x1e0
> > [ 48.659041] ->irq_data.chip(): ffffffffa0020180, chv_pinctrl_exit+0x2d84/0x12 [pinctrl_cherryview]
> > [ 48.659042] ->action(): (null)
> > [ 48.659043] IRQ_NOPROBE set
> > [ 49.698698] PM: early resume of devices complete after 1089.068 msecs
> > [ 49.699601] r8169 0000:03:00.0: System wakeup disabled by ACPI
> > [ 49.699633] iwlwifi 0000:02:00.0: RF_KILL bit toggled to disable radio.
> > [ 49.708346] sd 0:0:0:0: [sda] Starting disk
> > [ 49.708581] rtc_cmos 00:04: System wakeup disabled by ACPI
> > [ 49.731492] r8169 0000:03:00.0 eth0: link down
> > [ 50.039718] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> > [ 50.041544] ata1.00: configured for UDMA/133
> > [ 50.381833] PM: resume of devices complete after 683.151 msecs
> > [ 50.385539] PM: Finishing wakeup.
> > [ 50.385541] Restarting tasks ... done.
>
> Are the messages just informational or is this something serious that
> should be fixed?

Basically it means the machine has an interrupt line active an nothing
is handling it. It is not serious but rather annoying.

I've seen this happening on proto hardware but it was fixed with
firmware update.

> If you need more information or if I should test a patch, please let me
> know.

Can you send me output of /proc/interrupts and
/sys/kernel/debug/pinctrl/INT33FF:*/pins.