Re: [PATCH] x86/hpet: Cure interface abuse in the resume path

From: Thomas Gleixner
Date: Tue Aug 01 2017 - 03:43:40 EST


On Tue, 1 Aug 2017, Tomi Sarvela wrote:
> On 31/07/17 23:07, Thomas Gleixner wrote:
> > The HPET resume path abuses irq_domain_[de]activate_irq() to restore the
> > MSI message in the HPET chip for the boot CPU on resume and it relies on an
> > implementation detail of the interrupt core code, which magically makes the
> > HPET unmask call invoked via an irq_disable/enable pair. This worked as long
> > as the irq code unconditionally invoked the unmask() callback. With the
> > recent changes which keep track of the masked state to avoid expensive
> > hardware access, this no longer works. As a consequence the HPET timer
> > interrupts are not unmasked, which breaks resume as the boot CPU waits
> > forever for a timer interrupt to arrive.
> >
> > Make the restore of the MSI message explicit and invoke the unmask()
> > function directly. While at it get rid of the pointless affinity setting as
> > nothing can change the affinity of the interrupt and the vector across
> > suspend/resume. The restore of the MSI message reestablishes the previous
> > affinity setting which is the correct one.
> >
> > Fixes: bf22ff45bed6 ("genirq: Avoid unnecessary low level irq function
> > calls")
> > Reported-by: Martin Peres <martin.peres@xxxxxxxxxxxxxxx>
> > Reported-by: Tomi Sarvela <tomi.p.sarvela@xxxxxxxxx>
> > Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Cc: jeffy.chen@xxxxxxxxxxxxxx
> > Cc: Marc Zyngier <marc.zyngier@xxxxxxx>
> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Cc: "Rafael J. Wysocki" <rafael.j.wysocki@xxxxxxxxx>
>
> Tested-by: Tomi Sarvela <tomi.p.sarvela@xxxxxxxxx>
>
> Tested only on the regressed Eagle Lake testhost. This patch fixes the
> suspend/resume issue.

Tomi, can you please do me a favor?

Use plain 4.13-rc3 (without that patch) and add the following on the kernel
command line: 'nohpet'. Boot the machine and capture and provide the output
of

# dmesg
# cat /proc/interrupts
# cat /proc/timer_list

Then try the suspend cycle again.
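
If it is more convenient, the three outputs can be captured into files
before suspending. The following is only a rough sketch: the directory name
hpet-debug is a placeholder, and it assumes the script is run as root, since
/proc/timer_list and dmesg are usually not readable by ordinary users.

```shell
#!/bin/sh
# Sketch only: collect the requested outputs into one directory.
# "hpet-debug" is a placeholder default; pass another path as $1.
out="${1:-hpet-debug}"
mkdir -p "$out"

# Kernel log from the boot with 'nohpet' on the command line.
dmesg > "$out/dmesg.txt"

# Interrupt counters per CPU.
cat /proc/interrupts > "$out/interrupts.txt"

# State of the clock event devices and pending timers.
cat /proc/timer_list > "$out/timer_list.txt"
```

Checking that the captured dmesg contains 'nohpet' in the "Command line:"
entry confirms that the parameter was actually applied.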

Thanks,

tglx