Re: [PATCH v2] PM: s2idle: Make sure CPUs will wakeup directly on resume

From: Rafael J. Wysocki
Date: Mon Apr 08 2024 - 09:44:58 EST


On Mon, Apr 8, 2024 at 2:43 PM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
>
> On Mon, 8 Apr 2024 at 09:02, Anna-Maria Behnsen
> <anna-maria@xxxxxxxxxxxxx> wrote:
> >
> > s2idle works like a regular suspend with freezing processes and freezing
> > devices. All CPUs except the control CPU go into idle. Once this is
> > completed the control CPU kicks all other CPUs out of idle, so that they
> > reenter the idle loop and then enter s2idle state. The control CPU then
> > issues an swait() on the suspend state and therefore enters the idle loop
> > as well.
> >
> > Due to being kicked out of idle, the other CPUs leave their NOHZ states,
> > which means the tick is active and the corresponding hrtimer is programmed
> > to the next jiffy.
> >
> > On entering s2idle the CPUs shut down their local clockevent device to
> > prevent wakeups. The last CPU which enters s2idle shuts down its local
> > clockevent and freezes timekeeping.
> >
> > On resume, one of the CPUs receives the wakeup interrupt, unfreezes
> > timekeeping and its local clockevent and starts the resume process. At that
> > point all other CPUs are still in s2idle with their clockevents switched
> > off. They only resume when they are kicked by another CPU or after resuming
> > devices and then receiving a device interrupt.
> >
> > That means there is no guarantee that all CPUs will wake up directly on
> > resume. As a consequence there is no guarantee that timers which are queued
> > on those CPUs and should expire directly after resume are handled. Also
> > timer list timers which are remotely queued to one of those CPUs after
> > resume will not result in a reprogramming IPI as the tick is
> > active. Queueing a hrtimer will also not result in a reprogramming IPI
> > because the first hrtimer event is already in the past.
> >
> > The recent introduction of the timer pull model (7ee988770326 ("timers:
> > Implement the hierarchical pull model")) amplifies this problem if the
> > current migrator is one of the CPUs that have not been woken up. When a
> > non-pinned timer list timer is queued and the queuing CPU goes idle, it
> > relies on the still suspended migrator CPU to expire the timer, which
> > happens only by chance.
> >
> > The problem has existed since commit 8d89835b0467 ("PM: suspend: Do not
> > pause cpuidle in the suspend-to-idle path"). There the cpuidle_pause()
> > call, which in turn invoked a wakeup for all idle CPUs, was moved to a
> > later point in the resume process. That point might not be reached, or
> > might be reached very late, because it waits on a timer of a still
> > suspended CPU.
> >
> > Address this by kicking all CPUs out of idle after the control CPU returns
> > from swait() so that they resume their timers and restore consistent system
> > state.
> >
> > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218641
> > Fixes: 8d89835b0467 ("PM: suspend: Do not pause cpuidle in the suspend-to-idle path")
> > Signed-off-by: Anna-Maria Behnsen <anna-maria@xxxxxxxxxxxxx>
> > Reviewed-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Tested-by: Mario Limonciello <mario.limonciello@xxxxxxx>
> > Cc: stable@xxxxxxxxxx
>
> Thanks for the detailed commit message! Please add:
>
> Reviewed-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx>

Applied as 6.9-rc material, many thanks to everyone involved!