[tip: sched/core] sched/cpuidle: Comment about timers requirements VS idle handler

From: tip-bot2 for Frederic Weisbecker
Date: Wed Nov 15 2023 - 04:05:06 EST


The following commit has been merged into the sched/core branch of tip:

Commit-ID: dd5403869a40595eb953f12e8cd2bb57bb88bb67
Gitweb: https://git.kernel.org/tip/dd5403869a40595eb953f12e8cd2bb57bb88bb67
Author: Frederic Weisbecker <frederic@xxxxxxxxxx>
AuthorDate: Tue, 14 Nov 2023 14:38:39 -05:00
Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Wed, 15 Nov 2023 09:57:51 +01:00

sched/cpuidle: Comment about timers requirements VS idle handler

Add missing explanation concerning IRQs re-enablement constraints in
the cpuidle path against timers.

Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Acked-by: Rafael J. Wysocki <rafael@xxxxxxxxxx>
Link: https://lkml.kernel.org/r/20231114193840.4041-2-frederic@xxxxxxxxxx
---
kernel/sched/idle.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 565f837..3123192 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -258,6 +258,36 @@ static void do_idle(void)
while (!need_resched()) {
rmb();

+ /*
+ * Interrupts shouldn't be re-enabled from that point on until
+ * the CPU sleeping instruction is reached. Otherwise an interrupt
+ * may fire and queue a timer that would be ignored until the CPU
+ * wakes from the sleeping instruction. And testing need_resched()
+ * doesn't tell about pending needed timer reprogram.
+ *
+ * Several cases to consider:
+ *
+ * - SLEEP-UNTIL-PENDING-INTERRUPT based instructions such as
+ * "wfi" or "mwait" are fine because they can be entered with
+ * interrupt disabled.
+ *
+ * - sti;mwait() couple is fine because the interrupts are
+ * re-enabled only upon the execution of mwait, leaving no gap
+ * in-between.
+ *
+ * - ROLLBACK based idle handlers with the sleeping instruction
+ * called with interrupts enabled are NOT fine. In this scheme
+ * when the interrupt detects it has interrupted an idle handler,
+ * it rolls back to its beginning which performs the
+ * need_resched() check before re-executing the sleeping
+ * instruction. This can leak a pending needed timer reprogram.
+ * If such a scheme is really mandatory due to the lack of an
+ * appropriate CPU sleeping instruction, then a FAST-FORWARD
+ * must instead be applied: when the interrupt detects it has
+ * interrupted an idle handler, it must resume to the end of
+ * this idle handler so that the generic idle loop is iterated
+ * again to reprogram the tick.
+ */
local_irq_disable();

if (cpu_is_offline(cpu)) {