[PATCH v2 28/35] sched: support preempt=full under PREEMPT_AUTO

From: Ankur Arora
Date: Mon May 27 2024 - 20:42:13 EST


Under PREEMPT_AUTO, the default preemption policy for preempt=full is
to minimize latency, and thus to always schedule eagerly. This matches
the behaviour of CONFIG_PREEMPT, and so should result in similar
performance.
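
As a rough userspace illustration (not part of this patch), the decision
order can be modelled as below. All model_* names are hypothetical
stand-ins for the kernel's resched_t, enum resched_opt and task_struct,
the full_preemption flag stands in for preempt_model_preemptible(), and
the lazy fall-through is an assumed simplification of whatever follows
the idle-task check in the real function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum model_resched { MODEL_RESCHED_NOW, MODEL_RESCHED_LAZY };
enum model_opt { MODEL_OPT_DEFAULT, MODEL_OPT_FORCE };

struct model_task {
	bool is_idle;	/* models is_idle_task(curr) */
};

/*
 * Simplified model of the translation order described above:
 * a forced request, full preemption, or an idle current task all
 * reschedule immediately; everything else is deferred (assumed).
 */
enum model_resched model_translate(const struct model_task *curr,
				   enum model_opt opt,
				   bool full_preemption)
{
	assert(curr != NULL);

	if (opt == MODEL_OPT_FORCE)
		return MODEL_RESCHED_NOW;

	/* preempt=full: every task is rescheduled eagerly. */
	if (full_preemption)
		return MODEL_RESCHED_NOW;

	if (curr->is_idle)
		return MODEL_RESCHED_NOW;

	/* Assumed fall-through: defer to a later rescheduling point. */
	return MODEL_RESCHED_LAZY;
}
```

In this model, a non-idle task with a non-forced request is only
deferred when full_preemption is false.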

Comparing a scheduling/IPC workload:

# perf stat -a -e cs --repeat 10 -- perf bench sched messaging -g 20 -t -l 5000

PREEMPT_AUTO, preempt=full

3,080,508 context-switches ( +- 0.64% )
3.65171 +- 0.00654 seconds time elapsed ( +- 0.18% )

PREEMPT_DYNAMIC, preempt=full

3,087,527 context-switches ( +- 0.33% )
3.60163 +- 0.00633 seconds time elapsed ( +- 0.18% )

Looking at the breakdown between voluntary and involuntary
context-switches, we see almost identical behaviour as well.

PREEMPT_AUTO, preempt=full

2087910.00 +- 34720.95 voluntary context-switches ( +- 1.660% )
784437.60 +- 19827.79 involuntary context-switches ( +- 2.520% )

PREEMPT_DYNAMIC, preempt=full

2102879.60 +- 22767.11 voluntary context-switches ( +- 1.080% )
801189.90 +- 21324.18 involuntary context-switches ( +- 2.660% )
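
To put a number on "almost identical", the relative gap between two
reported means can be computed with a small helper such as the one
below. This is only a sketch: rel_diff_pct() is a hypothetical name,
and taking the PREEMPT_DYNAMIC figure as the baseline is an assumption.

```c
#include <assert.h>

/*
 * Relative difference of 'value' from 'baseline', in percent.
 * Hypothetical helper for comparing the means quoted above.
 */
double rel_diff_pct(double value, double baseline)
{
	double delta = value - baseline;

	assert(baseline != 0.0);

	/* Take absolute values by hand to avoid depending on libm. */
	if (delta < 0.0)
		delta = -delta;
	if (baseline < 0.0)
		baseline = -baseline;

	return delta / baseline * 100.0;
}
```

For example, rel_diff_pct(3080508.0, 3087527.0) gives roughly 0.23 for
the total context-switch counts.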

Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Originally-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Link: https://lore.kernel.org/lkml/87jzshhexi.ffs@tglx/
Signed-off-by: Ankur Arora <ankur.a.arora@xxxxxxxxxx>
---
kernel/sched/core.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c3ba33c77053..c25cccc09b65 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1035,9 +1035,10 @@ void wake_up_q(struct wake_q_head *head)
* For preemption models other than PREEMPT_AUTO: always schedule
* eagerly.
*
- * For PREEMPT_AUTO: schedule idle threads eagerly, allow everything else
- * to finish its time quanta, and mark for rescheduling at the next exit
- * to user.
+ * For PREEMPT_AUTO: schedule idle threads eagerly, and under full
+ * preemption, all tasks eagerly. Otherwise, allow everything else
+ * to finish its time quanta, and mark for rescheduling at the next
+ * exit to user.
*/
static resched_t resched_opt_translate(struct task_struct *curr,
enum resched_opt opt)
@@ -1048,6 +1049,9 @@ static resched_t resched_opt_translate(struct task_struct *curr,
if (opt == RESCHED_FORCE)
return RESCHED_NOW;

+ if (preempt_model_preemptible())
+ return RESCHED_NOW;
+
if (is_idle_task(curr))
return RESCHED_NOW;

@@ -8997,7 +9001,9 @@ static void __sched_dynamic_update(int mode)
pr_warn("%s: preempt=full is not recommended with CONFIG_PREEMPT_RCU=n",
PREEMPT_MODE);

- preempt_dynamic_mode = preempt_dynamic_undefined;
+ if (mode != preempt_dynamic_mode)
+ pr_info("%s: full\n", PREEMPT_MODE);
+ preempt_dynamic_mode = mode;
break;
}
}
--
2.31.1