Re: Re: [PATCH 2/2] sched/fair: Fix premature check of WAKEUP_PREEMPTION

From: Abel Wu
Date: Sun Feb 23 2025 - 03:45:23 EST


Hi Madadi,

On 2/23/25 2:16 AM, Madadi Vineeth Reddy Wrote:
On 21/02/25 21:27, Abel Wu wrote:
On 2/21/25 7:49 PM, Vincent Guittot Wrote:
On Fri, 21 Feb 2025 at 12:12, Abel Wu <wuyun.abel@xxxxxxxxxxxxx> wrote:

Idle tasks are by definition preempted by non-idle tasks whether feat
WAKEUP_PREEMPTION is enabled or not. This isn't true any longer since

I don't think that's true. Only "sched_idle never preempts others" is
always true; sched_feat(WAKEUP_PREEMPTION) is mainly there for
debugging, so if WAKEUP_PREEMPTION is false then nobody preempts
others at wakeup, whether idle, batch or normal.

Hi Vincent, thanks for your comment!

The SCHED_IDLE "definition" of being preempted by non-idle tasks comes
from commit 6bc912b71b6f ("sched: SCHED_OTHER vs SCHED_IDLE isolation")
which said:

    - no SCHED_IDLE buddies
    - never let SCHED_IDLE preempt on wakeup
    - always preempt SCHED_IDLE on wakeup
    - limit SLEEPER fairness for SCHED_IDLE

and that commit let SCHED_IDLE tasks be preempted before checking
WAKEUP_PREEMPTION. The rules were introduced in 2009, and to the best of
my knowledge there has been no behavior change since. Please correct me
if I missed anything.

As Vincent mentioned, WAKEUP_PREEMPTION is primarily for debugging. Maybe
it would help to document that SCHED_IDLE tasks are not preempted by non-idle
tasks when WAKEUP_PREEMPTION is disabled. Otherwise, the intent of having no
preemptions for debugging would be lost.

Thoughts?

I am not sure I really understand the purpose of this debug feature.
If it is meant to provide a way to check whether a performance
degradation of a certain workload is due to overscheduling, then do we
really care about the performance of SCHED_IDLE workloads, and why?

IMHO preempting SCHED_IDLE before checking WAKEUP_PREEMPTION preserves
the IDLE semantics, i.e. SCHED_IDLE tasks trying to behave like the real
idle task. It seems weird to me that we treat sched-idle cpus as idle
yet don't let non-idle tasks run immediately on them in the debug case.
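
To make the two orderings concrete, here is a minimal sketch in plain C.
It is a toy model, not the code in kernel/sched/fair.c: struct toy_task,
both function names and the fair_wants_preempt parameter (standing in for
the normal fairness decision) are made up for illustration. It only
encodes the rules quoted above from commit 6bc912b71b6f and the position
of the WAKEUP_PREEMPTION check relative to them.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a task; only the SCHED_IDLE property matters here. */
struct toy_task {
	bool is_sched_idle;
};

/*
 * Ordering argued for in this mail: the SCHED_IDLE rules are applied
 * first, so they hold whether or not WAKEUP_PREEMPTION is enabled.
 */
static bool preempt_idle_rules_first(const struct toy_task *curr,
				     const struct toy_task *waking,
				     bool wakeup_preemption,
				     bool fair_wants_preempt)
{
	/* always preempt SCHED_IDLE on wakeup */
	if (curr->is_sched_idle && !waking->is_sched_idle)
		return true;
	/* never let SCHED_IDLE preempt on wakeup */
	if (waking->is_sched_idle && !curr->is_sched_idle)
		return false;
	/* the debug knob only gates the remaining cases */
	if (!wakeup_preemption)
		return false;
	return fair_wants_preempt;
}

/*
 * Alternative ordering: the debug knob is checked first, so with it
 * disabled nobody preempts at wakeup, SCHED_IDLE current task included.
 */
static bool preempt_feature_first(const struct toy_task *curr,
				  const struct toy_task *waking,
				  bool wakeup_preemption,
				  bool fair_wants_preempt)
{
	if (!wakeup_preemption)
		return false;
	if (curr->is_sched_idle && !waking->is_sched_idle)
		return true;
	if (waking->is_sched_idle && !curr->is_sched_idle)
		return false;
	return fair_wants_preempt;
}
```

The two functions only disagree in one case: WAKEUP_PREEMPTION disabled,
a SCHED_IDLE task running, and a non-idle task waking up. That is exactly
the case being discussed in this thread.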

Thanks,
Abel