Re: [PATCH 2/2] sched/fair: Fix premature check of WAKEUP_PREEMPTION

From: Madadi Vineeth Reddy
Date: Sun Feb 23 2025 - 05:26:14 EST


On 23/02/25 14:14, Abel Wu wrote:
> Hi Madadi,
>
> On 2/23/25 2:16 AM, Madadi Vineeth Reddy Wrote:
>> On 21/02/25 21:27, Abel Wu wrote:
>>> On 2/21/25 7:49 PM, Vincent Guittot Wrote:
>>>> On Fri, 21 Feb 2025 at 12:12, Abel Wu <wuyun.abel@xxxxxxxxxxxxx> wrote:
>>>>>
>>>>> Idle tasks are by definition preempted by non-idle tasks whether feat
>>>>> WAKEUP_PREEMPTION is enabled or not. This isn't true any longer since
>>>>
>>>> I don't think it's true, only "sched_idle never preempts others" is
>>>> always true but sched_feat(WAKEUP_PREEMPTION) is mainly there for
>>>> debug purpose so if WAKEUP_PREEMPTION is false then nobody preempts
>>>> others at wakeup, idle, batch or normal
>>>
>>> Hi Vincent, thanks for your comment!
>>>
>>> The SCHED_IDLE "definition" of being preempted by non-idle tasks comes
>>> from commit 6bc912b71b6f ("sched: SCHED_OTHER vs SCHED_IDLE isolation")
>>> which said:
>>>
>>>      - no SCHED_IDLE buddies
>>>      - never let SCHED_IDLE preempt on wakeup
>>>      - always preempt SCHED_IDLE on wakeup
>>>      - limit SLEEPER fairness for SCHED_IDLE
>>>
>>> and that commit let it be preempted before checking WAKEUP_PREEMPTION.
>>> The rules were introduced in 2009, and to the best of my knowledge there
>>> seemed no behavior change ever since. Please correct me if I missed
>>> anything.
>>
>> As Vincent mentioned, WAKEUP_PREEMPTION is primarily for debugging. Maybe
>> it would help to document that SCHED_IDLE tasks are not preempted by non-idle
>> tasks when WAKEUP_PREEMPTION is disabled. Otherwise, the intent of having no
>> preemptions for debugging would be lost.
>>
>> Thoughts?
>
> I am not sure I really understand the purpose of this debug feature.
> If it wants to provide a way to check whether a performance degrade of
> certain workload is due to overscheduling or not, then do we really
> care about performance of SCHED_IDLE workloads and why?

It's true that we may not be too concerned about performance with
SCHED_IDLE. The question is whether to preserve the original SCHED_IDLE
definition or to honor WAKEUP_PREEMPTION, which applies across all
policies. Since the feature is true by default, I am not sure. Either
way seems OK to me.
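
To make the two orderings concrete, here is a toy userspace sketch. It
is not the kernel code: struct toy_task, both function names and the
fair_says_preempt parameter (a stand-in for whatever the regular fair
wakeup comparison would decide) are made up for illustration. The only
difference between the two functions is whether the "always preempt
SCHED_IDLE" rule is evaluated before or after the feature check.

```c
#include <stdbool.h>

/* Toy model only: names are hypothetical, not taken from kernel/sched/fair.c. */
struct toy_task {
	bool idle_policy;	/* true if the task is SCHED_IDLE */
};

/*
 * Ordering A: the "always preempt SCHED_IDLE" rule comes first, so
 * turning WAKEUP_PREEMPTION off cannot suppress it.
 */
static bool preempt_idle_check_first(const struct toy_task *curr,
				     const struct toy_task *woken,
				     bool wakeup_preemption,
				     bool fair_says_preempt)
{
	/* A non-idle wakee always preempts an idle-policy current task. */
	if (curr->idle_policy && !woken->idle_policy)
		return true;

	/* Feature off: no other wakeup preemption happens. */
	if (!wakeup_preemption)
		return false;

	/* An idle-policy wakee never preempts a non-idle current task. */
	if (woken->idle_policy && !curr->idle_policy)
		return false;

	return fair_says_preempt;
}

/*
 * Ordering B: the feature check comes first, so with WAKEUP_PREEMPTION
 * off nobody preempts at wakeup, idle or not.
 */
static bool preempt_feature_check_first(const struct toy_task *curr,
					const struct toy_task *woken,
					bool wakeup_preemption,
					bool fair_says_preempt)
{
	/* Feature off: no wakeup preemption at all. */
	if (!wakeup_preemption)
		return false;

	/* A non-idle wakee always preempts an idle-policy current task. */
	if (curr->idle_policy && !woken->idle_policy)
		return true;

	/* An idle-policy wakee never preempts a non-idle current task. */
	if (woken->idle_policy && !curr->idle_policy)
		return false;

	return fair_says_preempt;
}
```

In this model the two functions disagree in exactly one case: curr is
SCHED_IDLE, the woken task is non-idle, and the feature is off. Ordering
A preempts and ordering B does not. With the feature on, they return the
same result for every input.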

Thanks,
Madadi Vineeth Reddy

>
> IMHO preempting SCHED_IDLE before WAKEUP_PREEMPTION is to preserve the
> IDLE semantics trying to behave like real idle task. It is somehow
> weird to me that we treat sched-idle cpus as idle while don't let the
> non-idle tasks run immediately on sched-idle cpus on debug case.
>
> Thanks,
>     Abel
>