Re: [PATCH 2/2] sched/fair: Fix premature check of WAKEUP_PREEMPTION

From: Phil Auld
Date: Mon Feb 24 2025 - 09:19:19 EST


On Mon, Feb 24, 2025 at 02:47:13PM +0100 Vincent Guittot wrote:
> On Sun, 23 Feb 2025 at 12:22, Abel Wu <wuyun.abel@xxxxxxxxxxxxx> wrote:
> >
> > On 2/23/25 6:25 PM, Madadi Vineeth Reddy Wrote:
> > > On 23/02/25 14:14, Abel Wu wrote:
> > >> Hi Madadi,
> > >>
> > >> On 2/23/25 2:16 AM, Madadi Vineeth Reddy Wrote:
> > >>> On 21/02/25 21:27, Abel Wu wrote:
> > >>>> On 2/21/25 7:49 PM, Vincent Guittot Wrote:
> > >>>>> On Fri, 21 Feb 2025 at 12:12, Abel Wu <wuyun.abel@xxxxxxxxxxxxx> wrote:
> > >>>>>>
> > >>>>>> Idle tasks are by definition preempted by non-idle tasks whether feat
> > >>>>>> WAKEUP_PREEMPTION is enabled or not. This isn't true any longer since
> > >>>>>
> > >>>>> I don't think it's true, only "sched_idle never preempts others" is
> > >>>>> always true but sched_feat(WAKEUP_PREEMPTION) is mainly there for
> > >>>>> debug purpose so if WAKEUP_PREEMPTION is false then nobody preempts
> > >>>>> others at wakeup, idle, batch or normal
> > >>>>
> > >>>> Hi Vincent, thanks for your comment!
> > >>>>
> > >>>> The SCHED_IDLE "definition" of being preempted by non-idle tasks comes
> > >>>> from commit 6bc912b71b6f ("sched: SCHED_OTHER vs SCHED_IDLE isolation")
> > >>>> which said:
> > >>>>
> > >>>> - no SCHED_IDLE buddies
> > >>>> - never let SCHED_IDLE preempt on wakeup
> > >>>> - always preempt SCHED_IDLE on wakeup
> > >>>> - limit SLEEPER fairness for SCHED_IDLE
> > >>>>
> > >>>> and that commit let it be preempted before checking WAKEUP_PREEMPTION.
> > >>>> The rules were introduced in 2009, and to the best of my knowledge there
> > >>>> has been no behavior change since. Please correct me if I missed
> > >>>> anything.
> > >>>
> > >>> As Vincent mentioned, WAKEUP_PREEMPTION is primarily for debugging. Maybe
> > >>> it would help to document that SCHED_IDLE tasks are not preempted by non-idle
> > >>> tasks when WAKEUP_PREEMPTION is disabled. Otherwise, the intent of having no
> > >>> preemptions for debugging would be lost.
> > >>>
> > >>> Thoughts?
> > >>
> > >> I am not sure I really understand the purpose of this debug feature.
> > >> If it is meant to provide a way to check whether a performance degradation
> > >> of a certain workload is due to overscheduling or not, then do we really
> > >> care about the performance of SCHED_IDLE workloads, and why?
> > >
> > > It's true that we may not be too concerned about performance with
> > > SCHED_IDLE. The issue is preserving the original SCHED_IDLE definition
> > > versus WAKEUP_PREEMPTION, which applies across all policies. Since by
> >
> > Yes, exactly.
> >
> > > default the feature is true. I am not sure. Either way seems ok to me.
> >
> > Hi Vincent,
> >
> > Since Peter gave priority to SCHED_IDLE semantics over WAKEUP_PREEMPTION
> > in his commit 6bc912b71b6f ("sched: SCHED_OTHER vs SCHED_IDLE isolation"),
> > and that choice was kept unchanged for a long time until the recently merged
> > commit faa42d29419d ("sched/fair: Make SCHED_IDLE entity be preempted in strict hierarchy"),
> > which did not seem intended to change it, shall we restore the choice for now
> > and leave the discussion of the scope of WAKEUP_PREEMPTION to the future, once
> > a use case shows up?
>
> Or we should just remove it. I'm curious to know who has used it during
> the last couple of years? Keep in mind that lazy preemption adds
> another level, as check_preempt_wakeup_fair() uses it, so sched-idle
> tasks might not always be immediately preempted anyway.
>

It can be helpful to be able to turn that off when chasing performance
issues. See the DELAY_DEQUEUE thread from a few months back. In that
case we never got to a good answer, but did use NO_WAKEUP_PREEMPTION
during debugging to take out some variables at least. FWIW.
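
To make the two orderings discussed in this thread concrete, here is a
toy model, not the kernel code. The struct, its fields and both function
names are hypothetical, and fair_wants_preempt is only a stand-in for
whatever the normal fair-class wakeup decision would be. It just shows
what NO_WAKEUP_PREEMPTION ends up meaning for a SCHED_IDLE curr depending
on which check comes first:

```c
#include <stdbool.h>

/* Toy model only: hypothetical names, not kernel symbols. */
struct wakeup_case {
	bool curr_is_idle;       /* running task is SCHED_IDLE */
	bool wakee_is_idle;      /* waking task is SCHED_IDLE */
	bool wakeup_preemption;  /* modelled state of sched_feat(WAKEUP_PREEMPTION) */
	bool fair_wants_preempt; /* stand-in for the normal fair-class decision */
};

/*
 * SCHED_IDLE rule first: a non-idle wakee preempts an idle curr
 * even when the feature is off.
 */
static bool preempt_idle_rule_first(struct wakeup_case c)
{
	if (c.curr_is_idle && !c.wakee_is_idle)
		return true;
	if (c.wakee_is_idle || !c.wakeup_preemption)
		return false;
	return c.fair_wants_preempt;
}

/*
 * Feature check first: with the feature off, nobody preempts
 * anybody at wakeup, idle or not.
 */
static bool preempt_feature_first(struct wakeup_case c)
{
	if (!c.wakeup_preemption)
		return false;
	if (c.curr_is_idle && !c.wakee_is_idle)
		return true;
	if (c.wakee_is_idle)
		return false;
	return c.fair_wants_preempt;
}
```

In this model the two functions only disagree in one case: idle curr,
non-idle wakee, feature off. That single case is the behavior being
argued about here.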


Cheers,
Phil

>
> >
> > Thanks,
> > Abel
> >
>
