Re: [PATCH] sched_ext: Always call put_prev_task() with scx enabled

From: Andrea Righi
Date: Mon Oct 14 2024 - 11:02:12 EST


On Mon, Oct 14, 2024 at 10:36:08AM +0200, Peter Zijlstra wrote:
> On Sun, Oct 13, 2024 at 07:39:28PM +0200, Andrea Righi wrote:
> > With the consolidation of put_prev_task/set_next_task(), we are now
> > skipping the sched_ext ops.stopping/running() transitions when the
> > previous and next tasks are the same, see commit 436f3eed5c69 ("sched:
> > Combine the last put_prev_task() and the first set_next_task()").
> >
> > While this optimization makes sense in general, it can negatively impact
> > performance in some user-space schedulers, that expect to handle such
> > transitions when tasks exhaust their timeslice (see SCX_OPS_ENQ_LAST).
> >
> > For example, scx_rustland suffers a significant performance regression
> > (e.g., gaming benchmarks drop from ~60fps to ~10fps).
> >
> > To fix this, ensure that put_prev_task()/set_next_task() are never
> > skipped when the scx scheduling class is enabled, allowing the scx class
> > to handle such transitions.
> >
> > This change restores the previous behavior, fixing the performance
> > regression in scx_rustland.
> >
> > Link: https://github.com/sched-ext/scx/issues/788
>
> How persistent are links like that? In general I strongly discourage
> links to things not pointing to kernel.org resources.

This one persists also after the issue is marked as resolved, I only
added it to provide more context about the problem. However, the the
commit description already contains all the details, so we can probably
get rid of the link.

>
> > @@ -2523,6 +2508,21 @@ DECLARE_STATIC_KEY_FALSE(__scx_switched_all); /* all fair class tasks on SCX */
> > #define scx_switched_all() false
> > #endif /* !CONFIG_SCHED_CLASS_EXT */
> >
> > +static inline void put_prev_set_next_task(struct rq *rq,
> > + struct task_struct *prev,
> > + struct task_struct *next)
> > +{
> > + WARN_ON_ONCE(rq->curr != prev);
> > +
> > + __put_prev_set_next_dl_server(rq, prev, next);
> > +
> > + if (next == prev && !scx_enabled())
> > + return;
>
> Does that not also want to include a 'next->sched_class ==
> &ext_sched_class' clause ? And a comment?

Good point.

>
> > +
> > + prev->sched_class->put_prev_task(rq, prev, next);
> > + next->sched_class->set_next_task(rq, next, true);
> > +}
>
> And is there really no way scx can infer this happened? We just did pick
> after all, that can see this coming a mile of.

I'll do some testing, we can probably infer this in pick_task_scx() and
make adjustments there, or possibly in scx_update_idle() within the idle
class, since that seems to be where the real issue lies.

Thanks for looking at this,
-Andrea