Re: [PATCH v2 00/10] sched: Flatten the pick

From: Vincent Guittot

Date: Tue May 19 2026 - 06:39:36 EST

On Mon, 18 May 2026 at 23:12, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Mon, May 18, 2026 at 03:34:51PM +0200, Vincent Guittot wrote:
> > On Wed, 13 May 2026 at 13:35, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > On Tue, May 12, 2026 at 10:42:33AM +0200, Vincent Guittot wrote:
> > >
> > > > I haven't reviewed the patches yet but I ran some tests with it while
> > > > testing sched latency related changes for short slice wakeup
> > > > preemption. I see some large hackbench regressions with this series
> > > > on HMP systems with and without EAS. Those figures are unexpected
> > > > because the benchmarks run on the root cfs.
> > > >
> > > > One example with hackbench 8 groups thread pipe:
> > > >
> > > > dragonboard rb5 with EAS
> > > >   kernel           slice   result            vs tip 2.8ms
> > > >   tip/sched/core   2.8ms   0.748 (+/-4.6%)
> > > >   tip/sched/core   16ms    0.621 (+/-3.6%)   +17%
> > > >   +this patchset   2.8ms   1.915 (+/-7.9%)   -156%
> > > >   +this patchset   16ms    0.689 (+/-9.1%)   +8%
> > > >
> > > > radxa orion6 HMP without EAS
> > > >   kernel           slice   result            vs tip 2.8ms
> > > >   tip/sched/core   2.8ms   0.588 (+/-5.8%)
> > > >   tip/sched/core   16ms    0.677 (+/-5.9%)   -15%
> > > >   +this patchset   2.8ms   1.505 (+/-10%)    -156%
> > > >   +this patchset   16ms    1.071 (+/-5.9%)   -82%
> > > >
> > > > Increasing the slice partly removes the regressions, but this is
> > > > surprising because the bench runs at root cfs and I thought that the
> > > > results would not change in such a case.
> > >
> > > D'oh :/
> > >
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index e54da4c6c945..77d0e1937f2c 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -9071,7 +9071,7 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
> > > enum preempt_wakeup_action preempt_action = PREEMPT_WAKEUP_PICK;
> > > struct task_struct *donor = rq->donor;
> > > struct sched_entity *nse, *se = &donor->se, *pse = &p->se;
> > > - struct cfs_rq *cfs_rq = task_cfs_rq(donor);
> > > + struct cfs_rq *cfs_rq = &rq->cfs;
> >
> > I tested this patch on top of the series but it doesn't fix the perf
> > regression on rb5
> >
> > hackbench 8 groups thread pipe is still at 1.907(+/-7.6%) with default
> > slice duration
>
> Weird, I can't reproduce anymore with this fixed :/
>
> I'll try more hackbench variants tomorrow I suppose.

I tried several configurations:
- HMP with EAS enabled
- HMP without EAS enabled (performance cpufreq governor)
- SMP (only the 4 little cores)

All of them show large hackbench regressions, which are almost fully
recovered when increasing the slice from 2.8ms to 16ms.
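
For reference, the percentages in the table quoted above appear to be the
relative change in hackbench time against the tip/sched/core result with the
2.8ms slice, with a negative value meaning a slower run. Below is a minimal
sketch of that arithmetic, assuming that convention; the helper name
rel_change_pct() is hypothetical and is not part of hackbench or the kernel.

```c
/*
 * Hypothetical helper, for illustration only: relative change of a
 * hackbench time against a baseline time, in percent.
 *
 * Assumed convention: lower time is better, so a positive result is
 * an improvement and a negative result is a regression.
 */
static double rel_change_pct(double baseline, double value)
{
	return (baseline - value) / baseline * 100.0;
}

/*
 * Example with the rb5 figures quoted above:
 *   rel_change_pct(0.748, 1.915) is about -156 (patchset, 2.8ms slice)
 *   rel_change_pct(0.748, 0.689) is about   +8 (patchset, 16ms slice)
 */
```

With the same convention, the 1.907 result measured with the fix applied is
still about -155% against the 0.748 baseline.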