Re: [PATCH] /sched/core: Fix Unixbench spawn test regression

From: Hagar Hemdan
Date: Thu Mar 13 2025 - 05:21:22 EST


On Wed, Mar 12, 2025 at 03:41:40PM +0100, Dietmar Eggemann wrote:
> On 11/03/2025 17:35, Vincent Guittot wrote:
> > On Mon, 10 Mar 2025 at 16:29, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
> >>
> >> On 10/03/2025 14:59, Vincent Guittot wrote:
> >>> On Thu, 6 Mar 2025 at 17:26, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
> >>>>
> >>>> Hagar reported a 30% drop in UnixBench spawn test with commit
> >>>> eff6c8ce8d4d ("sched/core: Reduce cost of sched_move_task when config
> >>>> autogroup") on a m6g.xlarge AWS EC2 instance with 4 vCPUs and 16 GiB RAM
> >>>> (aarch64) (single level MC sched domain) [1].
> >>>>
> >>>> There is an early bail from sched_move_task() if p->sched_task_group is
> >>>> equal to p's 'cpu cgroup' (sched_get_task_group()). E.g. both are
> >>>> pointing to taskgroup '/user.slice/user-1000.slice/session-1.scope'
> >>>> (Ubuntu '22.04.5 LTS').
> >>>
> >>> Isn't this the same use case that commit eff6c8ce8d4d used to
> >>> show the benefit of adding the test
> >>> if (group == tsk->sched_task_group)?
> >>> Adding Wuchi who added the condition
> >>
> >> IMHO, UnixBench spawn reports a performance number according to how many
> >> tasks could be spawned whereas, IIUC, commit eff6c8ce8d4d was reporting
> >> the time spent in sched_move_task().
> >
> > But doesn't your patch revert the benefits shown in the figures of
> > commit eff6c8ce8d4d? It skipped sched_move_task() in the do_exit()
> > autogroup path and you add it back.
>
> Yeah, we do need the PELT update in sched_change_group()
> (task_change_group_fair()) in the do_exit() path to get the 30% score
> back in 'UnixBench spawn', even if that means spending more time in
> sched_move_task().
>
> I retested this and it turns out that 'group == tsk->sched_task_group'
> is only true when sched_move_task() is called from exit.
>
> So to get the score back for 'UnixBench spawn' we should rather revert
> commit eff6c8ce8d4d.
>
> The analysis in my patch still holds though.
>
> If you guys agree I can send the revert with my analysis in the
> patch-header.

Agreed. The follow-up commit fa614b4feb5a ("sched: Simplify sched_move_task()")
needs to be reverted as well.
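
For anyone following the thread, here is a minimal user-space sketch of the early bail under discussion. It is a toy model and not the kernel code: the struct names, the `cpu_cgroup` field, the `group_changes` counter and `sched_move_task_model()` are all hypothetical stand-ins, and the counter only marks where sched_change_group() and its PELT update would run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct task_group_model {
	const char *path;
};

struct task_model {
	/* Group the scheduler currently accounts the task to. */
	struct task_group_model *sched_task_group;
	/* Assumption: what sched_get_task_group() would return for this task. */
	struct task_group_model *cpu_cgroup;
	/* How often the group-change step (with its PELT update) ran. */
	unsigned int group_changes;
};

/*
 * Toy model of sched_move_task(). With early_bail set, the function
 * returns before the group-change step when the task is already in the
 * target group, which is the check commit eff6c8ce8d4d added.
 * Returns true if the group-change step ran.
 */
static bool sched_move_task_model(struct task_model *p, bool early_bail)
{
	struct task_group_model *group;

	assert(p != NULL);

	group = p->cpu_cgroup;
	if (early_bail && group == p->sched_task_group)
		return false;	/* group-change step and PELT update skipped */

	p->sched_task_group = group;
	p->group_changes++;	/* stands in for sched_change_group() */
	return true;
}
```

With both pointers referring to the same group (the exit case described above), calling the model with `early_bail` set leaves `group_changes` at 0, while calling it without the bail increments the counter. That is the difference a revert would restore: more work per call, but the update is no longer skipped.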