Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

From: Vincent Guittot
Date: Thu Jan 25 2024 - 12:40:54 EST


On Wed, 24 Jan 2024 at 23:30, Qais Yousef <qyousef@xxxxxxxxxxx> wrote:
>
> On 01/23/24 09:26, Vincent Guittot wrote:
> > On Fri, 5 Jan 2024 at 23:20, Qais Yousef <qyousef@xxxxxxxxxxx> wrote:
> > >
> > > From: Qais Yousef <qais.yousef@xxxxxxx>
> > >
> > > If a misfit task is affined to a subset of the possible cpus, we need to
> > > verify that one of these cpus can fit it. Otherwise the load balancer
> > > code will continuously trigger needlessly leading the balance_interval
> > > to increase in return and eventually end up with a situation where real
> > > imbalances take a long time to address because of this impossible
> > > imbalance situation.
> >
> > If your problem is about increasing balance_interval, it would be
> > better to not increase the interval in such a case.
> > I mean that we are able to detect misfit_task conditions for the
> > periodic load balance so we should be able to not increase the
> > interval in such cases.
> >
> > If I'm not wrong, your problem only happens when the system is
> > overutilized and we have disabled EAS
>
> Yes and no. There are two concerns here:
>
> 1.
>
> So this patch is a generalized form of 0ae78eec8aa6 ("sched/eas: Don't update
> misfit status if the task is pinned") which is when I originally noticed the
> problem and this patch was written alongside it.
>
> We have unlinked misfit from overutilized since then.
>
> And to be honest I am not sure if flattening of topology matters too since
> I first noticed this, which was on Juno which doesn't have flat topology.
>
> FWIW I can still reproduce this, but I have a different setup now. On an M1
> Mac mini, if I spawn a busy task affined to the littles and then expand the
> mask to include a single big core, I see big delays (>500ms) without the
> patch. With the patch it moves in a few ms. The delay without the patch is too
> large and I can't explain it. So the worry here is that, in general, misfit
> migration is not happening fast enough due to these fake misfit cases.

I tried a similar scenario on RB5 but I don't see any difference with
your patch. And that could be me not testing it correctly...

I set the affinity of an always-running task to cpu[0-3] for a few
seconds then extend it to [0-3,7] and the time to migrate is almost
the same.
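
A minimal sketch of this kind of test, for anyone who wants to repeat it.
This is not the exact tool used for the numbers above. The function names
(mask_range, now_ms, measure_upmigration_ms) and the settle/timeout
parameters are hypothetical; the CPU numbers passed in are whatever matches
the platform (0-3 and 7 in the scenario above). It pins the calling thread to
the little CPUs, spins for a while, widens the mask to include one big CPU,
and keeps spinning until sched_getcpu() reports that CPU.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Build a mask containing CPUs first..last (inclusive). */
void mask_range(cpu_set_t *set, int first, int last)
{
	CPU_ZERO(set);
	for (int cpu = first; cpu <= last; cpu++)
		CPU_SET(cpu, set);
}

/* Monotonic timestamp in milliseconds. */
double now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1e3 + ts.tv_nsec / 1e6;
}

/*
 * Pin the caller to [little_first, little_last], busy-loop for settle_ms,
 * then add big_cpu to the mask and busy-loop until the thread is seen
 * running on big_cpu. Returns the observed delay in ms, or -1.0 on
 * error or if timeout_ms expires first.
 */
double measure_upmigration_ms(int little_first, int little_last,
			      int big_cpu, double settle_ms,
			      double timeout_ms)
{
	cpu_set_t set;
	double t0;

	mask_range(&set, little_first, little_last);
	if (sched_setaffinity(0, sizeof(set), &set))
		return -1.0;

	/* Stay always-running on the littles for a while. */
	t0 = now_ms();
	while (now_ms() - t0 < settle_ms)
		;

	/* Extend the affinity, e.g. from [0-3] to [0-3,7]. */
	CPU_SET(big_cpu, &set);
	if (sched_setaffinity(0, sizeof(set), &set))
		return -1.0;

	t0 = now_ms();
	while (sched_getcpu() != big_cpu) {
		if (now_ms() - t0 > timeout_ms)
			return -1.0;
	}
	return now_ms() - t0;
}
```

Calling measure_upmigration_ms(0, 3, 7, 3000.0, 10000.0) from main() and
printing the result gives one sample; the value is only as precise as the
polling loop and says nothing about which balance path did the pull.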

I'm using tip/sched/core + [0]

[0] https://lore.kernel.org/all/20240108134843.429769-1-vincent.guittot@xxxxxxxxxx/


>
> I did hit issues where with this patch I saw big delays sometimes. I have no
> clue why this happens. So there are potentially more problems to chase.
>
> My expectation is that newidle balance should be able to pull misfit
> regardless of balance_interval. So the system has to be really busy or really
> quiet to
> notice delays. I think prior to flat topology this pull was not guaranteed, but
> with flat topology it should happen.
>
> On this system if I expand the mask to all cpus (instead of littles + single
> big), the issue is not as easy to reproduce, but I captured 35+ms delays
> - which is long if this task was carrying important work and needs to
> upmigrate. I thought newidle balance is more likely to pull it sooner, but I am
> not 100% sure.
>
> It's a 6.6 kernel I am testing with.
>
> 2.
>
> Here, yes, the concern is that when we are overutilized and load balance is
> required, this unnecessarily long delay can cause potential problems.
>
>
> Cheers
>
> --
> Qais Yousef