Re: [PATCH 2/3] sched/fair: Generalize misfit lb by adding a misfit reason

From: Xuewen Yan
Date: Mon Jul 29 2024 - 06:47:36 EST


Hi Qais

On Thu, Jul 25, 2024 at 5:35 AM Qais Yousef <qyousef@xxxxxxxxxxx> wrote:
>
> Hi Xuewen
>
> On 07/17/24 16:26, Xuewen Yan wrote:
> > Hi Qais
> >
> > On Sat, Dec 9, 2023 at 9:19 AM Qais Yousef <qyousef@xxxxxxxxxxx> wrote:
>
> > > @@ -11008,6 +11025,7 @@ static struct rq *find_busiest_queue(struct lb_env *env,
> > > * average load.
> > > */
> > > if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
> > > + rq->misfit_reason == MISFIT_PERF &&
> >
> > In Android, I found this would cause a task loop to change the CPUs.
> > Maybe this should be removed. Because for the same capacity cpus, we
> > should skip this cpu when nr_running=1.
>
> Could you explain a bit more? Are you saying this is changing the behavior for
> some use case? The check will ensure this path is only triggered for misfit
> upmigration. Which AFAICT the only reason why this path was added.
>
> The problem is that to implement another misfit reason, the check for
> capacity_greater() is not true except for MISFIT_PERF. For MISFIT_POWER, we
> want the CPU to be smaller.

Sorry, it was my mistake.
After debugging, I found that there was a problem with my handling of
MISFIT_PERF.
However, because rt and irq load affect the capacity of each CPU,
capacity_greater() can still give confusing results.
The small cores can end up with slightly different capacities, so a
misfit task sometimes migrates several times between small cores.
For example:
If capacity_cpu3 > capacity_cpu2 > capacity_cpu1 > capacity_cpu0,
the misfit task may migrate as follows: cpu0->cpu1->cpu2->cpu3.
I don't know whether these migrations are really necessary, but they do
confuse me.
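
To make the chain above concrete, here is a minimal userspace sketch. The
macro mirrors the ~5% margin of capacity_greater() in kernel/sched/fair.c;
count_misfit_hops() and the capacity values are hypothetical and only
model the comparison, not the rest of the load balancer.

```c
#include <stddef.h>

/* Same ~5% margin as capacity_greater() in kernel/sched/fair.c. */
#define capacity_greater(cap1, cap2) ((cap1) * 1024 > (cap2) * 1078)

/*
 * Hypothetical helper, not a kernel function: start on CPU 0 and hop to
 * each later CPU whose capacity is "noticeably greater" than that of the
 * CPU the task currently sits on. Returns the number of hops.
 *
 * cap[] holds made-up capacities that rt/irq load has already reduced.
 */
static int count_misfit_hops(const unsigned long *cap, size_t nr_cpus)
{
	size_t cur = 0;
	int hops = 0;

	for (size_t next = 1; next < nr_cpus; next++) {
		if (capacity_greater(cap[next], cap[cur])) {
			cur = next;
			hops++;
		}
	}

	return hops;
}
```

With assumed capacities of 300, 320, 340 and 360 for cpu0..cpu3, every
step clears the margin and the task hops three times
(cpu0->cpu1->cpu2->cpu3). With 300, 305, 310 and 314 no step clears it
and the task stays on cpu0.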

Thanks!

>
> I think Vincent is working on a better way to handle all of this now.
>
> >
> > > !capacity_greater(capacity_of(env->dst_cpu), capacity) &&
> > > nr_running == 1)
> > > continue;