Re: [RFC PATCH] sched/fair: Introduce SIS_PAIR to wakeup task on local idle core first

From: Chen Yu
Date: Mon May 22 2023 - 00:11:33 EST


On 2023-05-18 at 15:56:12 +0530, K Prateek Nayak wrote:
[snip]
> >>
> >> Also wondering if asym_fits_cpu() check is needed in some way here.
> >> Consider a case where waker is on a weaker capacity CPU but wakee
> >> previously ran on a stronger capacity CPU. It might be worthwhile
> >> to wake the wakee on previous CPU if the current CPU does not fit
> >> the task's utilization and move the pair to the CPU with larger
> >> capacity during the next wakeup. wake_affine_weight() would select
> >> a target based on load and capacity consideration but here we
> >> switch the wakeup target to a thread on the current core.
> >>
> >> Wondering if the capacity details are already considered in the path?
> >>
> > Good point. I guess what you mean is that the target could be a CPU other than
> > the current one, so there should be a check that the target equals the current CPU.
>
> Yup. That should handle the asymmetric capacity condition too but
> wondering if it makes the case too narrow to see the same benefit.
>
> Can you perhaps try "cpus_share_cache(target, smp_processor_id())"
> instead of a "target == smp_processor_id()"? Since we use similar
> logic to test if p->recent_used_cpu is a good target or not?
>
> This will be equivalent to your current implementation for a single
> socket with one LLC, and for the dual socket or multiple LLC case
> we can be sure "has_idle_core" indicates the status of the MC domain
> that is shared by both target and current CPU.
>
Right, this way we avoid the case where target and the current CPU
are in different LLCs and has_idle_core does not describe the LLC of
the current CPU. asym_fits_cpu() might also be needed to check whether
the task fits.
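
To make sure we mean the same thing, here is a rough user-space sketch of
the combined check. It is not the patch itself: the toy topology tables and
the names model_cpus_share_cache(), model_task_fits() and
may_redirect_to_local_core() are made up for illustration, and the fit test
only mimics the roughly 20% headroom idea while ignoring uclamp and the
"asymmetric capacity active" condition that the real asym_fits_cpu() honors.

```c
#include <stdbool.h>

/* Hypothetical toy machine: 8 CPUs, two LLCs of 4 CPUs each. */
#define TOY_NR_CPUS 8

/* LLC id per CPU (assumption, for illustration only). */
static const int toy_llc_id[TOY_NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Capacity per CPU: CPU0-1 are "little" (512), the rest are "big" (1024). */
static const unsigned long toy_capacity[TOY_NR_CPUS] = {
	512, 512, 1024, 1024, 1024, 1024, 1024, 1024
};

/* Stand-in for cpus_share_cache(): both CPUs sit under the same LLC. */
static bool model_cpus_share_cache(int a, int b)
{
	return toy_llc_id[a] == toy_llc_id[b];
}

/*
 * Simplified stand-in for asym_fits_cpu(): the task utilization must
 * leave about 20% headroom on the CPU.
 */
static bool model_task_fits(unsigned long util, int cpu)
{
	return util * 1280 < toy_capacity[cpu] * 1024;
}

/*
 * Only redirect the wakee to the waker's core when:
 *  - target and this_cpu share an LLC, so the idle-core hint computed
 *    for target also describes this_cpu's LLC, and
 *  - the wakee's utilization fits on this_cpu.
 */
static bool may_redirect_to_local_core(int target, int this_cpu,
				       unsigned long task_util)
{
	if (!model_cpus_share_cache(target, this_cpu))
		return false;

	return model_task_fits(task_util, this_cpu);
}
```

With this toy layout, a small task whose target is CPU2 and whose waker
runs on CPU0 can be redirected, a task with utilization 500 cannot (it does
not fit on the little CPU0), and a target in the other LLC is never
redirected. target == this_cpu remains a special case of the shared-cache
test.
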
> > Let me refine the patch and have a test.
> >
>
> I'll hold off queuing a full test run until then.
>
Thank you. I'm also thinking of removing the last_wakee check,
so that fewer heuristics are involved. I'll do some investigation.

Meanwhile, I looked back at Yicong's proposal to wake up the task on
the local cluster first. It did show some improvement on Jacobsville,
and I guess that could also be a chance to reduce C2C (cache-to-cache)
latency.

thanks,
Chenyu