Re: [PATCH v2 6/8] sched/idle: Move busy_cpu accounting to idle callback

From: Srikar Dronamraju
Date: Tue May 18 2021 - 00:00:52 EST


* Aubrey Li <aubrey.li@xxxxxxxxxxxxxxx> [2021-05-18 08:59:00]:

> On 5/17/21 8:57 PM, Srikar Dronamraju wrote:
> > * Aubrey Li <aubrey.li@xxxxxxxxxxxxxxx> [2021-05-17 20:48:46]:
> >
> >> On 5/17/21 6:40 PM, Srikar Dronamraju wrote:
> >>> * Aubrey Li <aubrey.li@xxxxxxxxxxxxxxx> [2021-05-14 12:11:50]:
> >>>
> >>>> On 5/13/21 3:31 PM, Srikar Dronamraju wrote:
> >>>>> * Aubrey Li <aubrey.li@xxxxxxxxxxxxxxx> [2021-05-12 16:08:24]:
> >>>>>> On 5/7/21 12:45 AM, Srikar Dronamraju wrote:
> >>>
> >>> <snip>
> >>>
> >>>>>> Also, for frequently context-switching tasks with very short idle
> >>>>>> periods, it is expensive for the scheduler to mark idle/busy every
> >>>>>> time. That's why my patch marks idle every time but rate-limits
> >>>>>> marking busy to the scheduler tick.
> >>>>>>
> >>>>>
> >>>>> I have tried a few tasks with very short idle times, and updating
> >>>>> nr_busy every time doesn't seem to have an impact. In fact, it seems
> >>>>> to help in picking the idler LLC more often.
> >>>>>
> >>>>
> >>>> How many CPUs in your LLC?
> >>>
> >>> I have tried with X86, 48 CPUs, 2 nodes, each having 24 CPUs in LLC
> >>> +
> >>> POWER10, Multiple CPUs with 4 CPUs in LLC
> >>> +
> >>> POWER9, Multiple CPUs with 8 CPUs in LLC
> >>>
> >>>>
> >>>> This is a system with 192 CPUs, 4 nodes and each node has 48 CPUs in LLC
> >>>> domain.
> >>>>
> >>>
> >>> Okay,
> >>>
> >>>> It looks like for netperf, both the TCP and UDP cases show a notable
> >>>> change under 2x overcommit, though it may not be interesting.
> >>>>
> >>>>
> >>>
> >>> I believe the extra load on this 24 core LLC could be because we may end up
> >>> trying to set the idle-core, even when there is no idle core available.
> >>>
> >>> If possible, can you please try v3 with the call to
> >>> set_next_idle_core commented out?
> >>>
> >>>
> >>
> >> v3 does not seem to apply on tip/sched/core 915a2bc3c6b7?
> >
> > I had applied it on top of 2ea46c6fc9452ac100ad907b051d797225847e33,
> > which was tag: sched-core-2021-04-28
> >
> > The only conflict you get on today's tip is with Gautham's one-line patch.
> > Gautham's patch replaced 'this' with 'target'.
> >
> > The 2nd patch does away with that line.
> >
>
> This is v3. It looks like hackbench gets better, and netperf still has
> some notable changes in the 2x overcommit cases.
>

Thanks, Aubrey, for the results. The netperf (2X) case does seem to regress.
I was actually expecting the results to get better with overcommit.
Can you confirm whether this was plain v3, or v3 with set_next_idle_core
disabled?

>
> hackbench (48 tasks per group)
> =========
> case             load        baseline(std%)  compare%( std%)
> process-pipe     group-1      1.00 ( 4.51)    +1.36 ( 4.26)
> process-pipe     group-2      1.00 (18.73)    -9.66 (31.15)
> process-pipe     group-3      1.00 (23.67)    +8.52 (21.13)
> process-pipe     group-4      1.00 (14.65)   +17.12 (25.23)
> process-pipe     group-8      1.00 ( 3.11)   +16.41 ( 5.94)
> process-sockets  group-1      1.00 ( 8.83)    +1.53 (11.93)
> process-sockets  group-2      1.00 ( 5.32)   -15.43 ( 7.17)
> process-sockets  group-3      1.00 ( 4.79)    -4.14 ( 1.90)
> process-sockets  group-4      1.00 ( 2.39)    +4.37 ( 1.31)
> process-sockets  group-8      1.00 ( 0.38)    +4.41 ( 0.05)
> threads-pipe     group-1      1.00 ( 3.06)    -1.57 ( 3.71)
> threads-pipe     group-2      1.00 (17.41)    -2.16 (15.29)
> threads-pipe     group-3      1.00 (17.94)   +19.86 (13.24)
> threads-pipe     group-4      1.00 (15.38)    +3.71 (11.97)
> threads-pipe     group-8      1.00 ( 2.72)   +13.40 ( 8.43)
> threads-sockets  group-1      1.00 ( 8.51)    -2.73 (17.48)
> threads-sockets  group-2      1.00 ( 5.44)   -12.04 ( 5.91)
> threads-sockets  group-3      1.00 ( 4.38)    -5.00 ( 1.48)
> threads-sockets  group-4      1.00 ( 1.08)    +4.46 ( 1.15)
> threads-sockets  group-8      1.00 ( 0.61)    +5.12 ( 0.20)
>
> netperf
> =======
> case             load        baseline(std%)  compare%( std%)
> TCP_RR           thread-48    1.00 ( 3.79)    +4.69 ( 4.03)
> TCP_RR           thread-96    1.00 ( 4.98)    -6.74 ( 3.59)
> TCP_RR           thread-144   1.00 ( 6.04)    -2.36 ( 3.57)
> TCP_RR           thread-192   1.00 ( 4.97)    -0.44 ( 4.89)
> TCP_RR           thread-384   1.00 (19.87)   -19.12 (28.99)
> UDP_RR           thread-48    1.00 (12.54)    -2.73 ( 1.59)
> UDP_RR           thread-96    1.00 ( 6.51)    -6.66 (10.42)
> UDP_RR           thread-144   1.00 (45.41)    -3.81 (31.37)
> UDP_RR           thread-192   1.00 (32.06)    +3.07 (71.89)
> UDP_RR           thread-384   1.00 (29.57)   -21.52 (35.50)
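
When reading the compare% column against the std% columns, a rough noise
check might look like the sketch below. It is only an illustration: the
helper names (exceeds_noise, notable_rows) and the rule of comparing
|compare%| against the larger of the two std% values are assumptions, not
something used in this thread. The rows are copied from the netperf table.

```python
# Rows copied from the netperf table in this mail:
# (case, load, baseline std%, compare%, compare std%)
ROWS = [
    ("TCP_RR", "thread-48", 3.79, +4.69, 4.03),
    ("TCP_RR", "thread-96", 4.98, -6.74, 3.59),
    ("TCP_RR", "thread-144", 6.04, -2.36, 3.57),
    ("TCP_RR", "thread-192", 4.97, -0.44, 4.89),
    ("TCP_RR", "thread-384", 19.87, -19.12, 28.99),
    ("UDP_RR", "thread-48", 12.54, -2.73, 1.59),
    ("UDP_RR", "thread-96", 6.51, -6.66, 10.42),
    ("UDP_RR", "thread-144", 45.41, -3.81, 31.37),
    ("UDP_RR", "thread-192", 32.06, +3.07, 71.89),
    ("UDP_RR", "thread-384", 29.57, -21.52, 35.50),
]


def exceeds_noise(base_std, change, cmp_std):
    """Assumed heuristic (hypothetical, not from the thread): treat a change
    as notable only if its magnitude is larger than the larger of the two
    reported std% values."""
    return abs(change) > max(base_std, cmp_std)


def notable_rows(rows):
    """Return (case, load, compare%) for rows that pass the heuristic."""
    return [
        (case, load, change)
        for case, load, base_std, change, cmp_std in rows
        if exceeds_noise(base_std, change, cmp_std)
    ]


if __name__ == "__main__":
    for case, load, change in notable_rows(ROWS):
        print(f"{case:8s} {load:12s} {change:+.2f}%")
```

A stricter or looser threshold (for example, the sum of both std% values)
would change which rows are flagged, so this is a reading aid rather than a
significance test.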

--
Thanks and Regards
Srikar Dronamraju