Re: [PATCH v2][RFC] sched/fair: Change SIS_PROP to search idle CPU based on sum of util_avg

From: K Prateek Nayak
Date: Thu Mar 17 2022 - 06:39:57 EST


Hello Chenyu,

Thank you for looking into the results.

On 3/16/2022 5:24 PM, Chen Yu wrote:
> [..snip..]
> Just wondering what the kernel version was when you tested v1?
> https://lore.kernel.org/lkml/4ca9ba48-20f0-84d5-6a38-11f9d4c7a028@xxxxxxx/
> It seems that there is a slight performance difference between the old baseline
> and the current 5.17-rc5 tip sched/core.
I'll make a point of including the HEAD commit from now on to
remove this ambiguity.

- While testing v1, the sched-tip was at:
  commit: 3624ba7b5e2a ("sched/numa-balancing: Move some document to make it consistent with the code")

- While testing v2, the sched-tip was at:
  commit: a0a7e453b502 ("sched/preempt: Tell about PREEMPT_DYNAMIC on kernel headers")
>> [..snip..]
>>
>> ~~~~~~~~
>> schbench
>> ~~~~~~~~
>>
>> NPS 1
>>
>> #workers:        sched-tip               v2_sis_prop
>>   1:      13.00 (0.00 pct)        14.50 (-11.53 pct)
>>   2:      31.50 (0.00 pct)        35.00 (-11.11 pct)
> It seems that in the old result:
> NPS Mode - NPS1
> #workers:        sched-tip               util-avg
>   1:      13.00 (0.00 pct)        14.50 (-11.53 pct)
>   2:      31.50 (0.00 pct)        34.00 (-7.93 pct)
> we still saw some regressions, although in the v1 patch
> there is no logic change when the utilization is below 85%.
> I think this might be run-to-run deviation when the load is low.
> For example, in the v2 test of schbench, 3 test cycles were
> launched:
> case      load               baseline(std%)    compare%( std%)
> normal    1 mthread group    1.00 ( 17.92)     +19.23 ( 23.67)
> The standard deviation ratio is 23%, which seems to be relatively
> large. But considering that the v2 patch has changed how aggressively
> we search for an idle CPU, even at low utilization, this result
> needs to be evaluated.
We too see a lot of variation for schbench. For the two-worker case,
the following is the data from 10 runs in NPS1 mode:

- sched-tip data

    Min           : 23.00
    Max           : 40.00
    Median        : 31.50
    AMean         : 30.50
    GMean         : 29.87
    HMean         : 29.25
    AMean Stddev  : 6.55
    GMean Stddev  : 6.59
    HMean Stddev  : 6.68
    AMean CoefVar : 21.49 pct
    GMean CoefVar : 22.05 pct
    HMean CoefVar : 22.85 pct

- v2_sis_prop data

    Min           : 22.00
    Max           : 41.00
    Median        : 35.00
    AMean         : 33.50
    GMean         : 32.84
    HMean         : 32.13
    AMean Stddev  : 6.64
    GMean Stddev  : 6.67
    HMean Stddev  : 6.79
    AMean CoefVar : 19.81 pct
    GMean CoefVar : 20.32 pct
    HMean CoefVar : 21.14 pct

The median of the data was reported previously.
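
For anyone who wants to reproduce the summary fields and the percentage
columns from their own runs, here is a minimal Python sketch. The function
names and the sample list in the usage note are hypothetical. The use of the
sample standard deviation is an assumption, and only the arithmetic-mean
stddev and CoefVar are computed. The sign convention (negative means
regression) follows the tables in this thread.

```python
import statistics


def summarize(samples):
    """Summary statistics for one benchmark configuration."""
    amean = statistics.mean(samples)
    # Assumption: sample (n - 1) standard deviation.
    stddev = statistics.stdev(samples)
    return {
        "min": min(samples),
        "max": max(samples),
        "median": statistics.median(samples),
        "amean": amean,
        "gmean": statistics.geometric_mean(samples),
        "hmean": statistics.harmonic_mean(samples),
        "amean_stddev": stddev,
        # Coefficient of variation: stddev as a percentage of the mean.
        "amean_coefvar_pct": 100.0 * stddev / amean,
    }


def pct_change(baseline, compare, lower_is_better):
    """Percent change versus baseline; negative always means regression."""
    delta = 100.0 * (compare - baseline) / baseline
    # For latency-style metrics (schbench) a larger value is worse,
    # so the sign is flipped to keep "negative == regression".
    return -delta if lower_is_better else delta
```

With the medians reported above, `pct_change(31.50, 35.00, lower_is_better=True)`
gives about -11.11, and for a throughput metric such as tbench,
`pct_change(477.85, 470.68, lower_is_better=False)` gives about -1.50.
`summarize()` takes the raw per-run values, for example a made-up list
such as `[20.0, 30.0, 40.0]`.
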
> [..snip..]
>> ~~~~~~
>> tbench
>> ~~~~~~
>>
>> NPS 1
>>
>> Clients:          sched-tip              v2_sis_prop
>>     1    477.85 (0.00 pct)       470.68 (-1.50 pct)
>>     2    924.07 (0.00 pct)       910.82 (-1.43 pct)
>>     4    1778.95 (0.00 pct)      1743.64 (-1.98 pct)
>>     8    3244.81 (0.00 pct)      3200.35 (-1.37 pct)
>>    16    5837.06 (0.00 pct)      5808.36 (-0.49 pct)
>>    32    9339.33 (0.00 pct)      8648.03 (-7.40 pct)
>>    64    14761.19 (0.00 pct)     15803.13 (7.05 pct)
>>   128    27806.11 (0.00 pct)     27510.69 (-1.06 pct)
>>   256    35262.03 (0.00 pct)     34135.78 (-3.19 pct)
> The result from v1 patch:
> NPS Mode - NPS1
> Clients:          sched-tip              util-avg
>   256    26726.29 (0.00 pct)     52502.83 (96.44 pct)
>>   512    52459.78 (0.00 pct)     51630.53 (-1.58 pct)
>>  1024    52480.67 (0.00 pct)     52439.37 (-0.07 pct)
>>
>> NPS 2
>>
>> Clients:          sched-tip              v2_sis_prop
>>     1    478.98 (0.00 pct)       472.98 (-1.25 pct)
>>     2    930.52 (0.00 pct)       914.48 (-1.72 pct)
>>     4    1743.26 (0.00 pct)      1711.16 (-1.84 pct)
>>     8    3297.07 (0.00 pct)      3161.12 (-4.12 pct)
>>    16    5779.10 (0.00 pct)      5738.38 (-0.70 pct)
>>    32    10708.42 (0.00 pct)     10748.26 (0.37 pct)
>>    64    16965.21 (0.00 pct)     16894.42 (-0.41 pct)
>>   128    29152.49 (0.00 pct)     28287.31 (-2.96 pct)
>>   256    27408.75 (0.00 pct)     33680.59 (22.88 pct)
> The result from v1 patch:
> 256 27654.49 (0.00 pct) 47126.18 (70.41 pct)
>>   512    51453.64 (0.00 pct)     47546.87 (-7.59 pct)
>>  1024    52156.85 (0.00 pct)     51233.28 (-1.77 pct)
>>
>> NPS 4
>>
>> Clients:          sched-tip              v2_sis_prop
>>     1    480.29 (0.00 pct)       473.75 (-1.36 pct)
>>     2    940.23 (0.00 pct)       915.60 (-2.61 pct)
>>     4    1760.21 (0.00 pct)      1687.99 (-4.10 pct)
>>     8    3269.75 (0.00 pct)      3154.02 (-3.53 pct)
>>    16    5503.71 (0.00 pct)      5485.01 (-0.33 pct)
>>    32    10633.93 (0.00 pct)     10276.21 (-3.36 pct)
>>    64    16304.44 (0.00 pct)     15351.17 (-5.84 pct)
>>   128    26893.95 (0.00 pct)     25337.08 (-5.78 pct)
>>   256    24469.94 (0.00 pct)     32178.33 (31.50 pct)
> The result from v1 patch:
> 256 25997.38 (0.00 pct) 47735.83 (83.61 pct)
>
> In the above three cases, v2 shows a smaller improvement than
> v1. In both v1 and v2, the improvement mainly comes from choosing
> the previous running CPU as the target when the system is busy. But
> v2 is more likely to choose the previous CPU than v1, because its
> threshold of 50% is lower than the 85% from v1. Then why does v2 show
> less improvement than v1? It seems that the v2 patch only changes the
> logic of SIS_PROP for single idle CPU searching, and does not touch the
> idle core searching. Meanwhile, v1 limits both idle CPU and idle core
> searching, and this might explain the extra benefit from the v1 patch IMO.
Yes, this might be the case.
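
To make the hypothesis concrete, here is a toy Python model of the behavior
as it is described in this thread. It is not the kernel code and not either
patch. The function name and all parameters are hypothetical, and a single
hard threshold is a simplification. The 85% (v1) and 50% (v2) values come
from the discussion quoted above.

```python
def scan_is_limited(util_pct, threshold_pct, has_idle_core, limits_core_search):
    """Toy model: True if the idle search is cut short and the wakeup
    falls back to the previous CPU."""
    if util_pct < threshold_pct:
        # Below the threshold nothing changes versus the baseline.
        return False
    if has_idle_core and not limits_core_search:
        # Described v2 behavior: the idle core search is left untouched.
        return False
    return True


# v1 as described: 85% threshold, limits idle CPU and idle core search.
v1_limited = scan_is_limited(90, 85, has_idle_core=True, limits_core_search=True)
# v2 as described: 50% threshold, limits only the idle CPU search.
v2_limited = scan_is_limited(90, 50, has_idle_core=True, limits_core_search=False)
```

In this simplified model, a busy system that still reports an idle core
keeps doing the full idle core search under v2 but not under v1, which is
one way the extra benefit of v1 could arise.
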
>> [..snip..]
>> ~~~~~~~~~~~~
>> ycsb-mongodb
>> ~~~~~~~~~~~~
>>
>> NPS1
>>
>> sched-tip:      304934.67 (var: 0.88)
>> v2_sis_prop:    301560.0  (var: 2.0)    (-1.1%)
>>
>> NPS2
>>
>> sched-tip:      303757.0 (var: 1.0)
>> v2_sis_prop:    302283.0 (var: 0.58)    (-0.48%)
>>
>> NPS4
>>
>> sched-tip:      308106.67 (var: 2.88)
>> v2_sis_prop:    302302.67 (var: 1.12)   (-1.88%)
>>
> May I know the average CPU utilization of this benchmark?
I don't have this data at hand. I'll get back to you soon with the data.
> [..snip..]
> I see. But we might have to make this per-LLC search generic, for both
> smaller and bigger LLC sizes. The exponential descent function currently
> used could increase the number of CPUs to be searched when the system is
> not busy. I'll think about it and do some investigation.
It would indeed be great to have this work well for all LLC sizes.
Thank you for looking into it :)
--
Thanks and Regards,
Prateek