Re: [PATCH v4 0/7] sched/fair: improve scan efficiency of SIS

From: Abel Wu
Date: Mon Aug 15 2022 - 09:59:52 EST


Hi K Prateek, thanks for testing, and sorry for the late reply.

On 7/18/22 7:00 PM, K Prateek Nayak wrote:
Hello Abel,

We've tested the patch on a dual socket Zen3 System (2 x 64C/128T).

tl;dr

- There is a noticeable regression for Hackbench with the system
configured in NPS4 mode. This regression is more noticeable
with SIS_UTIL enabled and not as severe with SIS_PROP.
This regression is surprising given the patch should have
improved SIS Efficiency in case of fully loaded system and is
consistently reproducible across multiple runs and reboots.

The regression is unexpected; I will try to reproduce it on my
Intel server. While staring at the code, I found two things that
may be related to the issue:

- The cpumask_and() in select_idle_cpu() runs before the SIS_UTIL
check, which can bail out early. So with the SIS filter enabled,
a lot of effort is wasted when nr_idle_scan == 0 (e.g. with 16
groups). The SIS_PROP case is different: the work done by the
filter is not all in vain, which is probably why the regression
is more noticeable under SIS_UTIL. I am working on a patch to
optimize this.

- If nr_idle_scan == 0, select_idle_cpu() bails out early, so
updating the SIS filter is pointless and only adds overhead on
top of the issue above. This will be fixed in the next version.
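
To make the ordering issue in the two points above concrete, here is a
rough user-space sketch. It is a toy model, not kernel code: a 64-bit
word stands in for a cpumask, and every name in it (toy_mask_and,
toy_scan, select_mask_first, select_check_first,
toy_should_update_filter, struct scan_stats) is made up for
illustration. It only counts how often the mask intersection runs when
the nr_idle_scan check comes after it versus before it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, NOT kernel code: a 64-bit word stands in for a cpumask. */
struct scan_stats {
	unsigned int mask_ops;     /* mask intersections performed */
	unsigned int cpus_scanned; /* candidate CPUs examined */
};

/* Hypothetical stand-in for cpumask_and(); counts each invocation. */
static uint64_t toy_mask_and(uint64_t a, uint64_t b, struct scan_stats *st)
{
	st->mask_ops++;
	return a & b;
}

/* Examine at most nr candidate CPUs; return the first idle one or -1. */
static int toy_scan(uint64_t candidates, uint64_t idle, unsigned int nr,
		    struct scan_stats *st)
{
	assert(st);
	for (int cpu = 0; cpu < 64 && nr > 0; cpu++) {
		if (!(candidates & (UINT64_C(1) << cpu)))
			continue;
		st->cpus_scanned++;
		nr--;
		if (idle & (UINT64_C(1) << cpu))
			return cpu;
	}
	return -1;
}

/* Ordering from the first point: mask first, bail-out check second. */
static int select_mask_first(uint64_t llc, uint64_t allowed, uint64_t idle,
			     unsigned int nr_idle_scan, struct scan_stats *st)
{
	uint64_t cpus = toy_mask_and(llc, allowed, st);

	if (nr_idle_scan == 0)
		return -1;	/* the mask work above was wasted */
	return toy_scan(cpus, idle, nr_idle_scan, st);
}

/* Reordered variant: check first, so a bail-out costs no mask work. */
static int select_check_first(uint64_t llc, uint64_t allowed, uint64_t idle,
			      unsigned int nr_idle_scan, struct scan_stats *st)
{
	if (nr_idle_scan == 0)
		return -1;

	uint64_t cpus = toy_mask_and(llc, allowed, st);
	return toy_scan(cpus, idle, nr_idle_scan, st);
}

/* Second point: skip the filter update when the scan would bail out. */
static bool toy_should_update_filter(unsigned int nr_idle_scan)
{
	return nr_idle_scan != 0;
}
```

Calling both variants with nr_idle_scan == 0 leaves mask_ops at 1 for
select_mask_first() and at 0 for select_check_first(); with a non-zero
nr_idle_scan they scan the same CPUs and return the same result.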

I will rework the whole patchset to fit the new SIS_UTIL feature.


- Apart from the above anomaly, the results look positive overall
with the patched kernel behaving as well as, or better than, the tip.

Cheers!


[..snip..]

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Hackbench - 15 runs statistics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

o NPS 4 - 16 groups (SIS_UTIL)

- tip

Min : 7.35
Max : 12.66
Median : 10.60
AMean : 10.00
GMean : 9.82
HMean : 9.64
AMean Stddev : 1.88
AMean CoefVar : 18.85 pct

- SIS_Eff

Min : 12.32
Max : 18.92
Median : 13.82
AMean : 14.96 (-49.60 pct)
GMean : 14.80
HMean : 14.66
AMean Stddev : 2.25
AMean CoefVar : 15.01 pct

o NPS 4 - 16 groups (SIS_PROP)

- tip

Min : 7.04
Max : 8.22
Median : 7.49
AMean : 7.52
GMean : 7.52
HMean : 7.51
AMean Stddev : 0.29
AMean CoefVar : 3.88 pct

- SIS_Eff

Min : 7.04
Max : 9.78
Median : 8.16
AMean : 8.42 (-11.06 pct)
GMean : 8.39
HMean : 8.36
AMean Stddev : 0.78
AMean CoefVar : 9.23 pct

The Hackbench regression is much more noticeable with SIS_UTIL
enabled but only when the test machine is running in NPS4 mode.
It is not obvious why this is happening given the patch series
aims at improving SIS Efficiency.

The result seems to be connected to the LLC size in some way.
I need some time to figure it out.


It would be great if you can test the series with SIS_UTIL
enabled and SIS_PROP disabled to see if it affects any benchmark
behavior given SIS_UTIL is the default SIS logic currently on
the tip.

Yes, I will.

Thanks & Best Regards,
Abel