Re: [PATCH v4 2/2] sched/fair: Scan cluster before scanning LLC in wake-up path

From: Yicong Yang
Date: Mon Jun 27 2022 - 04:16:56 EST


On 2022/6/26 20:13, Abel Wu wrote:
>
> On 6/9/22 8:06 PM, Yicong Yang Wrote:
>> From: Barry Song <song.bao.hua@xxxxxxxxxxxxx>
>>
>> For platforms having clusters like Kunpeng920, CPUs within the same cluster
>> have lower latency when synchronizing and accessing shared resources like
>> cache. Thus, this patch tries to find an idle cpu within the cluster of the
>> target CPU before scanning the whole LLC to gain lower latency.
>>
>> Note neither Kunpeng920 nor x86 Jacobsville supports SMT, so this patch
>> doesn't consider SMT for this moment.
>>
>> Testing has been done on Kunpeng920 by pinning tasks to one numa and two
>> numa. On Kunpeng920, Each numa has 8 clusters and each cluster has 4 CPUs.
>>
>> With this patch, We noticed enhancement on tbench within one numa or cross
>> two numa.
>>
>> On numa 0:
>>                              5.19-rc1                patched
>> Hmean     1        350.27 (   0.00%)      406.88 *  16.16%*
>> Hmean     2        702.01 (   0.00%)      808.22 *  15.13%*
>> Hmean     4       1405.14 (   0.00%)     1614.34 *  14.89%*
>> Hmean     8       2830.53 (   0.00%)     3169.02 *  11.96%*
>> Hmean     16      5597.95 (   0.00%)     6224.20 *  11.19%*
>> Hmean     32     10537.38 (   0.00%)    10524.97 *  -0.12%*
>> Hmean     64      8366.04 (   0.00%)     8437.41 *   0.85%*
>> Hmean     128     7060.87 (   0.00%)     7150.25 *   1.27%*
>>
>> On numa 0-1:
>>                              5.19-rc1                patched
>> Hmean     1        346.11 (   0.00%)      408.47 *  18.02%*
>> Hmean     2        693.34 (   0.00%)      805.78 *  16.22%*
>> Hmean     4       1384.96 (   0.00%)     1602.49 *  15.71%*
>> Hmean     8       2699.45 (   0.00%)     3069.98 *  13.73%*
>> Hmean     16      5327.11 (   0.00%)     5688.19 *   6.78%*
>> Hmean     32     10019.10 (   0.00%)    11862.56 *  18.40%*
>> Hmean     64     13850.57 (   0.00%)    17748.54 *  28.14%*
>> Hmean     128    12498.25 (   0.00%)    15541.59 *  24.35%*
>> Hmean     256    11195.77 (   0.00%)    13854.06 *  23.74%*
>>
>> Tested-by: Yicong Yang <yangyicong@xxxxxxxxxxxxx>
>> Signed-off-by: Barry Song <song.bao.hua@xxxxxxxxxxxxx>
>> Signed-off-by: Yicong Yang <yangyicong@xxxxxxxxxxxxx>
>> ---
>>   kernel/sched/fair.c | 44 +++++++++++++++++++++++++++++++++++++++++---
>>   1 file changed, 41 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 77b2048a9326..6d173e196ad3 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -6327,6 +6327,40 @@ static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd
>>     #endif /* CONFIG_SCHED_SMT */
>>   +#ifdef CONFIG_SCHED_CLUSTER
>> +/*
>> + * Scan the cluster domain for idle CPUs and clear cluster cpumask after scanning
>> + */
>> +static inline int scan_cluster(struct task_struct *p, struct cpumask *cpus,
>> +                   int target, int *nr)
>> +{
>> +    struct sched_domain *sd = rcu_dereference(per_cpu(sd_cluster, target));
>> +    int cpu, idle_cpu;
>> +
>> +    /* TODO: Support SMT system with cluster topology */
>> +    if (!sched_smt_active() && sd) {
>> +        for_each_cpu_and(cpu, cpus, sched_domain_span(sd)) {
>> +            if (!--*nr)
>> +                break;
>
> return -1;
> :)
>

thanks for the comment. If we've run out of nr the select_idle_cpu() will stop scanning as well,
so it's unnecessary to clear the mask of cluster cpus and just return.