Re: [RFC PATCH 1/2] sched: schedule balance map foundation

From: Michael Wang
Date: Mon Jan 14 2013 - 21:33:09 EST


Hi, Namhyung

Thanks for your reply.

On 01/14/2013 04:26 PM, Namhyung Kim wrote:
> Hi Michael,
>

[snip]

>> +	while (sd) {
>> +		if (sd->flags & SD_LOAD_BALANCE) {
>> +			if (sd->flags & SD_BALANCE_EXEC) {
>> +				sbm->top_level[SBM_EXEC_TYPE] = sd->level;
>> +				sbm->sd[SBM_EXEC_TYPE][sd->level] = sd;
>> +			}
>> +
>> +			if (sd->flags & SD_BALANCE_FORK) {
>> +				sbm->top_level[SBM_FORK_TYPE] = sd->level;
>> +				sbm->sd[SBM_FORK_TYPE][sd->level] = sd;
>> +			}
>> +
>> +			if (sd->flags & SD_BALANCE_WAKE) {
>> +				sbm->top_level[SBM_WAKE_TYPE] = sd->level;
>> +				sbm->sd[SBM_WAKE_TYPE][sd->level] = sd;
>> +			}
>> +
>> +			if (sd->flags & SD_WAKE_AFFINE) {
>> +				for_each_cpu(i, sched_domain_span(sd)) {
>> +					if (!sbm->affine_map[i])
>> +						sbm->affine_map[i] = sd;
>> +				}
>> +			}
>> +		}
>> +		sd = sd->parent;
>> +	}
>
> It seems that it can be done like:
>
> 	for_each_domain(cpu, sd) {
> 		if (!(sd->flags & SD_LOAD_BALANCE))
> 			continue;
>
> 		if (sd->flags & SD_BALANCE_EXEC)
> 			...
> 	}
>
>

That's right, will correct it.
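
For reference, a minimal userspace sketch of how the loop might look once
restructured that way. Everything here is a simplified stand-in invented for
illustration: the flag values are hypothetical, struct sched_domain is cut
down to three fields, for_each_parent() only mimics the ->parent walk of the
kernel's for_each_domain(), sbm_record() is a hypothetical helper, and the
affine_map part is left out for brevity. Only the control flow is the point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for this sketch, not the kernel's real ones. */
#define SD_LOAD_BALANCE 0x01
#define SD_BALANCE_EXEC 0x02
#define SD_BALANCE_FORK 0x04
#define SD_BALANCE_WAKE 0x08

enum { SBM_EXEC_TYPE, SBM_FORK_TYPE, SBM_WAKE_TYPE, SBM_MAX_TYPE };
#define SBM_MAX_LEVEL 4

/* Simplified stand-in for struct sched_domain. */
struct sched_domain {
	struct sched_domain *parent;
	int flags;
	int level;
};

/* Same layout as in the patch, minus affine_map. */
struct sched_balance_map {
	struct sched_domain *sd[SBM_MAX_TYPE][SBM_MAX_LEVEL];
	int top_level[SBM_MAX_TYPE];
};

/* Stand-in for for_each_domain(): walk from a base domain up via ->parent. */
#define for_each_parent(base, sd) \
	for ((sd) = (base); (sd); (sd) = (sd)->parent)

/* Hypothetical helper: record sd as the domain for this type and level. */
static void sbm_record(struct sched_balance_map *sbm, int type,
		       struct sched_domain *sd)
{
	/* The level must be a valid index into the fixed-size array. */
	assert(sd->level >= 0 && sd->level < SBM_MAX_LEVEL);
	sbm->top_level[type] = sd->level;
	sbm->sd[type][sd->level] = sd;
}

static void build_balance_map(struct sched_balance_map *sbm,
			      struct sched_domain *base)
{
	struct sched_domain *sd;

	for_each_parent(base, sd) {
		/* continue is safe here: the for-increment still advances sd. */
		if (!(sd->flags & SD_LOAD_BALANCE))
			continue;

		if (sd->flags & SD_BALANCE_EXEC)
			sbm_record(sbm, SBM_EXEC_TYPE, sd);
		if (sd->flags & SD_BALANCE_FORK)
			sbm_record(sbm, SBM_FORK_TYPE, sd);
		if (sd->flags & SD_BALANCE_WAKE)
			sbm_record(sbm, SBM_WAKE_TYPE, sd);
	}
}
```

Compared with the open-coded while loop, the iterator form drops one level of
nesting and cannot forget the sd = sd->parent step on an early exit.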

>> +
>> +	/*
>> +	 * fill the hole to get lower level sd easily.
>> +	 */
>> +	for (type = 0; type < SBM_MAX_TYPE; type++) {
>> +		level = sbm->top_level[type];
>> +		top_sd = sbm->sd[type][level];
>> +		if ((++level != SBM_MAX_LEVEL) && top_sd) {
>> +			for (; level < SBM_MAX_LEVEL; level++)
>> +				sbm->sd[type][level] = top_sd;
>> +		}
>> +	}
>> +}
> [snip]
>> +#ifdef CONFIG_SCHED_SMT
>> +#define SBM_MAX_LEVEL 4
>> +#else
>> +#ifdef CONFIG_SCHED_MC
>> +#define SBM_MAX_LEVEL 3
>> +#else
>> +#ifdef CONFIG_SCHED_BOOK
>> +#define SBM_MAX_LEVEL 2
>> +#else
>> +#define SBM_MAX_LEVEL 1
>> +#endif
>> +#endif
>> +#endif
>
> Looks like these fixed level constants do not consider NUMA domains.
> Doesn't accessing sbm->sd[type][level] in the above while loop cause a
> problem on big NUMA machines?

Yes, that's true. This patch is based on 3.7.0-rc6, without NUMA merged,
to make the topic a little easier to start with. I will consider NUMA in
the next version; please let me know if you have any suggestions.
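
To make the concern concrete, here is a small userspace sketch, not part of
the patch, that counts how many domains in a chain would index past
sbm->sd[type][] for a given SBM_MAX_LEVEL. The cut-down struct sched_domain
and both helper names (sbm_level_fits(), sbm_count_overflow()) are
hypothetical and exist only for this illustration.

```c
#include <stddef.h>

#define SBM_MAX_LEVEL 4	/* value the patch uses with CONFIG_SCHED_SMT */

/* Simplified stand-in for struct sched_domain; only the fields used here. */
struct sched_domain {
	struct sched_domain *parent;
	int level;
};

/* Return 1 if sd->level is a valid index into sbm->sd[type][], else 0. */
static int sbm_level_fits(const struct sched_domain *sd)
{
	return sd->level >= 0 && sd->level < SBM_MAX_LEVEL;
}

/* Count domains in the chain whose level would index past the array. */
static int sbm_count_overflow(const struct sched_domain *base)
{
	const struct sched_domain *sd;
	int overflow = 0;

	for (sd = base; sd; sd = sd->parent)
		if (!sbm_level_fits(sd))
			overflow++;

	return overflow;
}
```

With a chain whose levels run from 0 to 5, as extra NUMA levels on top of the
SMT/MC/BOOK/CPU ones could produce, the two topmost domains do not fit, so a
check of this kind (or a level bound that accounts for NUMA) would be needed
before the store in the loop above.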

Regards,
Michael Wang

>
> Thanks,
> Namhyung
>
>> +
>> +enum {
>> +	SBM_EXEC_TYPE,
>> +	SBM_FORK_TYPE,
>> +	SBM_WAKE_TYPE,
>> +	SBM_MAX_TYPE
>> +};
>> +
>> +struct sched_balance_map {
>> +	struct sched_domain *sd[SBM_MAX_TYPE][SBM_MAX_LEVEL];
>> +	int top_level[SBM_MAX_TYPE];
>> +	struct sched_domain *affine_map[NR_CPUS];
>> +};
>> +
>> #endif /* CONFIG_SMP */
>>
>> /*
>> @@ -403,6 +430,7 @@ struct rq {
>> #ifdef CONFIG_SMP
>> struct root_domain *rd;
>> struct sched_domain *sd;
>> + struct sched_balance_map *sbm;
>>
>> unsigned long cpu_power;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
