Re: sched_mc_power_savings broken with CGROUPS+CPUSETS

From: Vaidyanathan Srinivasan
Date: Sat Aug 30 2008 - 16:39:26 EST


* Peter Zijlstra <peterz@xxxxxxxxxxxxx> [2008-08-30 13:26:53]:

[snipped]

>
> I don't think iterating the domains and setting the flag is sufficient.
> Look at this crap (found in arch/x86/kernel/smpboot.c):
>
> cpumask_t cpu_coregroup_map(int cpu)
> {
> 	struct cpuinfo_x86 *c = &cpu_data(cpu);
> 	/*
> 	 * For perf, we return last level cache shared map.
> 	 * And for power savings, we return cpu_core_map
> 	 */
> 	if (sched_mc_power_savings || sched_smt_power_savings)
> 		return per_cpu(cpu_core_map, cpu);
> 	else
> 		return c->llc_shared_map;
> }
>
> which means we'll actually end up building different domain/group
> configurations depending on power savings settings.

The above code lets a quad-core CPU be treated as two dual-core units
for performance when sched_mc_power_savings=0, and as one quad-core
package when sched_mc_power_savings=1, since power control (voltage
control) is per quad-core socket.

On a dual-socket machine with two quad-core CPUs,

sched_mc_power_savings=0 will build:

CPU0 attaching sched-domain:
domain 0: span 0,2 level MC
groups: 0 2
domain 1: span 0-7 level CPU
groups: 0,2 1,5 3-4 6-7

while sched_mc_power_savings=1 will build:

CPU0 attaching sched-domain:
domain 0: span 0,2-4 level MC
groups: 0 2 3 4
domain 1: span 0-7 level CPU
groups: 0,2-4 1,5-7

So the MC-level map is built from the last-level cache map
(llc_shared_map) when power savings is off, and from cpu_core_map
when it is on.
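
To make the selection concrete, here is a minimal userspace sketch of the
same decision, not the kernel code itself. The 8-bit masks (one bit per
CPU), the table contents and the name coregroup_map() are assumptions made
for illustration; the topology is read off the two sched-domain dumps
above (packages {0,2,3,4} and {1,5,6,7}, shared caches {0,2}, {1,5},
{3,4}, {6,7}).

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8

/*
 * Hypothetical topology tables, one bit per CPU, inferred from the
 * sched-domain dumps in this mail (not read from real hardware).
 */
static const uint8_t core_map[NR_CPUS] = {
	/* package {0,2,3,4} = 0x1D, package {1,5,6,7} = 0xE2 */
	0x1D, 0xE2, 0x1D, 0x1D, 0x1D, 0xE2, 0xE2, 0xE2
};

static const uint8_t llc_shared_map[NR_CPUS] = {
	/* caches {0,2} = 0x05, {1,5} = 0x22, {3,4} = 0x18, {6,7} = 0xC0 */
	0x05, 0x22, 0x05, 0x18, 0x18, 0x22, 0xC0, 0xC0
};

/*
 * Same choice as the quoted cpu_coregroup_map(): the whole package
 * when either power-savings knob is set, otherwise only the CPUs
 * sharing the last-level cache.
 */
static uint8_t coregroup_map(int cpu, int mc_power_savings,
			     int smt_power_savings)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	if (mc_power_savings || smt_power_savings)
		return core_map[cpu];
	return llc_shared_map[cpu];
}
```

With these tables, coregroup_map(0, 0, 0) gives 0x05 (CPUs 0,2) and
coregroup_map(0, 1, 0) gives 0x1D (CPUs 0,2-4), which matches the two MC
spans shown for CPU0.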

Do you think such detailed documentation around this code will help?

--Vaidy

[snipped]
