Re: [PATCH 32/35] clockevents: Fix cpu down race for hrtimer based broadcasting

From: Nicolas Pitre
Date: Sat Feb 21 2015 - 12:46:47 EST


On Sat, 21 Feb 2015, Peter Zijlstra wrote:

> On Thu, Feb 19, 2015 at 12:51:52PM -0500, Nicolas Pitre wrote:
> >
> > This breaks the b.L switcher disabling code which essentially does:
> >
> > static void bL_switcher_restore_cpus(void)
> > {
> > 	int i;
> >
> > 	for_each_cpu(i, &bL_switcher_removed_logical_cpus) {
> > 		struct device *cpu_dev = get_cpu_device(i);
> > 		int ret = device_online(cpu_dev);
> > 		if (ret)
> > 			dev_err(cpu_dev, "switcher: unable to restore CPU\n");
> > 	}
> > }
>
> Just so I understand, this device_{on,off}line() stuff is basically just
> cpu_{up,down}() but obfuscated through the device model nonsense, right?

Right. Before commit 3f8517e793, cpu_{up,down}() were used directly.

> Also it seems bL_switcher_enable() relies on lock_device_hotplug() to
> stabilize the online cpu mask; it does not, only the hotplug lock does.

This is there to prevent any concurrent hotplug operations via sysfs.
Once the switcher is active, we deny hotplugging operations on any
physical CPU the switcher has removed.

> I'm having a very hard time trying to follow wth this thing all is
> doing; its using hotplug but its also doing magic with cpu_suspend().

You're fully aware of the ongoing work on the scheduler to better
support the big.LITTLE architecture, amongst other things. The switcher
is an interim solution where one big CPU is paired with one little CPU,
and that pair is conceptually used as one logical CPU where only one of
the big or little physical CPUs runs at a time. Those logical CPUs have
identical capacities, so the current scheduler can work well with
them.

The switch between the two physical CPUs is abstracted behind an
extended cpufreq scale, i.e. when cpufreq asks for a frequency exceeding
what the little CPU can provide, a transparent switch is made to the
big CPU. The transparent switch is performed by suspending the current
CPU and immediately resuming the same context on the other CPU, hence
the cpu_suspend() usage.
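
To make the cpufreq part a bit more concrete, here is a rough user-space
sketch of the decision only. It is not the actual switcher or cpufreq
driver code: the names (struct logical_cpu, cluster_for_freq(),
set_virtual_freq()) and the LITTLE_MAX_KHZ threshold are made up for
illustration, and the real suspend/resume step is reduced to a comment.

```c
#include <stdbool.h>

/* Hypothetical identifiers; none of these come from the kernel sources. */
enum cluster { CLUSTER_LITTLE = 0, CLUSTER_BIG = 1 };

/* Assumed top of the little CPU's frequency range, for illustration only. */
#define LITTLE_MAX_KHZ 600000u

struct logical_cpu {
	enum cluster active;    /* physical CPU currently backing this logical CPU */
	unsigned int freq_khz;  /* last frequency requested on the extended scale */
	unsigned int switches;  /* number of big<->little switches performed */
};

/* Map a point on the extended frequency scale to a physical CPU type. */
static enum cluster cluster_for_freq(unsigned int khz)
{
	return khz > LITTLE_MAX_KHZ ? CLUSTER_BIG : CLUSTER_LITTLE;
}

/* Record a frequency request; return true if it required a switch. */
static bool set_virtual_freq(struct logical_cpu *cpu, unsigned int khz)
{
	enum cluster target = cluster_for_freq(khz);
	bool switched = (target != cpu->active);

	if (switched) {
		/*
		 * The real switcher would suspend the current CPU here and
		 * resume the same context on the other one. This sketch
		 * only records which side is now active.
		 */
		cpu->active = target;
		cpu->switches++;
	}
	cpu->freq_khz = khz;
	return switched;
}
```

From the point of view of the caller there is only one CPU with a wider
frequency range; which physical CPU ends up running is a side effect of
the requested frequency.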

The switcher is runtime activated. To do so, one physical CPU per
logical pairing is hotplugged out so the system considers only the
"logical" CPUs. When the switcher is disabled, those CPUs are brought
back online.
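
The bookkeeping behind that can be sketched in plain user-space C as
follows. Again this is only an illustration under assumptions: the CPU
masks are simple bitmasks, and struct switcher_state, struct cpu_pair,
switcher_enable() and switcher_disable() are hypothetical names, not the
kernel's. The real code goes through device_offline()/device_online()
and can fail, which this sketch ignores.

```c
#include <stdint.h>

/* Hypothetical state; bit n of each mask stands for physical CPU n. */
struct switcher_state {
	uint32_t online;   /* CPUs currently visible to the system */
	uint32_t removed;  /* CPUs the switcher took offline */
};

/* One big/little pairing: 'keep' stays online, 'remove' is unplugged. */
struct cpu_pair {
	int keep;
	int remove;
};

/* Enabling: take one physical CPU per pair offline and remember it. */
static void switcher_enable(struct switcher_state *s,
			    const struct cpu_pair *pairs, int nr_pairs)
{
	for (int i = 0; i < nr_pairs; i++) {
		uint32_t bit = UINT32_C(1) << pairs[i].remove;

		if (s->online & bit) {
			s->online &= ~bit;
			s->removed |= bit;
		}
	}
}

/* Disabling: bring back exactly the CPUs that were removed above. */
static void switcher_disable(struct switcher_state *s)
{
	for (int cpu = 0; cpu < 32; cpu++) {
		uint32_t bit = UINT32_C(1) << cpu;

		if (s->removed & bit)
			s->online |= bit;
	}
	s->removed = 0;
}
```

The loop in switcher_disable() plays the role of the
bL_switcher_restore_cpus() loop quoted at the top: it walks the set of
removed CPUs and puts each one back online.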

> /me confused..

If you want more background info, you may have a look at this article:

http://lwn.net/Articles/481055/

Or just ask away.


Nicolas