Re: [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going offline

From: Peter Zijlstra
Date: Tue Jul 12 2016 - 10:20:00 EST


On Fri, Jul 01, 2016 at 01:29:59AM +0200, Frederic Weisbecker wrote:
> >  void wake_up_nohz_cpu(int cpu)
> >  {
> > -	if (!wake_up_full_nohz_cpu(cpu))
> > +	if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu))
>
> So at this point, as we passed CPU_DYING, I believe the CPU isn't visible in the domains
> anymore (correct me if I'm wrong),

So rebuilding the domains is an utter trainwreck atm. But I suspect that
belief is wrong: esp. with cpusets enabled, we rebuild the domains very
late, from a workqueue, so the CPU can still be visible in them.

That is why the scheduler has cpu_active_mask to constrain the domains
during hotplug.

Now I need to go sort through that trainwreck because deadline needs it,
but I've not had the opportunity :/

> therefore get_nohz_timer_target() can't return it,
> unless smp_processor_id() is the only alternative.

With the below that should be true I think.

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6c0cdb5a73f8..b35cacbe9b9e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -626,7 +626,7 @@ int get_nohz_timer_target(void)

 	rcu_read_lock();
 	for_each_domain(cpu, sd) {
-		for_each_cpu(i, sched_domain_span(sd)) {
+		for_each_cpu_and(i, sched_domain_span(sd), cpu_active_mask) {
 			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
 				cpu = i;
 				goto unlock;
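
To make the effect of the hunk above concrete, here is a rough user-space
model of the patched inner loop. It is only a sketch, not kernel code:
cpumask_model_t, mask_test(), pick_timer_target() and NR_CPUS = 8 are
made-up names for illustration, a single flat span stands in for the
for_each_domain() walk, and the housekeeping check and the early "current
CPU is busy" return are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical CPU count for this model */

/* One bit per CPU; stands in for struct cpumask in this user-space model. */
typedef uint32_t cpumask_model_t;

static bool mask_test(cpumask_model_t mask, int cpu)
{
	return (mask >> cpu) & 1u;
}

/*
 * Model of the patched inner loop: visit only CPUs that are set in both
 * @span and @active (the intersection for_each_cpu_and() iterates over)
 * and return the first one that is busy, i.e. not idle.
 * Falls back to @self when no candidate qualifies.
 */
static int pick_timer_target(int self, cpumask_model_t span,
			     cpumask_model_t active, cpumask_model_t busy)
{
	int i;

	assert(self >= 0 && self < NR_CPUS);

	for (i = 0; i < NR_CPUS; i++) {
		if (!mask_test(span & active, i))
			continue;	/* not in the domain, or no longer active */
		if (mask_test(busy, i))
			return i;
	}
	return self;
}
```

In this model, with span = 0x0F (CPUs 0-3) and only CPU 1 busy, the call
returns 1 while CPU 1 is still in the active mask. Once CPU 1 is cleared
from the active mask (active = 0x0D), the same call falls back to the
caller's own CPU, even though the stale span still contains CPU 1.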