Re: cpu stopper threads and load balancing leads to deadlock

From: Mike Galbraith
Date: Thu May 03 2018 - 23:39:03 EST


On Thu, 2018-05-03 at 18:45 +0200, Peter Zijlstra wrote:
>
> Something like so perhaps? Mike, can you play around with that? Could
> burn your granny and eat your cookies.

That worked, and nothing entertaining has happened... yet. Hm, I could
use this kernel to update my backup drive; if there's a cookie monster
lurking, that might get its attention :)

> diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
> index 7468de429087..07360523c3ce 100644
> --- a/arch/x86/kernel/cpu/mtrr/main.c
> +++ b/arch/x86/kernel/cpu/mtrr/main.c
> @@ -793,6 +793,9 @@ void mtrr_ap_init(void)
>
>  	if (!use_intel() || mtrr_aps_delayed_init)
>  		return;
> +
> +	rcu_cpu_starting(smp_processor_id());
> +
>  	/*
>  	 * Ideally we should hold mtrr_mutex here to avoid mtrr entries
>  	 * changed, but this routine will be called in cpu boot time,
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 2a734692a581..4dab46950fdb 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3775,6 +3775,8 @@ int rcutree_dead_cpu(unsigned int cpu)
>  	return 0;
>  }
>
> +static DEFINE_PER_CPU(int, rcu_cpu_started);
> +
>  /*
>   * Mark the specified CPU as being online so that subsequent grace periods
>   * (both expedited and normal) will wait on it. Note that this means that
> @@ -3796,6 +3798,11 @@ void rcu_cpu_starting(unsigned int cpu)
>  	struct rcu_node *rnp;
>  	struct rcu_state *rsp;
>
> +	if (per_cpu(rcu_cpu_started, cpu))
> +		return;
> +
> +	per_cpu(rcu_cpu_started, cpu) = 1;
> +
>  	for_each_rcu_flavor(rsp) {
>  		rdp = per_cpu_ptr(rsp->rda, cpu);
>  		rnp = rdp->mynode;
> @@ -3852,6 +3859,8 @@ void rcu_report_dead(unsigned int cpu)
>  	preempt_enable();
>  	for_each_rcu_flavor(rsp)
>  		rcu_cleanup_dying_idle_cpu(cpu, rsp);
> +
> +	per_cpu(rcu_cpu_started, cpu) = 0;
>  }
>
>  /* Migrate the dead CPU's callbacks to the current CPU. */
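
For anyone skimming the patch: the rcu_cpu_started flag makes
rcu_cpu_starting() safe to call more than once per bringup (once early
from mtrr_ap_init(), once from the usual path), and rcu_report_dead()
re-arms it for the next online. Below is a rough userspace model of just
that guard, not kernel code. NR_CPUS_MODEL, cpu_started[],
online_work_done[] and the model_*() names are all made up for
illustration, and a plain array stands in for DEFINE_PER_CPU().

```c
#include <stdbool.h>

#define NR_CPUS_MODEL 4	/* hypothetical CPU count for this model */

/* Stand-in for the per-CPU rcu_cpu_started flag from the patch. */
static int cpu_started[NR_CPUS_MODEL];

/* Counts how often the "real" onlining work ran for each CPU. */
static int online_work_done[NR_CPUS_MODEL];

/*
 * Model of rcu_cpu_starting(): only the first call after boot, or after
 * the CPU was reported dead, does the work. Returns true if the work
 * ran and false if the call was a no-op.
 */
static bool model_cpu_starting(unsigned int cpu)
{
	if (cpu_started[cpu])
		return false;

	cpu_started[cpu] = 1;
	online_work_done[cpu]++;	/* placeholder for the rcu_node updates */
	return true;
}

/* Model of the rcu_report_dead() tail: re-arm the guard. */
static void model_report_dead(unsigned int cpu)
{
	cpu_started[cpu] = 0;
}
```

Calling model_cpu_starting(1) twice in a row does the work once, and
after model_report_dead(1) the next call does it again. The model has no
locking or ordering at all; it only assumes each CPU touches its own
slot.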