Re: workqueue: WARN at kernel/workqueue.c:2176
From: Peter Zijlstra
Date:  Fri May 16 2014 - 05:35:56 EST
On Fri, May 16, 2014 at 11:50:42AM +0800, Lai Jiangshan wrote:
> After debugging, I found that the hot-plugged CPU is active but !online in this case.
> The problem was introduced by 5fbd036b.
> Some code assumes that any CPU in cpu_active_mask is also online, but 5fbd036b breaks
> this assumption, so the code that relies on it should be changed too.
Good find, and yes it does that.
> The following patch is just a workaround. After it is applied, the above WARNING
> is gone, but I can't hit the wq problem that you found.
Seeing how the entirety of hotplug is basically duct tape and twigs, the
below isn't that bad.
> ---
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index a9e710e..253a129 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -726,9 +726,10 @@ void set_cpu_present(unsigned int cpu, bool present)
>  
>  void set_cpu_online(unsigned int cpu, bool online)
>  {
> -	if (online)
> +	if (online) {
>  		cpumask_set_cpu(cpu, to_cpumask(cpu_online_bits));
> -	else
> +		cpumask_set_cpu(cpu, to_cpumask(cpu_active_bits));
> +	} else
>  		cpumask_clear_cpu(cpu, to_cpumask(cpu_online_bits));
>  }
>  
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 268a45e..c1a712d 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5043,7 +5043,6 @@ static int sched_cpu_active(struct notifier_block *nfb,
>  				      unsigned long action, void *hcpu)
>  {
>  	switch (action & ~CPU_TASKS_FROZEN) {
> -	case CPU_STARTING:
>  	case CPU_DOWN_FAILED:
>  		set_cpu_active((long)hcpu, true);
>  		return NOTIFY_OK;
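
For readers following the thread, here is a minimal userspace sketch of the
invariant under discussion (every CPU in the active mask is also in the online
mask). It is not kernel code: the bitmask variables and the model_* helpers are
hypothetical stand-ins for cpu_online_bits, cpu_active_bits, set_cpu_online()
and set_cpu_active(), and the model assumes at most as many CPUs as there are
bits in an unsigned long.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for cpu_online_bits / cpu_active_bits. */
static unsigned long online_bits;
static unsigned long active_bits;

/*
 * Models set_cpu_online() with the workaround applied: marking a CPU
 * online also marks it active, so "active" never precedes "online"
 * on the bring-up path. The offline branch leaves active_bits alone,
 * as in the patch; this model assumes the caller has already cleared
 * the active bit before taking a CPU offline.
 */
static void model_set_cpu_online(unsigned int cpu, bool online)
{
	if (online) {
		online_bits |= 1UL << cpu;
		active_bits |= 1UL << cpu;
	} else {
		online_bits &= ~(1UL << cpu);
	}
}

/* Models set_cpu_active(): touches only the active mask. */
static void model_set_cpu_active(unsigned int cpu, bool active)
{
	if (active)
		active_bits |= 1UL << cpu;
	else
		active_bits &= ~(1UL << cpu);
}

/* The assumption some code relies on: active is a subset of online. */
static bool active_implies_online(void)
{
	return (active_bits & ~online_bits) == 0;
}
```

Calling model_set_cpu_active(1, true) on its own reproduces the "active but
!online" state described above, and active_implies_online() then returns
false. Going through model_set_cpu_online(1, true) instead sets both bits
together, so the check holds.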