Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock

From: Oleg Nesterov
Date: Sun Dec 14 2014 - 14:22:01 EST


On 12/12, David Hildenbrand wrote:
>
> > This is subjective, but how about
> >
> > static bool xxx(void)
> > {
> > 	mutex_lock(&cpu_hotplug.lock);
> > 	if (atomic_read(&cpu_hotplug.refcount) == 0)
> > 		return true;
> > 	mutex_unlock(&cpu_hotplug.lock);
> > 	return false;
> > }
> >
> > void cpu_hotplug_begin(void)
> > {
> > 	cpu_hotplug.active_writer = current;
> >
> > 	cpuhp_lock_acquire();
> > 	wait_event(&cpu_hotplug.wq, xxx());
> > }
> >
> > instead?
> >
> > Oleg.
> >
>
> [ 50.662459] do not call blocking ops when !TASK_RUNNING; state=2 set at [<000000000017340e>] prepare_to_wait_event+0x7a/0x124
> [ 50.662472] ------------[ cut here ]------------
> [ 50.662475] WARNING: at kernel/sched/core.c:7301
> [ 50.662477] Modules linked in:
> [ 50.662482] CPU: 5 PID: 225 Comm: cpu_start_stop. Not tainted 3.18.0+ #59
> [ 50.662485] task: 0000000001f94b20 ti: 0000000001ffc000 task.ti: 0000000001ffc000
> ...
>
> Looks like your suggestion won't work. We can only set the task to
> TASK_UNINTERRUPTIBLE after taking the lock.

Yeees, this warning (and the wait_woken() helpers) was added specifically
to catch/fix problems like this, sorry for the confusion.

Easy to fix, just

-	mutex_lock(&cpu_hotplug.lock);
+	if (!mutex_trylock(&cpu_hotplug.lock))
+		return false;

If .lock is locked then it is held by get_online_cpus(), and it is going
to increment the counter.
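
With that change folded in, the condition helper never blocks, so it is
safe to evaluate while the task is not TASK_RUNNING. A rough userspace
sketch of the same shape, using pthreads and C11 atomics as stand-ins for
the kernel primitives (struct hotplug_state, hp and writer_may_proceed()
are hypothetical names for illustration, not the kernel code):

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's cpu_hotplug state. */
struct hotplug_state {
	pthread_mutex_t lock;
	atomic_int refcount;	/* number of active readers */
};

static struct hotplug_state hp = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
	.refcount = 0,
};

/*
 * Condition helper for the writer. It never blocks:
 *  - lock busy      -> return false, the caller re-checks after a wakeup
 *  - readers remain -> drop the lock and return false
 *  - no readers     -> return true with hp.lock still held
 */
static bool writer_may_proceed(void)
{
	if (pthread_mutex_trylock(&hp.lock) != 0)
		return false;
	if (atomic_load(&hp.refcount) == 0)
		return true;
	pthread_mutex_unlock(&hp.lock);
	return false;
}
```

On success the caller owns hp.lock and has to release it when the
hotplug operation is done. On failure the lock is not held.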

I would like to say that this is what I actually meant but now I can not
recall if this is true ;)

But please ignore. Your next version looks simple/clear enough.

Oleg.
