Re: [PATCH 5/8] cpu: cpu-hotplug deadlock

From: Oleg Nesterov
Date: Tue Apr 29 2008 - 12:45:44 EST


On 04/29, Peter Zijlstra wrote:
>
> The only thing that changed is that the mutex is not held; so what we
> change is:
>
> LOCK
>
> ... do the full hotplug thing ...
>
> UNLOCK
>
> into
>
> LOCK
> set state
> UNLOCK
>
> ... do the full hotplug thing ...
>
> LOCK
> unset state
> UNLOCK
>
> So that the lock isn't held over the hotplug operation.

Well, yes I see, but... Ugh, I have a a blind spot here ;)

why this makes any difference from the semantics POV ? why it is bad
to hold the mutex throughout the "full hotplug thing" ?

> > (actually, since write-locks should be very rare, perhaps we don't need
> > 2 wait_queues ?)
>
> And just let them race the wakeup race, sure that might work. Gautham
> even pointed out that it never happens because there is another
> exclusive lock on the write path.
>
> But you say you like that it doesn't depend on that anymore - me too ;-)

Yes. but let's suppose we have the single wait_queue, this doesn't make
any difference from the correctness POV, no?

To clarify: I am not arguing! this makes sense, but I'm asking to be sure
I didn't miss a subtle reason why do we "really" need 2 wait_queues.

Also. Let's suppose we have both read- and write- waiters, and cpu_hotplug_done()
does wake_up(writer_queue). It is possible that another reader comes and does
get_online_cpus() and increments .refcount first. After that, cpu_hotplug
is "opened" for the read-lock, but other read-waiters continue to sleep, and
the final put_online_cpus() wakes up write-waiters only. Yes, this all is
correct, but not "symmetrical", and leads to the question "do we really need
2 wait_queues" again.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/