Re: [PATCH] fix-flush_workqueue-vs-cpu_dead-race-update

From: Srivatsa Vaddagiri
Date: Mon Jan 08 2007 - 10:37:50 EST


On Sun, Jan 07, 2007 at 11:59:57AM -0800, Andrew Morton wrote:
> > How would this provide a stable access to cpu_online_map in functions
> > that need to block while accessing it (as flush_workqueue requires)?
>
> If a thread simply blocks, that will not permit a cpu plug/unplug to proceed.
>
> The thread had to explicitly call try_to_freeze(). CPU plug/unplug will
> not occur (and cpu_online_map will not change) until every process in the
> machine has called try_to_freeze()).

Maybe my misunderstanding of the code, but:

Looking at the code, it seems to me that try_to_freeze() will be called very
likely from the signal-delivery path (get_signal_to_deliver()). Isnt
that correct? If so, consider a thread as below:


Thread1 :
for_each_online_cpu() { online_map = 0x1111 at this point }
do_some_thing();
kmalloc(); <- blocks


Can't Thread1 be frozen in the above blocking state w/o it voluntarily
calling try_to_freeze? If so, online_map can change when it returns
from kmalloc() ..

> So the problem which you're referring to will only occur if a workqueue
> callback function calls try_to_freeze(), which would be mad.
>
>
> Plus flush_workqueue() is on the way out. We're slowly edging towards a
> working cancel_work() which will only block if the work which you're trying
> to cancel is presently running. With that, pretty much all the
> flush_workqueue() calls go away, and all these accidental rarely-occurring
> deadlocks go away too.

Fundamentally, I think it is important to give the ability to block
concurrent hotplug operations from happening ..

--
Regards,
vatsa
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/