Re: [PATCH 7/8] Clean up workqueue.c with respect to the freezer based cpu-hotplug

From: Srivatsa Vaddagiri
Date: Tue Apr 03 2007 - 13:11:35 EST


On Tue, Apr 03, 2007 at 07:03:36PM +0400, Oleg Nesterov wrote:
> I think it would be nice to do. I believe we can cleanup ksoftirqd()
> and migration_thread() as well (kill wait_to_die: loop). Probably it

I doubt whether we can kill it in migration_thread, since that is
another thread which is left unfrozen for hotplug (stop_machine relies on
its services while the rest of the world is frozen).

> is better to introduce a new helper for that, kthread_thaw_stop() or
> something.

Will think of that.


> > Why?
>
> What if is_single_threaded(wq) == true? In that case we should call
> flush_cpu_workqueue(cpu) only if cpu == singlethread_cpu, otherwise
> this is unneeded and wrong, because per_cpu_ptr(wq->cpu_wq, cpu) was
> not initialized.

Ah yes ..
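
As a rough illustration of the rule Oleg states above, here is a small
userspace model, not kernel code. The struct model_wq layout and the
helper names model_wq_init() and should_flush_cpu() are made up for this
sketch; only is_single_threaded(), singlethread_cpu and
per_cpu_ptr(wq->cpu_wq, cpu) come from the thread.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical userspace stand-in for struct workqueue_struct. */
struct model_wq {
	bool single_threaded;	/* models is_single_threaded(wq) */
	int singlethread_cpu;	/* the only CPU with a cwq in that case */
	/* models whether per_cpu_ptr(wq->cpu_wq, cpu) was initialized */
	bool cwq_initialized[MODEL_NR_CPUS];
};

/*
 * Initialize the model: a single-threaded wq sets up per-CPU state
 * only on singlethread_cpu, a multi-threaded one on every CPU.
 */
static void model_wq_init(struct model_wq *wq, bool single, int st_cpu)
{
	wq->single_threaded = single;
	wq->singlethread_cpu = st_cpu;
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		wq->cwq_initialized[cpu] = !single || cpu == st_cpu;
}

/*
 * Decide whether a flush of 'cpu' is valid for this wq. For a
 * single-threaded wq, any CPU other than singlethread_cpu has no
 * initialized per-CPU state and must be skipped.
 */
static bool should_flush_cpu(const struct model_wq *wq, int cpu)
{
	if (wq->single_threaded)
		return cpu == wq->singlethread_cpu;
	return true;
}
```

In this model should_flush_cpu() never returns true for a CPU whose
per-CPU state was left uninitialized, which is the property the hotplug
path would need.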

> > kthread_stop(p)
> > {
> > int old_exempt_flags;
> >
> > task_lock(p);
> > old_exempt_flags = p->flags;
> > p->flags |= PFE_ALL; /* Exempt 'p' from being frozen? */
>
> I agree, we should mark this thread as non-freezable, but we can't modify
> p->flags, this is racy. "current" owns its ->flags and it is not atomic.
> Note that thaw_process() checks frozen(p) when it clears PF_FROZEN.

I suspected that we cannot modify p->flags just like that. How about
moving the freezer exemption bits to a separate field, which is
protected by task_lock?
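
A minimal userspace sketch of that idea, under stated assumptions: the
field name freezer_exempt, the bit MODEL_FE_KTHREAD_STOP and all
model_* helpers are hypothetical, and a pthread mutex stands in for
task_lock(p). It only shows the locking discipline, not how the freezer
would consume the field.

```c
#include <pthread.h>
#include <stdbool.h>

#define MODEL_FE_KTHREAD_STOP 0x1UL	/* hypothetical exemption bit */

/* Hypothetical userspace stand-in for struct task_struct. */
struct model_task {
	unsigned long flags;		/* owned by the task itself; never written by others here */
	pthread_mutex_t alloc_lock;	/* stands in for task_lock(p) */
	unsigned long freezer_exempt;	/* proposed field, protected by alloc_lock */
};

/* Set exemption bits on another task under the lock; return the old value. */
static unsigned long model_exempt_set(struct model_task *p, unsigned long bits)
{
	unsigned long old;

	pthread_mutex_lock(&p->alloc_lock);
	old = p->freezer_exempt;
	p->freezer_exempt |= bits;
	pthread_mutex_unlock(&p->alloc_lock);
	return old;
}

/* Put back the value saved by model_exempt_set(). */
static void model_exempt_restore(struct model_task *p, unsigned long old)
{
	pthread_mutex_lock(&p->alloc_lock);
	p->freezer_exempt = old;
	pthread_mutex_unlock(&p->alloc_lock);
}

/* What a freezer-side check could look like: freeze only if no bit is set. */
static bool model_freezing_allowed(struct model_task *p)
{
	bool allowed;

	pthread_mutex_lock(&p->alloc_lock);
	allowed = (p->freezer_exempt == 0);
	pthread_mutex_unlock(&p->alloc_lock);
	return allowed;
}
```

The point of the separate field is that a kthread_stop()-like caller
touches only freezer_exempt under the lock, so the non-atomic
read-modify-write on p->flags by a foreign task goes away.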

> Actually, we should do this before destroy_workqueue() calls flush_workqueue().
> Otherwise flush_cpu_workqueue() can hang forever in a similar manner.

Yep. I guess these are a class of freezer deadlocks very similar to the
vfork case, where the parent waits on the child. I get a feeling these
will be common outside of kthreads too (A waits on B for something, B
gets frozen, which means A won't freeze, causing the freezer to fail).
Can the freezer detect this dependency somehow and thaw B automatically?
Probably not that easy ..

> Needs more thinking, I guess.

[snip]

> No, no, workqueue_mutex can't help. Just for example: CPU_UP_PREPARE completes
> and drops workqueue_mutex. __create_workqueue(wq) doesn't see the new cpu, it
> is not on cpu_online_map, so it doesn't create cwq->thread. CPU_ONLINE oopses.

Ok ..sure.
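
The window Oleg describes can be replayed in a small userspace model.
This is only a sketch under assumptions: the model_* names and the
fixed-size arrays are invented for illustration, and the model counts
missing threads instead of dereferencing a NULL cwq->thread.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 2
#define MODEL_MAX_WQ  4

/* Hypothetical stand-in: one flag per CPU instead of cwq->thread. */
struct model_wq {
	bool has_thread[MODEL_NR_CPUS];
};

struct model_state {
	bool cpu_online[MODEL_NR_CPUS];	/* models cpu_online_map */
	struct model_wq wqs[MODEL_MAX_WQ];
	int nr_wq;
};

/* Models __create_workqueue(): threads only for CPUs online right now. */
static struct model_wq *model_create_workqueue(struct model_state *s)
{
	struct model_wq *wq;

	if (s->nr_wq >= MODEL_MAX_WQ)
		return NULL;
	wq = &s->wqs[s->nr_wq++];
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		wq->has_thread[cpu] = s->cpu_online[cpu];
	return wq;
}

/* Models CPU_UP_PREPARE: create a thread on 'cpu' for every existing wq. */
static void model_cpu_up_prepare(struct model_state *s, int cpu)
{
	for (int i = 0; i < s->nr_wq; i++)
		s->wqs[i].has_thread[cpu] = true;
}

/*
 * Models CPU_ONLINE: mark the CPU online and count the workqueues
 * that have no thread for it. Each one corresponds to an oops in
 * the scenario from the thread.
 */
static int model_cpu_online(struct model_state *s, int cpu)
{
	int missing = 0;

	s->cpu_online[cpu] = true;
	for (int i = 0; i < s->nr_wq; i++)
		if (!s->wqs[i].has_thread[cpu])
			missing++;
	return missing;
}
```

Creating a workqueue between model_cpu_up_prepare() and
model_cpu_online() leaves that workqueue without a thread for the new
CPU, no matter what lock each individual step takes and drops.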

--
Regards,
vatsa