Re: [PATCH] flush_cpu_workqueue: don't flush an empty ->worklist

From: Oleg Nesterov
Date: Tue Jan 16 2007 - 08:28:51 EST


On 01/16, Srivatsa Vaddagiri wrote:
>
> On Mon, Jan 15, 2007 at 07:55:16PM +0300, Oleg Nesterov wrote:
> >
> > As I said already __create_workqueue() needs a fix, schedule_on_each_cpu()
> > is already broken, and should be fixed as well.
>
> __create_workqueue() creates worker threads for all online CPUs
> currently. Accessing the online_map could be racy unless we
> serialize the access with hotplug event (thr' a mutex like workqueue
> mutex held between LOCK_ACQ/LOCK_RELEASE messages or process freezer)
> OR take special measures as was done in flush_workqueue. How were
> you considering to deal with that raciness?

This is easy to fix in multiple ways. For example (this is not the best
approach), we can make cpu_creation_map. Like cpu_populate_map, but CPU
is cleared on CPU_DEAD/CPU_UP_CANCELED.

I'll continue when I have a time.

> > > What abt stopping that thread in CPU_DOWN_PREPARE (before freezing
> > > processes)? I understand that it may add to the latency, but compared to
> > > the overall latency of process freezer, I suspect it may not be much.
>
> I meant issuing kthread_stop() in DOWN_PREPARE so that worker
> thread exits itself (much before CPU is actually brought down).

Deadlock if work_struct re-queues itself.

> Even if there are problems with it, how abt something like below:
>
> workqueue_cpu_callback()
> {
>
> CPU_DEAD:
> /* threads are still frozen at this point */
> take_over_work();

No, we can't move a currently executing work. This will break flush_workxxx().

> kthread_mark_stop(worker_thread);
> break;
>
> CPU_CLEAN_THREADS:
> /* all threads resumed by now */
> kthread_stop(worker_thread); /* task_struct ref required? */

yes, required,

> kthread_mark_stop() will mark somewhere in task_struct that the thread
> should exit when it comes out of refrigerator.
>
> worker_thread()
> {
>
> while (!kthread_should_stop()) {
> if (cwq->freezeable)
> try_to_freeze();
>
> if (kthread_marked_stop(current))
> break;

Racy. Because of kthread_stop() above we should clear cwq->thread somehow.
But we can't do this: this workqueue may be already destroyed.

> The advantage I see above is that, when take_over_work() is running, we wont
> race with functions like flush_workqueue() (threads still frozen at that
> point) and hence we avoid hacks like migrate_sequence.

See above.

Please note that the code I posted does something like kthread_mark_stop(), but
it operates on cwq, not on task_struct, this makes a big difference. And it doesn't
need take_over_work() at all. And it doesn't need additional complications. Instead,
it lessens both the source and compiled code.

Note that I am not against using freezer for cpu-hotplug. My point is that this
can't help much for the current workqueue.c, and it will completely transparent
with the change I suggest.

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/