Re: [PATCH 15/21] workqueue: remove worker_maybe_bind_and_lock()

From: Tejun Heo
Date: Thu Mar 21 2013 - 13:41:42 EST


Hello, Lai.

On Thu, Mar 21, 2013 at 07:03:30PM +0800, Lai Jiangshan wrote:
> On Thu, Mar 21, 2013 at 2:10 AM, Tejun Heo <tj@xxxxxxxxxx> wrote:
> > On Wed, Mar 20, 2013 at 03:28:15AM +0800, Lai Jiangshan wrote:
> >> static struct worker *alloc_worker(void)
> >> {
> >> struct worker *worker;
> >> @@ -2326,7 +2262,8 @@ repeat:
> >> spin_unlock_irq(&wq_mayday_lock);
> >>
> >> /* migrate to the target cpu if possible */
> >> - worker_maybe_bind_and_lock(pool);
> >> + set_cpus_allowed_ptr(current, pool->attrs->cpumask);
> >> + spin_lock_irq(&pool->lock);
> >> rescuer->pool = pool;
> >
> > You can't do this. The CPU might go up between set_cpus_allowed_ptr()
> > and spin_lock_irq() and the rescuer can end up executing a work item
> > which was issued while the target CPU is up on the wrong CPU.
>
>
> Why it is wrong?
> It can be allowed if the wq has rescuer. (wq of work_on_cpu() don't has rescuer,
> it does not hurt to work_on_cpu() of something else.
>
> Actually it was allowed by recent patches.
> If a cpu is up while a rescuer do process_scheduled_workers(),
> all the later work item will be running the wrong CPU while
> the target CPU is up.(due to for_cpu_pool_workers() don't include
> active rescuer).

So,

* If the user wraps queueing and flushing inside get_online_cpus(), it
better gets executed and finishes on the target CPU.

* Similarly, it must guaranteed that any work items queued and flushed
by CPU_ONLINE or CPU_DOWN_PREP notifiers should start and finish
execution on the target CPU.

Your patch breaks the above.

> I don't want to go back to make cpuhotplug nor rescuer_theread become
> complicated.
> so I prefer to allow work item can run on wrong cpu if the wq has rescuer.
> All just needs a comments.

It is explained in the comment.

> I guess you will agaist me...... let me rewrite it .....

It's not because I'm "against" you. It's because it's buggy and we
wouldn't have any affinity guarantee with the proposed changes. If
you can maintain the guarantees and remove
worker_maybe_bind_and_lock(), please be my guest.

Thanks.

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/