Re: [PATCH 2/3] work_on_cpu: Use our own workqueue.

From: Oleg Nesterov
Date: Tue Jan 27 2009 - 12:59:27 EST


On 01/27, Rusty Russell wrote:
>
> On Tuesday 27 January 2009 08:47:29 Ingo Molnar wrote:
> >
> > * Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >
> > > But "[PATCH 1/3] work_on_cpu: dont try to get_online_cpus() in
> > > work_on_cpu." removes get_online_cpus/put_online_cpus, this means the
> > > work can run on the wrong CPU anyway. Or work_on_cpu() can hang forever
> > > if CPU has already gone away before queue_work_on().
> > >
> > > Confused.
> >
> > The idea was to require work_on_cpu() users to be CPU hotplug-safe. But
> > ... Rusty pointed it out in the past that this might be fragile, and we
> > could put back the get_online_cpus()/put_online_cpus() calls.
>
> Old code used to do:
>
> tmp = current->cpus_allowed;
> set_cpus_allowed(current, cpumask_of_cpu(cpu));
> function(arg);
> set_cpus_allowed(current, tmp);
>
> We replaced it with:
>
> work_on_cpu(cpu, function, arg);
>
> I thought I'd be clever and reliably check that the cpu they asked for
> was online inside work_on_cpu. Leading to locking problems. But if they
> didn't previously ensure cpu hotplug didn't happen, they were buggy already,
> so I took out the check and hence the hotplug lock.
>
> So we're no *worse* than we were before, but yes, an audit would probably
> lead to fixes.

I agree, we are no worse than we were before.

But I can't understand why we can't be better ;)

If we add the special workqueue for work_on_cpu() (patch 2/3), then why
do we need [PATCH 1/3] ?

IOW, given that the 2nd patch adds the special wq, which locking problems
solves the 1st patch?

Perhaps I missed something, and the patches are questionable anyway,
but I think these 2 patches are "conflicting".

If we use a special wq, then work_on_cpu()->get_online_cpus() is fine,
it can't deadlock with cpu_hotplug_begin().

No?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/