Re: workqueue deadlock
From: Andrew Morton
Date: Sun Dec 10 2006 - 07:17:10 EST
On Sun, 10 Dec 2006 12:49:43 +0100
Ingo Molnar <mingo@xxxxxxx> wrote:
> > > void cpu_hotplug_lock(void)
This is actually not cpu-hotplug safe ;)
> > > {
> > > int cpu = raw_smp_processor_id();
> > > /*
> > > * Interrupts/softirqs are hotplug-safe:
> > > */
> > > if (in_interrupt())
> > > return;
> > > if (current->hotplug_depth++)
> > > return;
<preempt, cpu hot-unplug, resume on different CPU>
> > > current->hotplug_lock = &per_cpu(hotplug_lock, cpu);
<use-after-free>
> > > mutex_lock(current->hotplug_lock);
And it sleeps, so we can't use preempt_disable().
> > > }
It's worth noting that this very common sequence:
preempt_disable();
cpu = smp_processor_id();
...
preempt_enable();
also provides cpu-hotunplug protection against scenarios such as the above.
> > That's functionally equivalent to what we have now, and it isn't
> > working too well.
>
> hm, i thought the main reason of not using cpu_hotplug_lock() in a
> widespread manner was not related to its functionality but to its
> scalability - but i could be wrong.
It hasn't been noticed yet.
I suspect a large part of the reason for that is that we only really care
about hot-unplug when this CPU reaches across to some other CPU's data. Often
_all_ other CPU's data. And that's a super-inefficient thing, so it's rare.
Most of the problems we've had are due to borkage in cpufreq. And that's
simply cruddy code - it's not due to the complexity of CPU hotplug per-se.
> The one above is scalable and we
> could use it as /the/ method to control CPU hotplug. All the flux i
> remember related to cpu_hotplug_lock() use from the fork path and from
> other scheduler hotpaths related to its scalability bottleneck, not to
> its locking efficiency.
One quite different way of addressing all of this is to stop using
stop_machine_run() for hotplug synchronisation and switch to the swsusp
freezer infrastructure: all kernel threads and user processes need to stop
and park themselves in a known state before we allow the CPU to be removed.
lock_cpu_hotplug() becomes a no-op.
Dunno if it'll work - I only just thought of it. It sure would simplify
things.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/