Re: [RFC][PATCH 0/4] Redesign cpu_hotplug locking.

From: Dipankar Sarma
Date: Sun Aug 27 2006 - 02:09:17 EST


On Sat, Aug 26, 2006 at 03:04:22PM -0700, Andrew Morton wrote:
> On Sat, 26 Aug 2006 14:09:55 -0700 (PDT)
> Linus Torvalds <torvalds@xxxxxxxx> wrote:
>
> > I definitely want to have this fixed, and Gautham's patches look like a
> > good thing to me, but the "trying to fix up the current mess" part was
> > really about trying to get 2.6.18 in a mostly working state rather than
> > anything else. I think it's been too late to try to actually _fix_ it for
> > 2.6.18 for a long time already.
> >
> > So my reaction is that this redesign should go in asap after 2.6.18,
> > unless people feel strongly that the current locking has so many bugs that
> > people can actually _hit_ in practice that it's better to go for the
> > redesign early.
>
> It certainly needs a redesign. A new sort of lock which makes it appear to
> work won't fix races like:
>
> int cpufreq_update_policy(unsigned int cpu)
> {
> struct cpufreq_policy *data = cpufreq_cpu_get(cpu);
>
> ...
>
> lock_cpu_hotplug();
>

The problem with cpufreq was that it used lock_cpu_hotplug()
in some common routines because it
was needed in some paths and then also called the same routines
from the CPU hotplug callback path. That is easily fixable and
Gautham's patch 1/4 does exactly that.
One thing I have privately suggested to Gautham is to do an audit
of bad lock_cpu_hotplug() uses.

Now coming to the read-side of lock_cpu_hotplug() - cpu hotplug
is a very special asynchronous event. You cannot protect against
it using your own subsystem lock because you don't control
access to cpu_online_map. With multiple low-level subsystems
needing it, it also becomes difficult to work out the lock
hierarchies. The right way to do this is what Gautham and Ingo
are discussing - a scalable rw semaphore type lock that allows
recursive readers.

>
> I rather doubt that anyone will be hitting the races in practice. I'd
> recommend simply removing all the lock_cpu_hotplug() calls for 2.6.18.

I don't think that is a good idea. The right thing to do would be to
do an audit and clean up the bad lock_cpu_hotplug() calls. People
seem to have just got lazy with lock_cpu_hotplug().

Thanks
Dipankar
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/