[RFC 0/5] lock_cpu_hotplug: Redesign - A "lightweight" scalable version.

From: Gautham R Shenoy
Date: Thu Oct 26 2006 - 06:49:03 EST


Hello Everyone,

My previous attempt to redesign cpu_hotplug locking raised several issues:
- Setting right cpufreq's cpu_hotplug_locking.
- Using per-subsystem locks instead of a global lock.
- Scalability of a global lock.
- Unfairness proving a potential drawback.

This patch set attempts to address these issues.

Setting right cpufreq's cpu_hotplug_locking
-------------------------------------------
lock_cpu_hotplug is required in the cpufreq subsystem to prevent a hotplug
event from occurring while we are changing frequencies on a group of cpus.
The *ideal* place to have (un)lock_cpu_hotplug is around cpufreq_driver->target
inside __cpufreq_driver_target.

But unfortunately, __cpufreq_driver_target is called from so many different
places,
(http://gautham.shenoy.googlepages.com/cpufreq_driver_target_callers.txt)
including the hot-cpu-callback path, that it becomes impossible to
have lock_cpu_hotplug inside __cpufreq_driver_target.

Which is why all the lock_cpu_hotplug calls had to be moved up the subsystem
to ensure that any code-path which might trigger cpu-frequency change would
take lock_cpu_hotplug.
The patch http://lkml.org/lkml/2006/7/26/140 by Arjan accomplishes this.

However, the problem of lock_cpu_hotplug being called from a hot-cpu-callback
path still remains unsolved. This patch-set attempts to free the callback path
of any calls to lock_cpu_hotplug.
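
To illustrate why the callback path is the sticking point, here is a minimal
userspace sketch. It assumes a plain, non-recursive lock that is held while
hotplug callbacks run. hotplug_trylock, driver_target_locked and
cpu_hotplug_event are hypothetical names, not kernel functions, and a
try-lock is used so that the conflict is reported instead of hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the global hotplug lock (non-recursive). */
static atomic_flag hotplug_lock = ATOMIC_FLAG_INIT;

bool hotplug_trylock(void)
{
	/* test_and_set returns the previous value: true means already held. */
	return !atomic_flag_test_and_set(&hotplug_lock);
}

void hotplug_unlock(void)
{
	atomic_flag_clear(&hotplug_lock);
}

/* Models a frequency change with the lock at the "ideal" place. */
int driver_target_locked(void)
{
	if (!hotplug_trylock())
		return -1;	/* a blocking lock would deadlock here */
	/* ... the driver's ->target() call would go here ... */
	hotplug_unlock();
	return 0;
}

/* Models a hotplug operation that holds the lock across its callback. */
int cpu_hotplug_event(int (*callback)(void))
{
	int ret;

	if (!hotplug_trylock())
		return -1;
	ret = callback();
	hotplug_unlock();
	return ret;
}
```

Called directly, driver_target_locked() succeeds. Called through
cpu_hotplug_event(), it fails at the point where a blocking lock would
wait on itself.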

Per-subsystem hotplug locks.
----------------------------
It *is* possible to go for per-subsystem locks, as done in the case of workqueue.c
using workqueue_mutex, and register the hot-cpu notifier with an appropriate
priority.
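
As a rough userspace illustration of this approach, the sketch below keeps a
notifier chain sorted by priority, so the order in which subsystems take
their own locks is fixed by the priorities they register with. Everything
here is hypothetical (the event names, sched_cb, slab_cb, the *_held flags
standing in for mutexes). It is not the kernel's notifier API, only a model
of the idea.

```c
#include <assert.h>
#include <stddef.h>

enum { EV_PREPARE, EV_DONE };	/* hypothetical event names */

struct notifier {
	int (*call)(unsigned long event);
	int priority;
	struct notifier *next;
};

struct notifier *cpu_chain;	/* head of the priority-sorted chain */
int sched_held, slab_held;	/* model the per-subsystem mutexes */
int order_log[8], order_len;	/* records callback order for inspection */

static void log_call(int id)
{
	if (order_len < 8)
		order_log[order_len++] = id;
}

int sched_cb(unsigned long ev)
{
	sched_held = (ev == EV_PREPARE);	/* take on PREPARE, drop on DONE */
	log_call(1);
	return 0;
}

int slab_cb(unsigned long ev)
{
	slab_held = (ev == EV_PREPARE);
	log_call(2);
	return 0;
}

struct notifier sched_nb = { sched_cb, 10, NULL };
struct notifier slab_nb = { slab_cb, 0, NULL };

/* Insert so that higher priorities run first. */
void chain_register(struct notifier *n)
{
	struct notifier **p = &cpu_chain;

	while (*p && (*p)->priority >= n->priority)
		p = &(*p)->next;
	n->next = *p;
	*p = n;
}

/* Run every callback in chain order; returns how many were called. */
int chain_call(unsigned long event)
{
	int calls = 0;

	for (struct notifier *n = cpu_chain; n; n = n->next) {
		n->call(event);
		calls++;
	}
	return calls;
}
```

Note that each modelled lock stays held from one event to the next, that is,
across two separate walks of the chain.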

I have tried this approach for kernel/sched.c and mm/slab.c.
This seemed to work fine, except for the false positives raised by
lockdep, which were caused by holding these mutexes between
cpu_chain_mutex lock/unlock.

My big concern about this is the possible interdependency between subsystems
needed to get the locking order correct, which can be an ugly problem to
solve at compile time. Although I haven't been able to find a concrete
example of such a dependency (except of course the famous
cpufreq_ondemand-workqueue one), implicitly relying on the fact that such a
possibility will not arise is, IMHO, "not the right thing".

Which is why I feel lock_cpu_hotplug deserves one more chance (though
it seriously needs a better name to suit its refcount implementation).

Scalability of the global lock.
-------------------------------
The proposed locking scheme is *extremely* lightweight in the reader fast path.
This is achieved by using a per-cpu refcount which is bumped up or down
on read_lock or read_unlock respectively.

In the reader slow path, which is extremely rare, the lock becomes an unfair
rw-sem, i.e. writers assume control *only* when there are no readers in the
system.

The writer, on arrival, just sets the writer_flag and does a
synchronize_sched() to allow all the pending readers to finish and to prompt
new readers to take the slow path. Paul McKenney suggested this approach.

If there are readers in the system, the writer will sleep and will be
woken up by the last *reader*.
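
The scheme can be modelled in userspace roughly as follows. This is only a
sketch under stated assumptions, not the code in patch 4/5: sequentially
consistent C11 atomics stand in for synchronize_sched(), NR_SLOTS stands in
for the number of cpus, and all function names are hypothetical. Try-lock
variants are used so the example never blocks. In the proposed design the
writer would instead sleep until the last reader wakes it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_SLOTS 4	/* hypothetical stand-in for the number of cpus */

static atomic_int refcount[NR_SLOTS];	/* per-"cpu" reader counts */
static atomic_bool writer_flag;		/* set while a writer is present */

/* Reader fast path: bump our own slot, then check for a writer. */
bool hotplug_read_trylock(int slot)
{
	atomic_fetch_add(&refcount[slot], 1);
	if (!atomic_load(&writer_flag))
		return true;
	/* Writer present: back out; a real reader would take the slow path. */
	atomic_fetch_sub(&refcount[slot], 1);
	return false;
}

/* Must be called with the same slot that was used to lock. */
void hotplug_read_unlock(int slot)
{
	int old = atomic_fetch_sub(&refcount[slot], 1);

	assert(old > 0);
}

/* Sum of all per-slot counts, as the writer would compute it. */
int readers_active(void)
{
	int sum = 0;

	for (int i = 0; i < NR_SLOTS; i++)
		sum += atomic_load(&refcount[i]);
	return sum;
}

/* Writer: announce itself, then succeed only if no readers remain. */
bool hotplug_write_trylock(void)
{
	bool expected = false;

	if (!atomic_compare_exchange_strong(&writer_flag, &expected, true))
		return false;		/* another writer is present */
	if (readers_active() == 0)
		return true;
	/* Readers still present: this sketch gives up instead of sleeping. */
	atomic_store(&writer_flag, false);
	return false;
}

void hotplug_write_unlock(void)
{
	atomic_store(&writer_flag, false);
}
```

The reader touches only its own counter plus one read of the flag in the
common case, which is what keeps the fast path cheap. Because the reader
increments before checking the flag and the writer sets the flag before
summing the counters, at least one side notices the other.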

Unfairness proving to be a problem.
-----------------------------------
I could not find any system calls that would allow the users to exploit
unfairness. Hoping that I haven't missed out anything important, I have
decided to retain the unfair nature of lock_cpu_hotplug.

The patch set is as follows.
[patch 1/5]: cpufreq subsystem cleanup to fix coding style issues and other
trivial issues. Hope this improves the readability of this code.

[patch 2/5]: Eliminate lock_cpu_hotplug in cpufreq's hotcpu callback path.

[patch 3/5]: Use (un)lock_cpu_hotplug instead of workqueue_mutex.
This fixes the lockdep warnings, for holding workqueue_mutex
between cpu_chain lock/unlock!

[patch 4/5]: Proposed new design for cpu_hotplug lock.

[patch 5/5]: Add lockdep support to the proposed cpu_hotplug lock.

These patches are against linux-2.6.19-rc2-mm2.

Looking forward to your feedback.

Thanks and Regards
gautham.
--
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"