Re: [PATCH] Avoid moving tasks when a schedule can be made.

From: Nick Piggin
Date: Mon Feb 06 2006 - 06:20:13 EST


Ingo Molnar wrote:
> * Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
>
>>>> But there are still places where interrupts can be held off for indefinite periods. I don't see why the scheduler lock is suddenly important [...]
>>>
>>> the scheduler lock is obviously important because it's the most central lock in existence.
>>
>> Now I call that argument much more illogical than any of mine. How can such a fine grained, essentially per-cpu lock be "central", let alone "most central"? [...]
>
> i meant central in the sense that it's the most frequently taken lock, in the majority of workloads. Here's the output from the lock validator, sorted by number of ops per lock:
>
> -> (dcache_lock){--} 124233 {
> -> (&rt_hash_locks[i]){-+} 131085 {
> -> (&dentry->d_lock){--} 312431 {
> -> (cpa_lock){++} 507385 {
> -> (__pte_lockptr(new)){--} 660193 {
> -> (kernel/synchro-test.c:&mutex){--} 825023 {
> -> (&rwsem){--} 930501 {
> -> (&rq->lock){++} 2029146 {
>
> the runqueue lock is also central in the sense that it is the most spread-out lock in the locking dependencies graph. Toplist of locks, by number of backwards dependencies:
>
> 15 -> &cwq->lock
> 15 -> nl_table_wait
> 15 -> &zone->lock
> 17 -> &base->t_base.lock
> 32 -> modlist_lock
> 38 -> &cachep->spinlock
> 46 -> &parent->list_lock
> 47 -> &rq->lock
>
> (obviously, as no other lock must nest inside the runqueue lock.)
>
> so the quality of code (including asymptotic behavior) that runs under the runqueue lock is of central importance. I didnt think i'd ever have to explain this to you, but it is my pleasure to do so ;-) Maybe you thought of something else under "central"?


That's all very interesting, but is purely based on some statistical
analysis of some workload of your choice, and ignores other possible
measures like contention or average/max hold times.

One could concoct a workload for which some other lock is the "*most*
central", strangely enough.
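
To make that concrete, here is a minimal sketch of how the "most central" lock can change with the metric used to rank it. Everything in it is an assumption for illustration: the lock names (lock_a, lock_b, lock_c), the LockStats fields and the numbers are made up, and none of it comes from the lock validator output quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LockStats:
    """Hypothetical per-lock counters; not a real kernel interface."""
    name: str
    acquisitions: int   # total number of times the lock was taken
    contended: int      # acquisitions that had to wait for the lock
    max_hold_us: float  # longest observed hold time, in microseconds


# Made-up numbers, chosen only to show that rankings can disagree.
SAMPLES = [
    LockStats("lock_a", 2_000_000, 1_000, 5.0),
    LockStats("lock_b", 900_000, 90_000, 40.0),
    LockStats("lock_c", 100_000, 500, 3_000.0),
]


def most_central(stats, key):
    """Return the name of the lock that ranks highest under `key`."""
    return max(stats, key=key).name


# Rank the same locks by three different measures.
by_ops = most_central(SAMPLES, lambda s: s.acquisitions)
by_contention = most_central(SAMPLES, lambda s: s.contended / s.acquisitions)
by_max_hold = most_central(SAMPLES, lambda s: s.max_hold_us)

print("most ops:       ", by_ops)
print("most contended: ", by_contention)
print("longest hold:   ", by_max_hold)
```

With these invented figures each measure picks a different lock, so a ranking by operation count alone says nothing about contention or worst-case hold time.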

Anyway, I'm not trying to say that the runqueue locks are not important,
nor that large latencies should not be avoided in the mainline kernel.
But we've known about this theoretical latency for years, ditto for
rwsem latency (which can also be a central lock in some workloads). I
was simply surprised that you are only now jumping into action, when
an equally theoretical workload shows you what you already know.

Enough talk from me. I'm looking at a couple of ideas which might help
to improve this too.

--
SUSE Labs, Novell Inc.