Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

From: Dave Jones
Date: Fri Aug 04 2006 - 22:47:41 EST


On Fri, Aug 04, 2006 at 10:23:56PM -0400, Dave Jones wrote:

> Duh. Everything becomes clearer the moment you post a diff to lkml.

Right, with that silly thinko out of the way, things make _slightly_
more sense, but I'm still puzzled. Here's the trace (with the DWARF
noise stripped out).

CPU1 called lock_cpu_hotplug() for app cpuspeed. recursive_depth=0
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
[<c029fbae>] store_scaling_governor+0x142/0x1a3
[<c029f1a5>] store+0x37/0x48
[<c01a6561>] sysfs_write_file+0xab/0xd1
[<c016f99f>] vfs_write+0xab/0x157
[<c016ffe4>] sys_write+0x3b/0x60
[<c0103db9>] sysenter_past_esp+0x56/0x8d
cpuspeed acquired cpu_bitmask_lock

CPU1 called lock_cpu_hotplug() for app cpuspeed. recursive_depth=0
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
[<c0132f3c>] __create_workqueue+0x52/0x122
[<f901234b>] cpufreq_governor_dbs+0x9f/0x2c3 [cpufreq_ondemand]
[<c029f7b6>] __cpufreq_governor+0x57/0xd8
[<c029f985>] __cpufreq_set_policy+0x14e/0x1bc
[<c029fbc5>] store_scaling_governor+0x159/0x1a3
[<c029f1a5>] store+0x37/0x48
[<c01a6561>] sysfs_write_file+0xab/0xd1
[<c016f99f>] vfs_write+0xab/0x157
[<c016ffe4>] sys_write+0x3b/0x60
[<c0103db9>] sysenter_past_esp+0x56/0x8d
Lukewarm IQ detected in hotplug locking
BUG: warning at kernel/cpu.c:46/lock_cpu_hotplug()

CPU1 called unlock_cpu_hotplug() for app cpuspeed. recursive_depth=1
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e980>] unlock_cpu_hotplug+0x37/0xb7
[<c0132fea>] __create_workqueue+0x100/0x122
[<f901234b>] cpufreq_governor_dbs+0x9f/0x2c3 [cpufreq_ondemand]
[<c029f7b6>] __cpufreq_governor+0x57/0xd8
[<c029f985>] __cpufreq_set_policy+0x14e/0x1bc
[<c029fbc5>] store_scaling_governor+0x159/0x1a3
[<c029f1a5>] store+0x37/0x48
[<c01a6561>] sysfs_write_file+0xab/0xd1
[<c016f99f>] vfs_write+0xab/0x157
[<c016ffe4>] sys_write+0x3b/0x60
[<c0103db9>] sysenter_past_esp+0x56/0x8d

CPU1 called unlock_cpu_hotplug() for app cpuspeed. recursive_depth=0
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e980>] unlock_cpu_hotplug+0x37/0xb7
[<c029fbe5>] store_scaling_governor+0x179/0x1a3
[<c029f1a5>] store+0x37/0x48
[<c01a6561>] sysfs_write_file+0xab/0xd1
[<c016f99f>] vfs_write+0xab/0x157
[<c016ffe4>] sys_write+0x3b/0x60
[<c0103db9>] sysenter_past_esp+0x56/0x8d

So in these traces we're seeing
lock
lock
unlock
unlock

But what I really don't understand is the ordering here.
Immediately after that first trace we should see an unlock.
store_scaling_governor does a lock/unlock pair, with no chance
of returning with the hotplug lock still held.

In the second trace however, cpuspeed is off doing something
completely different.

How can this happen?

My head hurts.

Dave

--
http://www.codemonkey.org.uk
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/