Re: 2.6.13-rc6-rt6
From: Steven Rostedt
Date: Wed Aug 17 2005 - 09:05:54 EST
On Wed, 2005-08-17 at 08:47 +0200, Ingo Molnar wrote:
>
> unfortunately the space of "patches that break the kernel" is infinitely
> larger (and infinitely easier to generate) than the space of "patches
> that improve the kernel" - so i'll skip your patch for now ;-)
Yeah, I took out my "bad" fix and it still locks up. Since I got the NMI
working I was able to get output. Unfortunately, the laptop doesn't have
a serial (damn vendors getting rid of the very useful _for us_ uart).
And so far the netconsole doesn't seem to be working yet.
Here's the output (typed from screen).
Freeing unused kernel memory: 296k freed
NMI watchdog detected lockup on CPU#1 (50000/50000)
Pid: 796, comm: hotplug
EIP: 0060:[<c0331662>] CPU: 1
EIP is at _raw_spin_lock+0x72/0xa0
EFLAGS: 00000002 Not tainted (2.6.13-rc6-rt6)
EXI: 00000001 EBX: .... (I'm not about to type all the registers out).
[...]
try_to_wake_up+0x4e
__reacquire_kernel_lock+0x2d
pick_new_owner+0x6b
wake_up_process+0x30
rt_up+0x290
chrdev_open+0xfd
lock_kernel+0x28
chrdev_open+0xfd
tty_open+0x0
chrdev_open+0xfd
file_move+0x20
dentry_open+0x17a
filp_open+0x68
_spin_lock+0x23
get_unused_fd+0xa2
sys_open+0x4f
syscall_call+0x7
----
preempt count: 3
3-level deep critical section nesting:
----
rt_up+0x2c
add_preempt_count+0x1a
add_preempt_count+0x1a
----
showing all locks held by: (hotplug/796 [...])
- No locks where shown.
>
> how come it is stopping the machine during bootup? Or does the lockup
> occur during shutdown? changing the local_irq_disable() to
> raw_local_irq_disable() looks wrong because sooner or later we hit
> complete(). You are probably locking up earlier than that though,
> perhaps in stopmachine_set_state()?
This is on bootup. The stopmachine is used in sys_init_module.
>
> but stop_machine() looks quite preempt-unsafe to begin with. The
> local_irq_disable() would not be needed at all if prior the
> for_each_online_cpu() loop we'd use set_cpus_allowed. The current method
> of achieving 'no preemption' is simply racy even during normal
> CONFIG_PREEMPT.
I guess I'll need to look into this further.
-- Steve
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/