* Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
On Wed, 2005-08-17 at 10:24 -0400, Steven Rostedt wrote:
OK the output from netconsole still seems like netconsole itself is
causing some problems. But I think it is also showing this lockup. I'll
recompile my kernel as UP and see if netconsole works fine.
Well, the UP kernel boots on my laptop, but netconsole gives strange warnings.
OK, what's the scoop with the illegal_API_call? What is it about, and what is the expected work around?
this is a recent change: i've started flagging "naked" use of local_irq_disable(), because it's a problem on PREEMPT_RT and it's a potential SMP bug on upstream kernels. A local_irq_disable() is converted either to raw_local_irq_disable() when justified (it's mostly only justified for lowlevel arch code), or is eliminated totally. (either by merging it into a nearby spin_lock API call, or by removing it altogether, making sure that code doesnt break).
Right now we print a warning on the first such API use, and then shut up about it. All local_irq_() APIs map to NOPs. (we keep the PF_IRQSOFF flag for compatibility, but only to get irqs_off() right which in turn shuts off a number of BUG_ON(!irqs_disabled()) warnings, and it doesnt have any other functional purpose.)
the desired end-result would be the total elimination of local_irq_*() API calls.
I'm also getting the following output on shutdown:
NET: Registered protocol family 31
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP ver 2.7
Bluetooth: L2CAP socket layer initialized
Bluetooth: RFCOMM ver 1.5
Bluetooth: RFCOMM socket layer initialized
Bluetooth: RFCOMM TTY layer initialized
BUG: nonzero lock count 1 at exit time?
nfsd: 4696 [f7183830, 115]
[<c0136922>] check_no_held_locks+0x62/0x330 (8)
[<c011df67>] do_exit+0x257/0x480 (32)
[<c013d052>] __module_put_and_exit+0x52/0x70 (40)
[<f8d54583>] nfsd+0x2b3/0x340 [nfsd] (12)
[<f8d542d0>] nfsd+0x0/0x340 [nfsd] (48)
[<c010140d>] kernel_thread_helper+0x5/0x18 (16)
---------------------------
| preempt count: 00000000 ]
| 0-level deep critical section nesting:
----------------------------------------
------------------------------
| showing all locks held by: | (nfsd/4696 [f7183830, 116]):
------------------------------
#001: [c038e184] {kernel_sem.lock}
... acquired at: lock_kernel+0x21/0x40
BUG: nfsd/4696, BKL held at task exit time!
hm, it seems nfsd forgets to do an unlock_kernel() in some exit path it seems? We are enforcing strict balanced lock use in PREEMPT_RT - the upstream kernel is more relaxed about it.
And it goes on and on. This happens everytime. Without netconsole, I
only get the nonzero lock count error. Also, one of my lockups on SMP
had to do with the kernel_thread_helper:
Using IPI Shortcut mode
khelper/794[CPU#0]: BUG in set_new_owner at kernel/rt.c:916
this is a 'must not happen'. Somehow lock->held list got non-empty. Maybe some use-after-free thing? Havent seen it myself.
Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/