Re: Inquiry: Should we remove the "isolcpus=" kernel boot option? (may have realtime uses)

From: Dimitri Sivanich
Date: Tue Jun 03 2008 - 10:40:26 EST


On Mon, Jun 02, 2008 at 02:59:34PM -0700, Max Krasnyansky wrote:
> Yes it used to be somewhat unstable. These days it's solid. I'm using it on a
> wide range of systems: uTCA Core2Duo, NUMA dual-Opteron, 8way Core2, etc. And
> things work as expected.

Max,

I tried the following scenario on an ia64 Altix running 2.6.26-rc4, with cpusets compiled in but the cpuset filesystem unmounted. Do your patches already address this?

$ taskset -cp 3 $$ (attach to cpu 3)
pid 4591's current affinity list: 0-3
pid 4591's new affinity list: 3
$ echo 0 > /sys/devices/system/cpu/cpu2/online (down cpu 2)
(above command hangs)

Backtrace of pid 4591 (bash)

Call Trace:
[<a00000010078e990>] schedule+0x1210/0x13c0
sp=e0000060b6dffc90 bsp=e0000060b6df11e0
[<a00000010078ef60>] schedule_timeout+0x40/0x180
sp=e0000060b6dffce0 bsp=e0000060b6df11b0
[<a00000010078d3e0>] wait_for_common+0x240/0x3c0
sp=e0000060b6dffd10 bsp=e0000060b6df1180
[<a00000010078d760>] wait_for_completion+0x40/0x60
sp=e0000060b6dffd40 bsp=e0000060b6df1160
[<a000000100114ee0>] __stop_machine_run+0x120/0x160
sp=e0000060b6dffd40 bsp=e0000060b6df1120
[<a000000100765ae0>] _cpu_down+0x2a0/0x600
sp=e0000060b6dffd80 bsp=e0000060b6df10c8
[<a000000100765ea0>] cpu_down+0x60/0xa0
sp=e0000060b6dffe20 bsp=e0000060b6df10a0
[<a000000100768090>] store_online+0x50/0xe0
sp=e0000060b6dffe20 bsp=e0000060b6df1070
[<a0000001004f8800>] sysdev_store+0x60/0xa0
sp=e0000060b6dffe20 bsp=e0000060b6df1038
[<a00000010022e370>] sysfs_write_file+0x250/0x300
sp=e0000060b6dffe20 bsp=e0000060b6df0fe0
[<a00000010018a750>] vfs_write+0x1b0/0x300
sp=e0000060b6dffe20 bsp=e0000060b6df0f90
[<a00000010018b350>] sys_write+0x70/0xe0
sp=e0000060b6dffe20 bsp=e0000060b6df0f18
[<a00000010000af80>] ia64_ret_from_syscall+0x0/0x20
sp=e0000060b6dffe30 bsp=e0000060b6df0f18
[<a000000000010720>] ia64_ivt+0xffffffff00010720/0x400
sp=e0000060b6e00000 bsp=e0000060b6df0f18
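
In case it's useful, here is the first scenario as a standalone script (run as root; just the same commands as above, so the cpu numbers assume the 4-cpu session shown and should be adjusted for other topologies):

#!/bin/sh
# Pin this shell to cpu 3, then try to take cpu 2 offline.
# On my system the echo below never returns.
taskset -cp 3 $$
echo 0 > /sys/devices/system/cpu/cpu2/online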


I also hit a hang with the following sequence:
- taskset -cp 3 $$ (attach to cpu 3)
- echo 0 > /sys/devices/system/cpu/cpu2/online (down cpu 2, successful this time)
- echo 1 > /sys/devices/system/cpu/cpu2/online (up cpu 2, successful)
- taskset -p $$ (read cpumask, this command hangs)

The backtrace in this case was:

Backtrace of pid 4653 (bash)

Call Trace:
[<a00000010078e990>] schedule+0x1210/0x13c0
sp=e0000060b78afab0 bsp=e0000060b78a1320
[<a00000010078ef60>] schedule_timeout+0x40/0x180
sp=e0000060b78afb00 bsp=e0000060b78a12f0
[<a00000010078d3e0>] wait_for_common+0x240/0x3c0
sp=e0000060b78afb30 bsp=e0000060b78a12c0
[<a00000010078d760>] wait_for_completion+0x40/0x60
sp=e0000060b78afb60 bsp=e0000060b78a12a0
[<a0000001000a63b0>] set_cpus_allowed_ptr+0x210/0x2a0
sp=e0000060b78afb60 bsp=e0000060b78a1270
[<a000000100786930>] cache_add_dev+0x970/0xbc0
sp=e0000060b78afbb0 bsp=e0000060b78a11d0
[<a000000100786c20>] cache_cpu_callback+0xa0/0x1e0
sp=e0000060b78afe10 bsp=e0000060b78a1190
[<a0000001000e96b0>] notifier_call_chain+0x50/0xe0
sp=e0000060b78afe10 bsp=e0000060b78a1148
[<a0000001000e9900>] __raw_notifier_call_chain+0x40/0x60
sp=e0000060b78afe10 bsp=e0000060b78a1108
[<a0000001000e9960>] raw_notifier_call_chain+0x40/0x60
sp=e0000060b78afe10 bsp=e0000060b78a10d8
[<a00000010078ba70>] cpu_up+0x2d0/0x380
sp=e0000060b78afe10 bsp=e0000060b78a10a0
[<a0000001007680b0>] store_online+0x70/0xe0
sp=e0000060b78afe20 bsp=e0000060b78a1070
[<a0000001004f8800>] sysdev_store+0x60/0xa0
sp=e0000060b78afe20 bsp=e0000060b78a1038
[<a00000010022e370>] sysfs_write_file+0x250/0x300
sp=e0000060b78afe20 bsp=e0000060b78a0fe0
[<a00000010018a750>] vfs_write+0x1b0/0x300
sp=e0000060b78afe20 bsp=e0000060b78a0f90
[<a00000010018b350>] sys_write+0x70/0xe0
sp=e0000060b78afe20 bsp=e0000060b78a0f18
[<a00000010000af80>] ia64_ret_from_syscall+0x0/0x20
sp=e0000060b78afe30 bsp=e0000060b78a0f18
[<a000000000010720>] ia64_ivt+0xffffffff00010720/0x400
sp=e0000060b78b0000 bsp=e0000060b78a0f18
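
Note that both hangs end up in wait_for_completion(): the first via __stop_machine_run() called from _cpu_down(), the second via set_cpus_allowed_ptr() called from the cache sysfs code in the cpu-up notifier chain. For completeness, here is the second scenario as a script (again run as root, same cpu numbering assumptions as before):

#!/bin/sh
# Pin this shell to cpu 3, cycle cpu 2 down and back up (both
# succeed), then read the affinity mask -- the last taskset hangs.
taskset -cp 3 $$
echo 0 > /sys/devices/system/cpu/cpu2/online
echo 1 > /sys/devices/system/cpu/cpu2/online
taskset -p $$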

