Re: sched: softlockups in multi_cpu_stop

From: Rafael David Tinoco
Date: Wed Mar 04 2015 - 00:44:35 EST


Some more info:

multi_cpu_stop seems to be spinning inside its
do { ... } while (curstate != MULTI_STOP_EXIT); loop.

multi_cpu_stop is the work offloaded to the [migration] threads by the
migrate_swap -> stop_two_cpus -> wait_for_completion() sequence, which
is used to cross-migrate 2 tasks.
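
To make the spinning easier to reason about, here is a rough userspace
model of that loop. It is only a sketch, loosely modeled on how I
understand the state machine to work (each stopper acks the current
state, and the last ack moves the shared state forward). The class and
function names (MultiStopData, simulate) are made up for illustration
and the real kernel code does more than this.

```python
from enum import IntEnum


class State(IntEnum):
    NONE = 0
    PREPARE = 1
    DISABLE_IRQ = 2
    RUN = 3
    EXIT = 4


class MultiStopData:
    """Hypothetical stand-in for the state shared between the stoppers."""

    def __init__(self, num_threads):
        self.num_threads = num_threads
        self.state = State.PREPARE
        self.thread_ack = num_threads

    def ack_state(self):
        # The last stopper to ack moves everybody to the next state.
        self.thread_ack -= 1
        if self.thread_ack == 0:
            self.thread_ack = self.num_threads
            # Capped at EXIT only to keep the enum valid in this model.
            self.state = State(min(self.state + 1, State.EXIT))


def simulate(num_threads, running, max_rounds=100):
    """Round-robin the spin loop over the stoppers listed in `running`.

    Returns the number of rounds until every running stopper has seen
    EXIT, or None if that does not happen within max_rounds.
    """
    msdata = MultiStopData(num_threads)
    curstate = {cpu: State.NONE for cpu in running}
    for rounds in range(1, max_rounds + 1):
        for cpu in running:
            if curstate[cpu] != msdata.state:
                curstate[cpu] = msdata.state
                msdata.ack_state()
        if all(s == State.EXIT for s in curstate.values()):
            return rounds
    return None


if __name__ == "__main__":
    print("both stoppers running:", simulate(2, [0, 1]))
    print("only one stopper running:", simulate(2, [0]))
```

In this model, when only one of the two expected stoppers ever enters
the loop, the shared state never advances past the first step and the
stopper that did show up keeps spinning, which is at least consistent
with what the dump shows.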

Based on the task structs found on the callers' stacks:

PID 14990 CPU 05 -> PID 14996 CPU 00
PID 14991 CPU 30 -> PID 14998 CPU 01 (30 -> 1, different curr, same )
PID 14992 CPU 30 -> PID 14998 CPU 01 (30 -> 1, different curr)
PID 14996 CPU 00 -> PID 14992 CPU 30
PID 14998 CPU 01 -> PID 14990 CPU 05
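
The pairs above are easier to read when followed as a chain. A small
sketch that does that (find_cycle is a helper name I made up, and the
tuples are just the table above retyped). Whether the resulting loop is
what causes the softlockup is not something the dump proves; this only
makes the pairing visible.

```python
# (src_pid, src_cpu, dst_pid, dst_cpu), copied from the table above.
swaps = [
    (14990, 5, 14996, 0),
    (14991, 30, 14998, 1),
    (14992, 30, 14998, 1),
    (14996, 0, 14992, 30),
    (14998, 1, 14990, 5),
]


def find_cycle(edges, start):
    """Follow start -> edges[start] -> ... and return the loop, if any.

    Returns the list of nodes forming the loop (first node repeated at
    the end), or None if the chain ends without revisiting a node.
    """
    path = [start]
    seen = {start}
    node = start
    while node in edges:
        node = edges[node]
        if node in seen:
            return path[path.index(node):] + [node]
        seen.add(node)
        path.append(node)
    return None


pid_edges = {src: dst for src, _, dst, _ in swaps}
cpu_edges = {src_cpu: dst_cpu for _, src_cpu, _, dst_cpu in swaps}

if __name__ == "__main__":
    print("PID chain:", find_cycle(pid_edges, 14990))
    print("CPU chain:", find_cycle(cpu_edges, 5))
```

Starting from PID 14990 the chain is 14990 -> 14996 -> 14992 -> 14998
-> 14990, and in terms of CPUs 5 -> 0 -> 30 -> 1 -> 5. PID 14991 is not
part of the loop but points into it (at 14998).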

RUNNING migration threads (cpu_stopper_thread -> multi_cpu_stop):

PID 118 = LAST CPU 09, CPU 09
PID 102 = LAST CPU 06, CPU 06
PID 143 = LAST CPU 14, CPU 14
PID 148 = LAST CPU 15, CPU 15
PID 153 = LAST CPU 16, CPU 16

### backtraces and task structs from stack

PID: 14990 TASK: ffff883e59e717f0 CPU: 5 COMMAND: "beam.smp"
#5 [ffff883e607edbc8] migrate_swap at ffffffff810987fa
ffff883e607edbd0: ffff883e59e717f0 ffff883e59e82fe0
ffff883e607edbe0: 000000100000000f ffff883e607edcc0
ffff883e607edbf0: ffffffff810a0827

ffff883e59e717f0 = task_struct -> pid 14990 (last cpu = 5)
ffff883e59e82fe0 = task_struct -> pid 14996 (last cpu = 0)

PID: 14991 TASK: ffff883e59e72fe0 CPU: 30 COMMAND: "beam.smp"
#5 [ffff883e59e0bbc8] migrate_swap at ffffffff810987fa
ffff883e59e0bbd0: ffff883e59e72fe0 ffff883e59e85fc0
ffff883e59e0bbe0: 000000060000000e ffff883e59e0bcc0
ffff883e59e0bbf0: ffffffff810a0827

ffff883e59e72fe0 = task_struct -> pid 14991 (last cpu = 30)
ffff883e59e85fc0 = task_struct -> pid 14998 (last cpu = 1)

PID: 14992 TASK: ffff883e59e747d0 CPU: 30 COMMAND: "beam.smp"
#5 [ffff883e59cadbc8] migrate_swap at ffffffff810987fa
ffff883e59cadbd0: ffff883e59e747d0 ffff883e59e85fc0
ffff883e59cadbe0: 0000000600000009 ffff883e59cadcc0
ffff883e59cadbf0: ffffffff810a0827

ffff883e59e747d0 = task_struct -> pid 14992 (last cpu = 30)
ffff883e59e85fc0 = task_struct -> pid 14998 (last cpu = 1)

PID: 14996 TASK: ffff883e59e82fe0 CPU: 0 COMMAND: "beam.smp"
#5 [ffff883f55d01bc8] migrate_swap at ffffffff810987fa
ffff883f55d01bd0: ffff883e59e82fe0 ffff883e59e747d0
ffff883f55d01be0: 0000000900000010 ffff883f55d01cc0
ffff883f55d01bf0: ffffffff810a0827

ffff883e59e82fe0 = task_struct -> pid 14996 (last cpu = 0)
ffff883e59e747d0 = task_struct -> pid 14992 (last cpu = 30)

PID: 14998 TASK: ffff883e59e85fc0 CPU: 1 COMMAND: "beam.smp"
#5 [ffff883e59e05bc8] migrate_swap at ffffffff810987fa
ffff883e59e05bd0: ffff883e59e85fc0 ffff883e59e717f0
ffff883e59e05be0: 0000000f00000006 ffff883e59e05cc0
ffff883e59e05bf0: ffffffff810a0827

ffff883e59e85fc0 = task_struct -> pid 14998 (last cpu = 1)
ffff883e59e717f0 = task_struct -> pid 14990 (last cpu = 5)
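
One more observation on the third word of each frame (the one right
after the two task_struct pointers). I have not confirmed what it
holds. The sketch below simply assumes it is two 32-bit values packed
into one 64-bit word and splits it; split_word is a made-up helper and
the assumption may well be wrong.

```python
# Third stack word of each migrate_swap frame, copied from the dumps above.
words = {
    14990: 0x000000100000000F,
    14991: 0x000000060000000E,
    14992: 0x0000000600000009,
    14996: 0x0000000900000010,
    14998: 0x0000000F00000006,
}

# CPUs of the RUNNING migration threads listed earlier in this mail.
running_stoppers = {9, 6, 14, 15, 16}


def split_word(word):
    """Split a 64-bit word into its (low, high) 32-bit halves."""
    return word & 0xFFFFFFFF, word >> 32


decoded = {pid: split_word(w) for pid, w in words.items()}
seen_values = {value for pair in decoded.values() for value in pair}

if __name__ == "__main__":
    for pid, (low, high) in sorted(decoded.items()):
        print(f"PID {pid}: low={low:2d} high={high:2d}")
    print("same set as running migration thread CPUs:",
          seen_values == running_stoppers)
```

Under that assumption the decoded values are exactly 6, 9, 14, 15 and
16, the same set as the CPUs of the running migration threads. That
could be a coincidence, so take it as a hint on where to look and not
as a conclusion.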