Re: Which came first, hard kernel lockup or SATA errors?

From: Ed Swierk
Date: Tue Oct 10 2017 - 10:04:39 EST


Continuing the conversation with the voices in my head...

On Mon, Oct 9, 2017 at 10:45 PM, Ed Swierk <eswierk@xxxxxxxxxxxxxxxxxx> wrote:
> Based on the addresses in the stack and registers, here's what I think
> happened.
>
> On cpu 13:
>
> - task_numa_fault() calls task_numa_migrate(), which selects the task
> on cpu 0 as the dst_task.
> - migrate_swap() calls stop_two_cpus(), which acquires the cpu_stopper
> locks for the dst_cpu (cpu 0, at 0xffff881033fce600) and src_cpu
> (cpu X, at 0xffff8820341ce600).
> - stop_two_cpus() calls wake_up_process() on the lower-numbered cpu
> first, which has to be cpu 0.
> - wake_up_process() spins until the cpu 0 task (at 0xffff88102cc8dc00)
> is no longer on_cpu.
>
> On cpu 0:
>
> - pick_next_task_fair() calls idle_balance(). According to the "This
> is OK" comment, current is on_cpu at this point.
> - idle_balance() calls load_balance() for dst_cpu 0.
> - load_balance() decides to move a task from cpu X, so calls
> stop_one_cpu_nowait() on cpu X.
> - stop_one_cpu_nowait() spins trying to acquire the cpu_stopper lock
> for cpu X (at 0xffff8820341ce600).
>
> So idle_balance() on cpu 0 is stuck waiting for task_numa_fault() to
> move a task to cpu 0, which is blocked on idle_balance() completing.

Also, it appears that task_numa_fault() tries to migrate current, so
the src_cpu X used by task_numa_migrate() is cpu 13 in this case. The
key issue, though, is that task_numa_migrate() and idle_balance() are
both trying to stop the same cpu, regardless of whether that is the
cpu task_numa_migrate() is running on.
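
To make the circular wait explicit, here is a small user-space model of
it. This is only a sketch, not kernel code: the enum labels and
wait_chain_has_cycle() are invented for illustration, and the
"who waits for whom" table just encodes my reading of the trace above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the two code paths in the trace; not kernel identifiers. */
enum waiter {
    NUMA_MIGRATE_CPU13 = 0, /* holds the cpu X stopper lock, waits for the cpu 0 task to go off-cpu */
    IDLE_BALANCE_CPU0  = 1, /* still on_cpu, spins on the cpu X stopper lock */
    NR_WAITERS         = 2,
    NOT_WAITING        = -1
};

/*
 * waits_for[i] is the index of the waiter that i is blocked on, or
 * NOT_WAITING. Every non-negative entry is assumed to be a valid index
 * below n. Returns true if following the chain from start leads back
 * to start, i.e. start is part of a circular wait.
 */
bool wait_chain_has_cycle(const int *waits_for, int n, int start)
{
    assert(start >= 0 && start < n);

    int cur = waits_for[start];

    /* A cycle through start has at most n members, so n hops suffice. */
    for (int hops = 0; hops < n && cur != NOT_WAITING; hops++) {
        if (cur == start)
            return true;
        cur = waits_for[cur];
    }
    return false;
}
```

With the table { IDLE_BALANCE_CPU0, NUMA_MIGRATE_CPU13 } (each side
blocked on the other) the function reports a cycle from either starting
point. If either entry is NOT_WAITING, it does not.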

So I'm wondering how this situation could be prevented.

Can task_numa_migrate() avoid picking a dst_task that might itself
try to stop either src_cpu or dst_cpu?

Or can load_balance() avoid a cpu that might be stopped for migration
(or for any other reason), or detect such a conflict and bail out
rather than spinning forever?
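
As a rough picture of the "bail out rather than spinning forever"
option, here is a bounded try-lock written with C11 atomics. It is a
user-space sketch under my own assumptions: struct model_lock and the
model_*() functions are made up for illustration and do not correspond
to the cpu_stopper lock or to any existing kernel API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-cpu stopper lock. */
struct model_lock {
    atomic_flag held;
};

/* Try exactly once to take the lock; never spins. */
bool model_trylock(struct model_lock *l)
{
    assert(l != NULL);
    /* test_and_set returns the previous value: false means we got the lock. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

void model_unlock(struct model_lock *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Retry at most max_tries times, then give up so the caller can back
 * off (for example, skip this balancing attempt) instead of spinning
 * indefinitely on a lock whose holder may be waiting on the caller.
 */
bool model_lock_bounded(struct model_lock *l, unsigned int max_tries)
{
    for (unsigned int i = 0; i < max_tries; i++) {
        if (model_trylock(l))
            return true;
    }
    return false;
}
```

A caller that gets false back from model_lock_bounded() would drop the
attempt and return, which breaks the circular wait at the cost of
occasionally skipping a balancing opportunity. Whether that trade-off is
acceptable in load_balance() is exactly the open question.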

--Ed