Re: native_smp_send_reschedule() splat from rt_mutex_lock()?

From: Sebastian Andrzej Siewior
Date: Wed Sep 20 2017 - 12:24:59 EST


On 2017-09-18 09:51:10 [-0700], Paul E. McKenney wrote:
> Hello!
Hi,

> [11072.586518] sched: Unexpected reschedule of offline CPU#6!
> [11072.587578] ------------[ cut here ]------------
> [11072.588563] WARNING: CPU: 0 PID: 59 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x37/0x40
> [11072.591543] Modules linked in:
> [11072.591543] CPU: 0 PID: 59 Comm: rcub/10 Not tainted 4.14.0-rc1+ #1
> [11072.610596] Call Trace:
> [11072.611531] resched_curr+0x61/0xd0
> [11072.611531] switched_to_rt+0x8f/0xa0
> [11072.612647] rt_mutex_setprio+0x25c/0x410
> [11072.613591] task_blocks_on_rt_mutex+0x1b3/0x1f0
> [11072.614601] rt_mutex_slowlock+0xa9/0x1e0
> [11072.615567] rt_mutex_lock+0x29/0x30
> [11072.615567] rcu_boost_kthread+0x127/0x3c0

> In theory, I could work around this by excluding CPU-hotplug operations
> while doing RCU priority boosting, but in practice I am very much hoping
> that there is a more reasonable solution out there...

So in CPUHP_TEARDOWN_CPU / take_cpu_down() / __cpu_disable() the CPU is
marked as offline and interrupt handling is disabled. Later, in
CPUHP_AP_SCHED_STARTING / sched_cpu_dying(), all tasks are migrated away.

Did this hit a random task during a CPU-hotplug operation, one which had
not yet been migrated away from the dying CPU? In theory a futex_unlock()
of an RT task could also produce such a backtrace.
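
To make the suspected window concrete, here is a tiny user-space model of
the ordering described above. It is only a sketch and not kernel code: every
model_* name is made up for illustration, and the boost path is reduced to
"send a reschedule to the CPU the task still sits on". It assumes that the
warning fires whenever a reschedule targets a CPU already marked offline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_NR_CPUS 8

/* Toy state: which CPUs are online and how many tasks still sit on each. */
static bool model_cpu_online[MODEL_NR_CPUS];
static int model_nr_tasks[MODEL_NR_CPUS];
static int model_offline_resched_warnings;

static void model_init(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		model_cpu_online[cpu] = true;
		model_nr_tasks[cpu] = 0;
	}
	model_offline_resched_warnings = 0;
}

/* Step 1 (stands in for __cpu_disable()): mark offline, tasks stay put. */
static void model_cpu_disable(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_cpu_online[cpu] = false;
}

/* Step 2 (stands in for sched_cpu_dying()): move remaining tasks away. */
static void model_sched_cpu_dying(int cpu, int dest)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	assert(dest >= 0 && dest < MODEL_NR_CPUS);
	model_nr_tasks[dest] += model_nr_tasks[cpu];
	model_nr_tasks[cpu] = 0;
}

/* Stands in for the check behind the splat: reschedule of an offline CPU. */
static void model_send_reschedule(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	if (!model_cpu_online[cpu]) {
		printf("sched: Unexpected reschedule of offline CPU#%d!\n", cpu);
		model_offline_resched_warnings++;
	}
}

/* Stands in for a priority boost of a task that is still queued on cpu. */
static void model_boost_task_on(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	if (model_nr_tasks[cpu] > 0)
		model_send_reschedule(cpu);
}
```

In this model, calling model_boost_task_on(6) after model_cpu_disable(6) but
before model_sched_cpu_dying(6, 0) hits the warning path. Once the tasks
have been moved, the same call does nothing.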

> Thanx, Paul
>

Sebastian