[PATCH] sched/core: fix illegal RCU from offline CPUs

From: Qian Cai
Date: Sun Jan 12 2020 - 11:19:17 EST


During CPU offlining, idle_task_exit() calls mmdrop() after idle entry
and after the subsequent call to cpuhp_report_idle_dead(). Once
execution passes the call to rcu_report_dead(), RCU ignores the CPU, so
lockdep complains when mmdrop() ends up using RCU through either memcg
or debugobjects. Fix it by running mmdrop() on another online CPU.

=============================
WARNING: suspicious RCU usage
-----------------------------
kernel/workqueue.c:710 RCU or wq_pool_mutex should be held!

other info that might help us debug this:

RCU used illegally from offline CPU!
rcu_scheduler_active = 2, debug_locks = 1
2 locks held by swapper/37/0:
#0: c0000000010af608 (rcu_read_lock){....}, at: percpu_ref_put_many+0x8/0x230
#1: c0000000010af608 (rcu_read_lock){....}, at: __queue_work+0x7c/0xca0

stack backtrace:
Call Trace:
dump_stack+0xf4/0x164 (unreliable)
lockdep_rcu_suspicious+0x140/0x164
get_work_pool+0x110/0x150
__queue_work+0x1bc/0xca0
queue_work_on+0x114/0x120
css_release+0x9c/0xc0
percpu_ref_put_many+0x204/0x230
free_pcp_prepare+0x264/0x570
free_unref_page+0x38/0xf0
__mmdrop+0x21c/0x2c0
idle_task_exit+0x170/0x1b0
pnv_smp_cpu_kill_self+0x38/0x2e0
cpu_die+0x48/0x64
arch_cpu_idle_dead+0x30/0x50
do_idle+0x2f4/0x470
cpu_startup_entry+0x38/0x40
start_secondary+0x7a8/0xa80
start_secondary_resume+0x10/0x14

=============================
WARNING: suspicious RCU usage
-----------------------------
kernel/sched/core.c:562 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

RCU used illegally from offline CPU!
rcu_scheduler_active = 2, debug_locks = 1
2 locks held by swapper/94/0:
#0: c000201cc77dc118 (&base->lock){-.-.}, at: lock_timer_base+0x114/0x1f0
#1: c0000000010af608 (rcu_read_lock){....}, at: get_nohz_timer_target+0x3c/0x2d0

stack backtrace:
Call Trace:
dump_stack+0xf4/0x164 (unreliable)
lockdep_rcu_suspicious+0x140/0x164
get_nohz_timer_target+0x248/0x2d0
add_timer+0x24c/0x470
__queue_delayed_work+0x8c/0x110
queue_delayed_work_on+0x128/0x130
__debug_check_no_obj_freed+0x2ec/0x320
free_pcp_prepare+0x1b4/0x570
free_unref_page+0x38/0xf0
__mmdrop+0x21c/0x2c0
idle_task_exit+0x170/0x1b0
pnv_smp_cpu_kill_self+0x38/0x2e0
cpu_die+0x48/0x64
arch_cpu_idle_dead+0x30/0x50
do_idle+0x2f4/0x470
cpu_startup_entry+0x38/0x40
start_secondary+0x7a8/0xa80
start_secondary_prolog+0x10/0x14

Signed-off-by: Qian Cai <cai@xxxxxx>
---
kernel/sched/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 90e4b00ace89..41fb49f3dfce 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6194,7 +6194,8 @@ void idle_task_exit(void)
 		current->active_mm = &init_mm;
 		finish_arch_post_lock_switch();
 	}
-	mmdrop(mm);
+	smp_call_function_single(cpumask_first(cpu_online_mask),
+				 (void (*)(void *))mmdrop, mm, 0);
 }

/*
--
2.21.0 (Apple Git-122.2)