[PATCH] sched/deadline: Fix lock pinning warning during cpu hotplug

From: Wanpeng Li
Date: Wed Aug 03 2016 - 21:42:32 EST


From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>

WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:3531 lock_release+0x690/0x6a0
releasing a pinned lock
Call Trace:
dump_stack+0x99/0xd0
__warn+0xd1/0xf0
? dl_task_timer+0x1a1/0x2b0
warn_slowpath_fmt+0x4f/0x60
? sched_clock+0x13/0x20
lock_release+0x690/0x6a0
? enqueue_pushable_dl_task+0x9b/0xa0
? enqueue_task_dl+0x1ca/0x480
_raw_spin_unlock+0x1f/0x40
dl_task_timer+0x1a1/0x2b0
? push_dl_task.part.31+0x190/0x190
WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:3649 lock_unpin_lock+0x181/0x1a0
unpinning an unpinned lock
Call Trace:
dump_stack+0x99/0xd0
__warn+0xd1/0xf0
warn_slowpath_fmt+0x4f/0x60
lock_unpin_lock+0x181/0x1a0
dl_task_timer+0x127/0x2b0
? push_dl_task.part.31+0x190/0x190

This can be triggered by hot-unplugging the CPU on which a DL task is running.

If the current rq is offline when the DL timer fires, the DL task is
migrated to the most suitable later-deadline rq, or falls back to any
eligible online CPU. While searching for a suitable destination CPU, we
need to hold the rq locks of both rqs to confirm that nothing has changed
and to compare the task's deadline with the earliest_dl deadline of the
destination CPU.

_double_lock_balance() unlocks the rq lock of the offline CPU and then
reacquires both rq locks in a fair way. The rq lock of the offline CPU was
already held and lockdep-pinned at that point, but _double_lock_balance()
unlocks it directly without a lockdep unpin, which triggers the "releasing
a pinned lock" warning above.

Queueing the task on the destination CPU might have overloaded its rq, so
the current implementation checks whether some tasks need to be kicked
away. The rq lock is lockdep-unpinned first, since push_dl_task() will
release the rq lock and reacquire it. However, if CPU hot-unplug and DL
task migration have taken place, the rq lock is not lockdep-pinned at that
point, so this triggers the "unpinning an unpinned lock" warning above.

This patch fixes it by lockdep-unpinning the rq lock of the offline CPU
before the task migration in the CPU hotplug handling path, and
lockdep-pinning the rq lock of the destination CPU afterwards, since the
rq lock is held throughout this handling.
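
The pin/unpin ordering described above can be illustrated with a small
userspace model. This is only a sketch and not kernel code: struct toy_lock,
the toy_*() helpers and timer_path() are hypothetical names invented for
illustration, the model counts warnings instead of printing them, and
toy_offline_migration() only approximates the locking effect of
dl_task_offline_migration() (the source rq lock is dropped, and the function
returns with the destination rq lock held).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an rq lock with lockdep-style pin tracking. */
struct toy_lock {
	bool held;
	int pin_count;
};

/* Number of lockdep-style warnings seen so far (toy model only). */
static int toy_warnings;

static void toy_lock_acquire(struct toy_lock *l)
{
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	if (l->pin_count > 0) {
		toy_warnings++;		/* models "releasing a pinned lock" */
		l->pin_count = 0;	/* simplification made by this model */
	}
	l->held = false;
}

static void toy_pin(struct toy_lock *l)
{
	assert(l->held);		/* only a held lock can be pinned */
	l->pin_count++;
}

static void toy_unpin(struct toy_lock *l)
{
	if (l->pin_count == 0) {
		toy_warnings++;		/* models "unpinning an unpinned lock" */
		return;
	}
	l->pin_count--;
}

/* Approximates the migration: drop the source lock, return holding dst. */
static struct toy_lock *toy_offline_migration(struct toy_lock *src,
					      struct toy_lock *dst)
{
	toy_lock_release(src);
	toy_lock_acquire(dst);
	return dst;
}

/* Walks the timer path for an offline rq; returns the warning count. */
static int timer_path(bool fixed)
{
	struct toy_lock offline = { false, 0 }, dest = { false, 0 };
	struct toy_lock *rq = &offline;

	toy_warnings = 0;
	toy_lock_acquire(rq);
	toy_pin(rq);			/* rq lock taken and pinned on entry */

	if (fixed)
		toy_unpin(rq);		/* unpin the offline rq lock first */
	rq = toy_offline_migration(rq, &dest);
	if (fixed)
		toy_pin(rq);		/* pin the destination rq lock */

	toy_unpin(rq);			/* unpin before the push step drops it */
	toy_lock_release(rq);
	return toy_warnings;
}
```

Under this model, timer_path(false) counts two warnings, one for each
message in the trace above, and timer_path(true) counts none.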

Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Juri Lelli <juri.lelli@xxxxxxx>
Cc: Luca Abeni <luca.abeni@xxxxxxxx>
Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
---
kernel/sched/deadline.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index fcb7f02..1ce8867 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -658,8 +658,11 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
 	 *
 	 * XXX figure out if select_task_rq_dl() deals with offline cpus.
 	 */
-	if (unlikely(!rq->online))
+	if (unlikely(!rq->online)) {
+		lockdep_unpin_lock(&rq->lock, rf.cookie);
 		rq = dl_task_offline_migration(rq, p);
+		rf.cookie = lockdep_pin_lock(&rq->lock);
+	}

 	/*
 	 * Queueing this task back might have overloaded rq, check if we need
--
1.9.1