Re: [RFC][PATCH 12/13] sched/deadline: Introduce deadline servers

From: Dietmar Eggemann
Date: Wed Aug 07 2019 - 12:32:05 EST


On 7/26/19 4:54 PM, Peter Zijlstra wrote:
>
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>

[...]

> @@ -889,6 +891,8 @@ static void update_curr(struct cfs_rq *c
> trace_sched_stat_runtime(curtask, delta_exec, curr->vruntime);
> cgroup_account_cputime(curtask, delta_exec);
> account_group_exec_runtime(curtask, delta_exec);
> + if (curtask->server)
> + dl_server_update(curtask->server, delta_exec);
> }

I get a lockdep_assert_held(&rq->lock) related warning in start_dl_timer()
when running the full stack.

...
[ 0.530216] root domain span: 0-5 (max cpu_capacity = 1024)
[ 0.538655] devtmpfs: initialized
[ 0.556485] update_curr: rq mismatch rq[0] != rq[4]
[ 0.561519] update_curr: rq mismatch rq[0] != rq[4]
[ 0.566497] update_curr: rq mismatch rq[0] != rq[4]
[ 0.571443] update_curr: rq mismatch rq[0] != rq[4]
[ 0.576762] update_curr: rq mismatch rq[2] != rq[4]
[ 0.581674] update_curr: rq mismatch rq[2] != rq[4]
[ 0.586569] ------------[ cut here ]------------
[ 0.591220] WARNING: CPU: 2 PID: 2 at kernel/sched/deadline.c:916 start_dl_timer+0x160/0x178
[ 0.599686] Modules linked in:
[ 0.602756] CPU: 2 PID: 2 Comm: kthreadd Tainted: G W 5.3.0-rc3-00013-ga33cf033cc99-dirty #64
[ 0.612620] Hardware name: ARM Juno development board (r0) (DT)
[ 0.618560] pstate: 60000085 (nZCv daIf -PAN -UAO)
[ 0.623369] pc : start_dl_timer+0x160/0x178
[ 0.627572] lr : start_dl_timer+0x160/0x178
[ 0.631768] sp : ffff000010013cb0
...
[ 0.715075] Call trace:
[ 0.717531] start_dl_timer+0x160/0x178
[ 0.721382] update_curr_dl_se+0x108/0x208
[ 0.725494] dl_server_update+0x2c/0x38
[ 0.729348] update_curr+0x1b4/0x3b8
[ 0.732934] task_tick_fair+0x74/0xa88
[ 0.736698] scheduler_tick+0x94/0x110
[ 0.740461] update_process_times+0x48/0x60
...

Seems to be related to the fact that the rq can change:

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e4c14851a34c..5e3130a200ec 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -891,8 +891,17 @@ static void update_curr(struct cfs_rq *cfs_rq)
trace_sched_stat_runtime(curtask, delta_exec, curr->vruntime);
cgroup_account_cputime(curtask, delta_exec);
account_group_exec_runtime(curtask, delta_exec);
- if (curtask->server)
+ if (curtask->server) {
+ struct rq *rq = rq_of(cfs_rq);
+ struct rq *rq2 = curtask->server->rq;
+
+ if (rq != rq2) {
+ printk("update_curr: rq mismatch rq[%d] != rq[%d]\n",
+ cpu_of(rq), cpu_of(rq2));
+ }
+
dl_server_update(curtask->server, delta_exec);
+ }
}

...