Re: [PATCH] sched,tracing: Correct trace_sched_pi_setprio() for deboosting

From: Peter Zijlstra
Date: Wed May 23 2018 - 12:35:12 EST


On Wed, May 23, 2018 at 04:11:07PM +0200, Sebastian Andrzej Siewior wrote:

> Since that commit I see this when a task is deboosted:
> |futex sched_pi_setprio: comm=futex_requeue_p pid=2234 oldprio=98 newprio=98
> |futex sched_switch: prev_comm=futex_requeue_p prev_pid=2234 prev_prio=120
>
> and after the revert, the `newprio' shows the correct value again:
>
> |futex sched_pi_setprio: comm=futex_requeue_p pid=2220 oldprio=98 newprio=120
> |futex sched_switch: prev_comm=futex_requeue_p prev_pid=2220 prev_prio=120

> @@ -435,7 +435,7 @@ TRACE_EVENT(sched_pi_setprio,
> memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
> __entry->pid = tsk->pid;
> __entry->oldprio = tsk->prio;
> - __entry->newprio = pi_task ? pi_task->prio : tsk->prio;
> + __entry->newprio = new_prio;
> /* XXX SCHED_DEADLINE bits missing */
> ),
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 092f7c4de903..888df643b99b 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3823,7 +3823,7 @@ void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
> goto out_unlock;
> }
>
> - trace_sched_pi_setprio(p, pi_task);
> + trace_sched_pi_setprio(p, prio);

at this point:

prio = pi_task ? min(p->normal_prio, pi_task->prio) : p->normal_prio;

(aka __rt_effective_prio)

Should we put that in the tracepoint instead?
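
For illustration, here is a minimal user-space sketch of the difference between the two expressions. It is not kernel code: struct task_model and all helper names are hypothetical stand-ins, and only the two expressions themselves come from this thread. It assumes the usual convention that a lower numeric value means a higher priority.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two task_struct fields used here. */
struct task_model {
	int prio;        /* current (possibly boosted) priority */
	int normal_prio; /* priority without any PI boost */
};

/* Lower value means higher priority. */
static int min_prio(int a, int b)
{
	return a < b ? a : b;
}

/* The expression quoted above: the priority the task will actually get. */
static int effective_prio(const struct task_model *p,
			  const struct task_model *pi_task)
{
	if (pi_task)
		return min_prio(p->normal_prio, pi_task->prio);
	return p->normal_prio;
}

/* The expression the tracepoint used before the patch. */
static int old_traced_newprio(const struct task_model *p,
			      const struct task_model *pi_task)
{
	return pi_task ? pi_task->prio : p->prio;
}

/* Deboost case from the report: boosted to 98, normal priority 120. */
static void check_deboost_case(void)
{
	struct task_model p = { .prio = 98, .normal_prio = 120 };

	/* No donor left: the old expression repeats the stale boosted value. */
	assert(old_traced_newprio(&p, NULL) == 98);
	/* The effective priority falls back to normal_prio. */
	assert(effective_prio(&p, NULL) == 120);
}
```

With pi_task == NULL the old expression falls back to tsk->prio, which still holds the boosted value at that point, so oldprio and newprio come out identical, as in the first trace.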

> oldprio = p->prio;
>
> if (oldprio == prio)