Re: [RFC v4 2/6] sched/deadline: improve the tracking of active utilization
From: Juri Lelli
Date: Wed Jan 11 2017 - 12:05:07 EST
Hi,
On 30/12/16 12:33, Luca Abeni wrote:
> From: Luca Abeni <luca.abeni@xxxxxxxx>
>
> This patch implements a more theoretically sound algorithm for
> tracking active utilization: instead of decreasing it when a
> task blocks, use a timer (the "inactive timer", named after the
> "Inactive" task state of the GRUB algorithm) to decrease the
> active utilization at the so called "0-lag time".
>
> Signed-off-by: Luca Abeni <luca.abeni@xxxxxxxx>
> ---
[...]
> +static enum hrtimer_restart inactive_task_timer(struct hrtimer *timer)
> +{
> + struct sched_dl_entity *dl_se = container_of(timer,
> + struct sched_dl_entity,
> + inactive_timer);
> + struct task_struct *p = dl_task_of(dl_se);
> + struct rq_flags rf;
> + struct rq *rq;
> +
> + rq = task_rq_lock(p, &rf);
> +
> + if (!dl_task(p) || p->state == TASK_DEAD) {
> + if (p->state == TASK_DEAD && dl_se->dl_non_contending)
> + sub_running_bw(&p->dl, dl_rq_of_se(&p->dl));
> +
> + __dl_clear_params(p);
> +
> + goto unlock;
> + }
> + if (dl_se->dl_non_contending == 0)
> + goto unlock;
> +
> + sched_clock_tick();
> + update_rq_clock(rq);
> +
> + sub_running_bw(dl_se, &rq->dl);
> + dl_se->dl_non_contending = 0;
> +unlock:
> + task_rq_unlock(rq, p, &rf);
> + put_task_struct(p);
> +
> + return HRTIMER_NORESTART;
> +}
> +
[...]
> static void inc_dl_deadline(struct dl_rq *dl_rq, u64 deadline)
> @@ -934,7 +1014,28 @@ enqueue_dl_entity(struct sched_dl_entity *dl_se,
> if (flags & ENQUEUE_WAKEUP) {
> struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
>
> - add_running_bw(dl_se, dl_rq);
> + if (dl_se->dl_non_contending) {
> + /*
> + * If the timer handler is currently running and the
> + * timer cannot be cancelled, inactive_task_timer()
> + * will see that dl_not_contending is not set, and
> + * will do nothing, so we are still safe.

Here and below: the timer callback still calls put_task_struct() (see
above) when it finds dl_non_contending clear; that is why we must not
do it ourselves when hrtimer_try_to_cancel() returns -1 (or 0). Saying
"will do nothing" is a bit misleading, IMHO.

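To spell out the reference counting I have in mind, here is a small
user-space model. It is not kernel code: every model_* name is made up
for illustration, and it assumes that arming the inactive timer takes a
task reference (that part of the patch is elided above). Only the return
values of the cancel helper mirror hrtimer_try_to_cancel(): 1 if the
timer was queued, 0 if it was not active, -1 if the callback is running.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are kernel APIs. */
enum timer_state { TIMER_IDLE, TIMER_QUEUED, TIMER_RUNNING };
enum scenario { WAKE_BEFORE_FIRE, WAKE_DURING_CALLBACK, WAKE_AFTER_CALLBACK };

struct model_task {
	int refcount;           /* models the task_struct usage count */
	bool non_contending;    /* models dl_se->dl_non_contending */
	enum timer_state timer; /* models the state of inactive_timer */
};

/* Same return convention as hrtimer_try_to_cancel(). */
static int model_try_to_cancel(struct model_task *t)
{
	if (t->timer == TIMER_RUNNING)
		return -1;              /* callback executing, cannot stop it */
	if (t->timer == TIMER_QUEUED) {
		t->timer = TIMER_IDLE;
		return 1;               /* was queued, callback will not run */
	}
	return 0;                       /* was not active */
}

/* Assumption: arming the timer takes a reference on the task. */
static void model_arm_timer(struct model_task *t)
{
	t->refcount++;
	t->non_contending = true;
	t->timer = TIMER_QUEUED;
}

/* Models the ENQUEUE_WAKEUP branch quoted above. */
static void model_wakeup(struct model_task *t)
{
	if (!t->non_contending)
		return;
	/* Drop the timer's reference only if the callback can no longer run. */
	if (model_try_to_cancel(t) == 1)
		t->refcount--;
	t->non_contending = false;
}

/* Models inactive_task_timer(): the reference is dropped on every path. */
static void model_timer_callback(struct model_task *t)
{
	if (t->non_contending)
		t->non_contending = false; /* sub_running_bw() would go here */
	t->refcount--;
	t->timer = TIMER_IDLE;
}

/* Runs one interleaving and returns the final reference count. */
static int model_final_refcount(enum scenario s)
{
	struct model_task t = { .refcount = 1, .timer = TIMER_IDLE };

	model_arm_timer(&t);
	switch (s) {
	case WAKE_BEFORE_FIRE:
		model_wakeup(&t);
		break;
	case WAKE_DURING_CALLBACK:
		t.timer = TIMER_RUNNING; /* callback entered, waiting on the rq lock */
		model_wakeup(&t);
		model_timer_callback(&t);
		break;
	case WAKE_AFTER_CALLBACK:
		t.timer = TIMER_RUNNING;
		model_timer_callback(&t);
		model_wakeup(&t);
		break;
	}
	return t.refcount;
}
```

In all three interleavings the count returns to its initial value of 1,
which is the property the comment should describe: the callback is not
a no-op, it still releases the reference.
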
> + */
> + if (hrtimer_try_to_cancel(&dl_se->inactive_timer) == 1)
> + put_task_struct(dl_task_of(dl_se));
> + WARN_ON(dl_task_of(dl_se)->nr_cpus_allowed > 1);
> + dl_se->dl_non_contending = 0;
> + } else {
[...]
> @@ -1097,6 +1198,22 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
> }
> rcu_read_unlock();
>
> + rq = task_rq(p);
> + raw_spin_lock(&rq->lock);
> + if (p->dl.dl_non_contending) {
> + sub_running_bw(&p->dl, &rq->dl);
> + p->dl.dl_non_contending = 0;
> + /*
> + * If the timer handler is currently running and the
> + * timer cannot be cancelled, inactive_task_timer()
> + * will see that dl_not_contending is not set, and
> + * will do nothing, so we are still safe.
> + */
> + if (hrtimer_try_to_cancel(&p->dl.inactive_timer) == 1)
> + put_task_struct(p);
> + }
> + raw_spin_unlock(&rq->lock);
> +
> out:
> return cpu;
> }

We already raised the issue of having to lock the rq in
select_task_rq_dl() while reviewing the previous version; have you
given any thought to possible solutions? Maybe simply bail out (though
we would need to see how frequent this is) or use an inner lock?

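As a rough illustration of one possible reading of the "inner lock"
option, here is a user-space sketch in which a small dedicated lock
guards only the active utilization, so the wakeup path would not need
the whole rq->lock. All model_* names and the bw_lock field are
hypothetical, the timer handling is left out, and whether such a scheme
is actually safe against the other users of running_bw is exactly the
open question.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct dl_rq; not the kernel layout. */
struct model_dl_rq {
	atomic_flag bw_lock;  /* assumed inner lock: guards running_bw only */
	uint64_t running_bw;  /* models the active utilization */
};

/* Hypothetical stand-in for the fields of sched_dl_entity used here. */
struct model_dl_entity {
	uint64_t dl_bw;
	bool non_contending;
};

static void model_dl_rq_init(struct model_dl_rq *dl_rq, uint64_t bw)
{
	atomic_flag_clear(&dl_rq->bw_lock);
	dl_rq->running_bw = bw;
}

/* Minimal test-and-set spinlock standing in for a raw spinlock. */
static void model_bw_lock(struct model_dl_rq *dl_rq)
{
	while (atomic_flag_test_and_set_explicit(&dl_rq->bw_lock,
						 memory_order_acquire))
		; /* spin until the holder releases the flag */
}

static void model_bw_unlock(struct model_dl_rq *dl_rq)
{
	atomic_flag_clear_explicit(&dl_rq->bw_lock, memory_order_release);
}

/*
 * Models the hunk added to select_task_rq_dl(): remove the task's
 * bandwidth from the old runqueue if it is still accounted there.
 */
static void model_leave_old_rq(struct model_dl_entity *se,
			       struct model_dl_rq *old)
{
	model_bw_lock(old);
	if (se->non_contending) {
		old->running_bw -= se->dl_bw;
		se->non_contending = false;
	}
	model_bw_unlock(old);
}
```

The flag check sits inside the locked region so that a second caller
sees non_contending already cleared and does not subtract twice.
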
Best,
- Juri