Re: [PATCH 1/2] sched/deadline: add per rq tracking of admitted bandwidth

From: Peter Zijlstra
Date: Fri Feb 12 2016 - 12:05:40 EST


On Thu, Feb 11, 2016 at 05:10:12PM +0000, Juri Lelli wrote:
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 6368f43..1eccecf 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c

> +static void swap_task_ac_bw(struct task_struct *p,
> +			    struct rq *from,
> +			    struct rq *to)
> +{
> +	unsigned long flags;
> +
> +	lockdep_assert_held(&p->pi_lock);
> +	local_irq_save(flags);
> +	double_rq_lock(from, to);
> +	__dl_sub_ac(from, p->dl.dl_bw);
> +	__dl_add_ac(to, p->dl.dl_bw);
> +	double_rq_unlock(from, to);
> +	local_irq_restore(flags);
> +}

> +static void migrate_task_rq_dl(struct task_struct *p)
> +{
> +	if (p->fallback_cpu != -1)
> +		swap_task_ac_bw(p, task_rq(p), cpu_rq(p->fallback_cpu));
> +}

This patch scares me.

Now, my brain is having an awfully hard time trying to re-engage after
flu, but this looks very wrong.

So we call sched_class::migrate_task_rq() from set_task_cpu(), and we
call set_task_cpu() while potentially holding rq::lock's (try
push_dl_task() for kicks).
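
To make the locking concern concrete, here is a small userspace toy model, not
kernel code. Every toy_* name is hypothetical and exists only for illustration.
It assumes the rq lock is non-recursive, and it reports failure where a real
spinlock would simply spin forever. toy_swap_ac_bw() stands in for the quoted
swap_task_ac_bw(); toy_push_task() stands in for a caller that already holds
both rq locks when it reaches the migration hook.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a runqueue guarded by a non-recursive lock. */
struct toy_rq {
	bool locked;	/* models rq->lock being held */
	long ac_bw;	/* models the per-rq admitted bandwidth */
};

/* Returns false where a real spinlock would spin forever. */
static bool toy_rq_trylock(struct toy_rq *rq)
{
	if (rq->locked)
		return false;
	rq->locked = true;
	return true;
}

static void toy_rq_unlock(struct toy_rq *rq)
{
	assert(rq->locked);	/* unlocking a free lock is a bug */
	rq->locked = false;
}

/* Models the migration hook: takes both locks, then moves the bandwidth. */
static bool toy_swap_ac_bw(struct toy_rq *from, struct toy_rq *to, long bw)
{
	if (!toy_rq_trylock(from))
		return false;
	if (!toy_rq_trylock(to)) {
		toy_rq_unlock(from);
		return false;
	}
	from->ac_bw -= bw;
	to->ac_bw += bw;
	toy_rq_unlock(to);
	toy_rq_unlock(from);
	return true;
}

/* Models a push-style caller that holds both locks across the hook. */
static bool toy_push_task(struct toy_rq *from, struct toy_rq *to, long bw)
{
	bool ok;

	if (!toy_rq_trylock(from))
		return false;
	if (!toy_rq_trylock(to)) {
		toy_rq_unlock(from);
		return false;
	}
	ok = toy_swap_ac_bw(from, to, bw);	/* tries to re-take held locks */
	toy_rq_unlock(to);
	toy_rq_unlock(from);
	return ok;
}
```

Calling toy_swap_ac_bw() with neither lock held succeeds and moves the
bandwidth. Going through toy_push_task() fails at the inner lock attempt and
leaves both ac_bw values untouched, which is the toy equivalent of the
self-deadlock the hook would risk on such a path.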

Sure, you play horrible games with fallback_cpu, but those games are
just that, horrible.


So your initial patch migrates the bandwidth along when a runnable task
gets moved about; this hack seems to be mostly about waking up. The
'normal' accounting is done on enqueue/dequeue, while here you use the
migration hook.

Having two separate means of accounting this also feels more fragile
than one would want.

Let me think a bit about this.