Re: [PATCH v3] sched/deadline: Fix bad accounting of nr_running

From: Juri Lelli
Date: Wed Feb 19 2014 - 08:14:52 EST


On 02/19/2014 11:32 AM, Juri Lelli wrote:
> On 02/19/2014 09:46 AM, Peter Zijlstra wrote:
>> On Tue, Feb 18, 2014 at 09:50:12PM -0500, Steven Rostedt wrote:
>>>
>>>> Rationale for this odd behavior is that, when a task is throttled, it
>>>> is removed only from the dl_rq, but we keep it on_rq (as this is not
>>>> a "full dequeue", that is the task is not actually sleeping). But, it
>>>> is also true that, while throttled a task behaves like it is sleeping
>>>> (e.g., its timer will fire on a new CPU if the old one is dead). So,
>>>> Steven's fix sounds also semantically correct.
>>>
>>> Actually, it seems that I was hitting it again, but this time getting a
>>> negative number. OK, after looking at the code a bit more, I think we
>>> should update the runqueue nr_running only when the task is officially
>>> enqueued and dequeued, and all accounting within, will not touch that
>>> number.
>
> This is a different way to get the same result (mildly tested on my box):
>
> ---
> kernel/sched/deadline.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 0dd5e09..675dad3 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -837,7 +837,8 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
> if (!task_current(rq, p) && p->nr_cpus_allowed > 1)
> enqueue_pushable_dl_task(rq, p);
>
> - inc_nr_running(rq);
> + if (!(flags & ENQUEUE_REPLENISH))
> + inc_nr_running(rq);
> }
>
> static void __dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
> --
>
> We touch nr_running only when we don't enqueue back as a consequence
> of a replenishment.
>
>>
>> But if the task is throttled it should still very much decrement the
>> number. There's places that very much rely on nr_running be exactly the
>> number of runnable tasks.
>>
>
> This is a different thing, and V2 seemed to implement this behavior
> (that's why I said it looked semantically correct).
>

So, both my last approach and Steven's V2 were causing nr_running to
become negative, as they double decrement it when dequeuing a task that
also exceeded its budget.
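
To make the double decrement concrete, here is a toy user-space model of
that path. It is only a sketch, not kernel code: every buggy_* name is
made up for illustration, and the control flow is a simplified assumption
of a throttle path that drops nr_running followed by an unconditional
decrement in the dequeue path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these types and helpers are hypothetical. */
struct buggy_rq {
    int nr_running;      /* signed, so an underflow shows up as negative */
    int dl_nr_running;
};

struct buggy_task {
    bool on_dl_rq;       /* currently queued on the deadline runqueue */
    bool over_budget;    /* runtime exhausted, so it must be throttled */
};

static void buggy_enqueue(struct buggy_rq *rq, struct buggy_task *p)
{
    assert(!p->on_dl_rq);
    p->on_dl_rq = true;
    rq->dl_nr_running++;
    rq->nr_running++;
}

/* Assumed throttle path: removes the task and also drops nr_running. */
static void buggy_update_curr(struct buggy_rq *rq, struct buggy_task *p)
{
    if (p->over_budget && p->on_dl_rq) {
        p->on_dl_rq = false;
        rq->dl_nr_running--;
        rq->nr_running--;        /* first decrement */
    }
}

static void buggy_dequeue(struct buggy_rq *rq, struct buggy_task *p)
{
    buggy_update_curr(rq, p);    /* may throttle the task first */
    if (p->on_dl_rq) {
        p->on_dl_rq = false;
        rq->dl_nr_running--;
    }
    rq->nr_running--;            /* unconditional: second decrement */
}

/* Enqueue one task, let it exceed its budget, then dequeue it. */
int buggy_double_decrement(void)
{
    struct buggy_rq rq = { 0, 0 };
    struct buggy_task p = { false, false };

    buggy_enqueue(&rq, &p);
    p.over_budget = true;
    buggy_dequeue(&rq, &p);
    return rq.nr_running;
}
```

In this model buggy_double_decrement() ends with nr_running below zero,
because the same task is subtracted once when throttled and once more by
the dequeue.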

What follows seems to solve the issue, and correctly accounts for throttled
tasks as !nr_running.

---
kernel/sched/deadline.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 0dd5e09..b819577 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -717,6 +717,7 @@ void inc_dl_tasks(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)

WARN_ON(!dl_prio(prio));
dl_rq->dl_nr_running++;
+ inc_nr_running(rq_of_dl_rq(dl_rq));

inc_dl_deadline(dl_rq, deadline);
inc_dl_migration(dl_se, dl_rq);
@@ -730,6 +731,7 @@ void dec_dl_tasks(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
WARN_ON(!dl_prio(prio));
WARN_ON(!dl_rq->dl_nr_running);
dl_rq->dl_nr_running--;
+ dec_nr_running(rq_of_dl_rq(dl_rq));

dec_dl_deadline(dl_rq, dl_se->deadline);
dec_dl_migration(dl_se, dl_rq);
@@ -836,8 +838,6 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)

if (!task_current(rq, p) && p->nr_cpus_allowed > 1)
enqueue_pushable_dl_task(rq, p);
-
- inc_nr_running(rq);
}

static void __dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
@@ -850,8 +850,6 @@ static void dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
{
update_curr_dl(rq);
__dequeue_task_dl(rq, p, flags);
-
- dec_nr_running(rq);
}

/*
--
1.7.9.5
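
For illustration only, the accounting idea of the patch above can be
sketched as a toy user-space model: nr_running changes in exactly the
same place as dl_nr_running, so a throttle followed by a dequeue is
counted once. The fixed_* names are hypothetical, and the on_dl_rq guard
is an assumption standing in for whatever makes a second removal a no-op
in the real code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these types and helpers are hypothetical. */
struct fixed_rq {
    int nr_running;
    int dl_nr_running;
};

struct fixed_task {
    bool on_dl_rq;       /* currently queued on the deadline runqueue */
    bool over_budget;    /* runtime exhausted, so it must be throttled */
};

/* Both counters move together, and only here. */
static void fixed_inc_dl_tasks(struct fixed_rq *rq)
{
    rq->dl_nr_running++;
    rq->nr_running++;
}

static void fixed_dec_dl_tasks(struct fixed_rq *rq)
{
    assert(rq->dl_nr_running > 0);
    rq->dl_nr_running--;
    rq->nr_running--;
}

static void fixed_enqueue(struct fixed_rq *rq, struct fixed_task *p)
{
    if (!p->on_dl_rq) {
        p->on_dl_rq = true;
        fixed_inc_dl_tasks(rq);
    }
}

/* Removing a task that is already off the dl_rq does nothing. */
static void fixed_remove(struct fixed_rq *rq, struct fixed_task *p)
{
    if (p->on_dl_rq) {
        p->on_dl_rq = false;
        fixed_dec_dl_tasks(rq);
    }
}

/* Throttle path: an over-budget task leaves the dl_rq. */
static void fixed_update_curr(struct fixed_rq *rq, struct fixed_task *p)
{
    if (p->over_budget)
        fixed_remove(rq, p);
}

static void fixed_dequeue(struct fixed_rq *rq, struct fixed_task *p)
{
    fixed_update_curr(rq, p);
    fixed_remove(rq, p);
}

/* Enqueue one task, let it exceed its budget, then dequeue it. */
int fixed_throttle_then_dequeue(void)
{
    struct fixed_rq rq = { 0, 0 };
    struct fixed_task p = { false, false };

    fixed_enqueue(&rq, &p);
    p.over_budget = true;
    fixed_dequeue(&rq, &p);
    return rq.nr_running;
}

/* Enqueue two tasks and throttle one of them. */
int fixed_count_with_one_throttled(void)
{
    struct fixed_rq rq = { 0, 0 };
    struct fixed_task a = { false, false };
    struct fixed_task b = { false, false };

    fixed_enqueue(&rq, &a);
    fixed_enqueue(&rq, &b);
    a.over_budget = true;
    fixed_update_curr(&rq, &a);
    return rq.nr_running;
}
```

In this model nr_running cannot go negative on the throttle-then-dequeue
path, and a throttled task stops counting as runnable as soon as it
leaves the dl_rq.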

Steven, could you test it?

Thanks,

- Juri