Re: [tip:sched/urgent] nohz/full, sched/rt: Fix missed tick-reenabling bug in sched_can_stop_tick()
From: Frederic Weisbecker
Date: Thu Apr 28 2016 - 09:30:27 EST
On Thu, Apr 28, 2016 at 03:24:43AM -0700, tip-bot for Peter Zijlstra wrote:
> Commit-ID: 2548d546d40c0014efdde88a53bf7896e917dcce
> Gitweb: http://git.kernel.org/tip/2548d546d40c0014efdde88a53bf7896e917dcce
> Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> AuthorDate: Thu, 21 Apr 2016 18:03:15 +0200
> Committer: Ingo Molnar <mingo@xxxxxxxxxx>
> CommitDate: Thu, 28 Apr 2016 10:28:55 +0200
>
> nohz/full, sched/rt: Fix missed tick-reenabling bug in sched_can_stop_tick()
>
> Chris Metcalf reported that sched_can_stop_tick() sometimes fails to
> re-enable the tick.
>
> His observed problem is that rq->cfs.nr_running can be 1 even though
> there are multiple runnable CFS tasks. This happens in the cgroup
> case, in which case cfs.nr_running is the number of runnable entities
> for that level.
>
> If there is a single runnable cgroup (which can have an arbitrary
> number of runnable child entries itself) rq->cfs.nr_running will be 1.
>
> However, looking at that function I think there's more problems with it.
>
> It seems to assume that if there's FIFO tasks, those will run. This is
> incorrect. The FIFO task can have a lower prio than an RR task, in which
> case the RR task will run.
>
> So the whole fifo_nr_running test seems misplaced, it should go after
> the rr_nr_running tests. That is, only if !rr_nr_running, can we use
> fifo_nr_running like this.
Thanks for this patch. I had indeed confused the SCHED_RR and SCHED_FIFO priorities.
Too late for me to ACK but I would have. Thanks!
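
For readers following the thread, the ordering described in the quoted changelog can be sketched as a small userspace model. This is only an illustration of the reasoning, not the patch itself: struct rq_model, model_can_stop_tick() and model_examples() are hypothetical names, and the flattened counters stand in for the real runqueue fields. The model assumes the total runnable count is used for the fair-class check, since cfs.nr_running only counts entities at one cgroup level.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, flattened stand-in for the per-CPU runqueue counters. */
struct rq_model {
	unsigned int rt_nr_running;	/* runnable RT tasks (FIFO + RR) */
	unsigned int rr_nr_running;	/* runnable SCHED_RR tasks only */
	unsigned int nr_running;	/* all runnable tasks, all cgroup levels */
};

/* Model of the decision order; returns true if the tick may be stopped. */
bool model_can_stop_tick(const struct rq_model *rq)
{
	/* RR first: more than one RR task needs the tick for round-robin. */
	if (rq->rr_nr_running)
		return rq->rr_nr_running == 1;

	/* Only if there are no RR tasks: FIFO tasks are never tick-preempted. */
	if (rq->rt_nr_running - rq->rr_nr_running)
		return true;

	/*
	 * Only fair tasks left. Use the total count rather than the
	 * top-level entity count, which is 1 for a single runnable cgroup
	 * no matter how many tasks that cgroup holds.
	 */
	return rq->nr_running <= 1;
}

/* A few cases from the changelog, expressed as checks. */
void model_examples(void)
{
	/* One FIFO task plus two RR tasks: the tick must stay on. */
	struct rq_model mixed = { .rt_nr_running = 3, .rr_nr_running = 2, .nr_running = 3 };
	/* One cgroup holding three fair tasks: the tick must stay on. */
	struct rq_model grouped = { .nr_running = 3 };

	assert(!model_can_stop_tick(&mixed));
	assert(!model_can_stop_tick(&grouped));
}
```

In this model, testing the FIFO count before the RR count would return true for the mixed case above, which is the misplaced test the changelog describes.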