Re: [PATCH] sched/fair: don't assign runtime for throttled cfs_rq

From: Valentin Schneider
Date: Fri Aug 23 2019 - 19:19:31 EST

On 23/08/2019 21:00, bsegall@xxxxxxxxxx wrote:
> Could you mention in the message that a throttled cfs_rq can have
> account_cfs_rq_runtime called on it because it is throttled before
> idle_balance, and the idle_balance calls update_rq_clock to add time
> that is accounted to the task.

Mayhaps even a comment for the extra condition.

> I think this solution is less risky than unthrottling
> in this area, so other than that:
> Reviewed-by: Ben Segall <bsegall@xxxxxxxxxx>

If you don't mind squashing this in:

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b1d9cec9b1ed..b47b0bcf56bc 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4630,6 +4630,10 @@ static u64 distribute_cfs_runtime(struct cfs_bandwidth *cfs_b, u64 remaining)
 		if (!cfs_rq_throttled(cfs_rq))
 			goto next;

+		/* By the above check, this should never be true */
+		WARN_ON(cfs_rq->runtime_remaining > 0);
+		/* Pick the minimum amount to return to a positive quota state */
 		runtime = -cfs_rq->runtime_remaining + 1;
 		if (runtime > remaining)
 			runtime = remaining;

I'm not adamant about the extra comment, but the WARN_ON would be nice IMO.
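
For illustration only, the arithmetic in that hunk can be modelled in
userspace roughly as below. This is a sketch, not the kernel code:
struct toy_cfs_rq, toy_distribute_one() and the "warned" out-parameter are
made-up names, the early return on a positive balance is a choice made for
the toy (the real WARN_ON only logs and carries on), and adding the granted
runtime back to runtime_remaining is assumed to match what the surrounding
loop does after the clamp.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the single struct cfs_rq field used here. */
struct toy_cfs_rq {
	int64_t runtime_remaining;	/* expected to be <= 0 while throttled */
};

/*
 * Toy model of the per-cfs_rq step: grant just enough runtime to bring
 * runtime_remaining back to +1, capped by what is left in the pool.
 * Returns the amount taken from the pool. *warned is set when the
 * balance is already positive, which is the case the WARN_ON guards.
 */
static uint64_t toy_distribute_one(struct toy_cfs_rq *cfs_rq,
				   uint64_t remaining, bool *warned)
{
	uint64_t runtime;

	*warned = cfs_rq->runtime_remaining > 0;
	if (*warned)
		return 0;	/* toy choice: bail out instead of just logging */

	/* Minimum amount that returns the cfs_rq to a positive quota state. */
	runtime = (uint64_t)(-cfs_rq->runtime_remaining) + 1;
	if (runtime > remaining)
		runtime = remaining;

	cfs_rq->runtime_remaining += (int64_t)runtime;
	return runtime;
}
```

With a deficit of 5 and a large pool this grants 6 and leaves the balance at
+1. With only 3 left in the pool it grants 3 and the balance stays negative,
so the cfs_rq would remain throttled.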

@Ben, do you reckon we want to strap

Cc: <stable@xxxxxxxxxxxxxxx>
Fixes: ec12cb7f31e2 ("sched: Accumulate per-cfs_rq cpu usage and charge against bandwidth")

to the thing? AFAICT the pick_next_task_fair() + idle_balance() dance you
described should still be possible on that commit.

Other than that,

Reviewed-by: Valentin Schneider <valentin.schneider@xxxxxxx>