Re: [PATCH v3] sched/fair: don't assign runtime for throttled cfs_rq

From: Peter Zijlstra
Date: Wed Aug 28 2019 - 07:41:48 EST


On Wed, Aug 28, 2019 at 11:16:52AM +0100, Valentin Schneider wrote:
> On 26/08/2019 13:16, Liangyan wrote:
> > do_sched_cfs_period_timer() refills the cfs_b runtime and calls
> > distribute_cfs_runtime() to unthrottle cfs_rqs. Sometimes it incorrectly
> > allocates all of cfs_b->runtime to one cfs_rq, so the other cfs_rqs
> > attached to this cfs_b get no runtime and are throttled.
> >
> > We found that a throttled cfs_rq can have a non-negative
> > cfs_rq->runtime_remaining, which causes an unexpected cast from s64 to u64
> > in this snippet: distribute_cfs_runtime() {
> > runtime = -cfs_rq->runtime_remaining + 1; }.
> > The runtime here becomes a very large number and consumes all of
> > cfs_b->runtime in this cfs_b period.
> >
> > According to Ben Segall, the throttled cfs_rq can have
> > account_cfs_rq_runtime called on it because it is throttled before
> > idle_balance, and the idle_balance calls update_rq_clock to add time
> > that is accounted to the task.
> >
> > This commit prevents a cfs_rq from being assigned new runtime if it has
> > been throttled, until distribute_cfs_runtime() is called.
> >
> > Signed-off-by: Liangyan <liangyan.peng@xxxxxxxxxxxxxxxxx>
> > Reviewed-by: Ben Segall <bsegall@xxxxxxxxxx>
> > Reviewed-by: Valentin Schneider <valentin.schneider@xxxxxxx>
>
> @Peter/Ingo, if we care about it I believe it can't hurt to strap
>
> Cc: <stable@xxxxxxxxxxxxxxx>
> Fixes: d3d9dc330236 ("sched: Throttle entities exceeding their allowed bandwidth")
>
> to the thing.

OK, done.
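
For anyone following the thread, the s64-to-u64 wraparound described in the
quoted changelog can be reproduced in plain user-space C. This is only a
rough sketch, not the kernel code: runtime_wanted_unguarded() and
grant_from_pool() are hypothetical names made up for illustration, int64_t
and uint64_t stand in for the kernel's s64 and u64, and the "hand out at most
what is left in the pool" step is an assumption about how the requested
amount is capped.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the expression quoted in the changelog:
 *     runtime = -cfs_rq->runtime_remaining + 1;
 * with the result stored in an unsigned 64-bit variable.
 * For a negative runtime_remaining this is the deficit plus one.
 * For a runtime_remaining greater than 1 the signed result is negative,
 * so the conversion to uint64_t wraps to a value close to 2^64.
 */
uint64_t runtime_wanted_unguarded(int64_t runtime_remaining)
{
	return (uint64_t)(-runtime_remaining + 1);
}

/*
 * Assumed capping step: a queue receives at most what is left in the
 * shared pool. A wrapped (huge) request therefore drains the pool.
 */
uint64_t grant_from_pool(uint64_t pool, uint64_t wanted)
{
	return wanted > pool ? pool : wanted;
}
```

With a pool of 100 units, a queue that is 3 units in deficit asks for 4 and
gets 4, while a queue that still holds 5 units "asks" for 2^64 - 4 and takes
the whole pool, which matches the symptom reported above.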