Re: [External] Re: [RFC PATCH 7/7] sched/fair: Make sure cfs_rq has enough runtime_remaining on unthrottle path

From: Aaron Lu
Date: Fri Mar 14 2025 - 07:39:59 EST


On Fri, Mar 14, 2025 at 09:48:00AM +0530, K Prateek Nayak wrote:
> Hello Aaron,
>
> On 3/13/2025 12:52 PM, Aaron Lu wrote:
> > unthrottle_cfs_rq() can be called with !runtime_remaining, for example
> > when the user changes the quota setting (see tg_set_cfs_bandwidth()), or
> > when async unthrottle gave us a positive runtime_remaining but other
> > still-running entities consumed that runtime before we got here.
> >
> > Either way, we can't unthrottle this cfs_rq without any runtime remaining,
> > because a task enqueue during unthrottle can immediately trigger a throttle
> > via check_enqueue_throttle(), which should never happen.
> >
> > Signed-off-by: Aaron Lu <ziqianlu@xxxxxxxxxxxxx>
> > ---
> > kernel/sched/fair.c | 13 +++++++++++++
> > 1 file changed, 13 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index be96f7d32998c..d646451d617c1 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6058,6 +6058,19 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
> > struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
> > struct sched_entity *se = cfs_rq->tg->se[cpu_of(rq)];
> >
> > + /*
> > + * We can be called with !runtime_remaining, for example when the user
> > + * changes the quota setting (see tg_set_cfs_bandwidth()), or when async
> > + * unthrottle gave us a positive runtime_remaining but other
> > + * still-running entities consumed that runtime before we got here.
> > + *
> > + * Either way, we can't unthrottle this cfs_rq without any runtime
> > + * remaining, because any enqueue below would immediately trigger a
> > + * throttle, which is not supposed to happen on the unthrottle path.
> > + */
> > + if (cfs_rq->runtime_enabled && !cfs_rq->runtime_remaining)
>
> Should this be "cfs_rq->runtime_remaining <= 0" since slack could have
> built up by the time we get here?

Absolutely!
Thanks for pointing this out.
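
To illustrate the difference between the two checks, here is a minimal
user-space sketch. The struct and function names are hypothetical stand-ins,
not kernel code; the only assumption carried over is that runtime_remaining
is a signed 64-bit value, so it can be negative as well as zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two cfs_rq fields the check reads. */
struct cfs_rq_sketch {
	int runtime_enabled;
	int64_t runtime_remaining;	/* signed: may be negative */
};

/* Check as posted: bails out only when runtime_remaining is exactly 0. */
static bool bail_as_posted(const struct cfs_rq_sketch *cfs_rq)
{
	return cfs_rq->runtime_enabled && !cfs_rq->runtime_remaining;
}

/* Check as suggested: also bails out when runtime_remaining is negative. */
static bool bail_as_suggested(const struct cfs_rq_sketch *cfs_rq)
{
	return cfs_rq->runtime_enabled && cfs_rq->runtime_remaining <= 0;
}
```

With runtime_remaining at a negative value, bail_as_posted() returns false
and the unthrottle would proceed, while bail_as_suggested() returns true.
The two agree for zero and for positive values.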

Best regards,
Aaron

> > + return;
> > +
> > cfs_rq->throttled = 0;
> >
> > update_rq_clock(rq);
>
> --
> Thanks and Regards,
> Prateek
>