Re: [RFC PATCH 2/7] sched/fair: Handle throttle path for task based throttle
From: Aaron Lu
Date: Fri Mar 14 2025 - 11:11:04 EST
On Fri, Mar 14, 2025 at 02:42:46PM +0530, K Prateek Nayak wrote:
> Hello Aaron,
>
> On 3/14/2025 2:27 PM, Aaron Lu wrote:
> > > >
> > > > +static inline bool task_has_throttle_work(struct task_struct *p)
> > > > +{
> > > > + return p->sched_throttle_work.next != &p->sched_throttle_work;
> > > > +}
> > > > +
> > > > +static inline void task_throttle_setup_work(struct task_struct *p)
> > > > +{
> > > > + /*
> > > > + * Kthreads and exiting tasks don't return to userspace, so adding the
> > > > + * work is pointless
> > > > + */
> > > > + if ((p->flags & (PF_EXITING | PF_KTHREAD)))
> > > > + return;
> > > > +
> > > > + if (task_has_throttle_work(p))
> > > > + return;
> > > > +
> > > > + task_work_add(p, &p->sched_throttle_work, TWA_RESUME);
> > > > +}
> > > > +
> > > > static int tg_throttle_down(struct task_group *tg, void *data)
> > > > {
> > > > struct rq *rq = data;
> > > > struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
> > > > + struct task_struct *p;
> > > > + struct rb_node *node;
> > > > +
> > > > + cfs_rq->throttle_count++;
> > > > + if (cfs_rq->throttle_count > 1)
> > > > + return 0;
> > >
> > > General question: Do we need the throttled_lb_pair() check in
> > > can_migrate_task() with the per-task throttle? Moving a throttled task
> > > to another CPU can ensures that the task can run quicker and exit to
> > > user space as quickly as possible and once the task dequeues, it will
> > > remove itself from the list of fair tasks making it unreachable for
> > > the load balancer. Thoughts?
> >
> > That's a good point.
> >
> > The current approach dequeued the task and removed it from rq's
> > cfs_tasks list, causing it lose the load balance opportunity. This is
> > pretty sad.
>
> That is fine. Today we have the throttled_lb_pair() check since tasks
> on throttled hierarchy remain on the fair tasks list and the load
> balancer should not move the around since they don't contribute to
> current load in throttled state and moving them around will not change
> anything since they'll still be waiting on another CPU for unthrottle.
OK I misunderstood. I thought tasks in throttled hierarchy still has
chance to participate load balance.
> With per-task throttle, we know that all the tasks on the fair task
> list are runnable (except for the delayed ones but they contribute to
> the load) - tasks on throttled hierarchy that exit to userspace will
> dequeue themselves, removing them from the fair task list too.
>
> Since a task that hasn't dequeued itself on a throttled hierarchy is
> runnable, I'm suggesting we get rid of the throttled_lb_pair() check
> in can_migrate_task() entirely.
Agree, will fix this in next version, thanks.
Best regards,
Aaron
> >
> > I'll need to think about this. I hope we can somehow keep the throttled
> > tasks in cfs_tasks list, I'll see how to make this happen.
> >
> > Thanks,
> > Aaron
> >
>
> [..snip..]
>
> --
> Thanks and Regards,
> Prateek
>