Re: [PATCH v3 0/2] sched: Minor changes for rd->overload access
From: Ingo Molnar
Date: Fri Mar 29 2024 - 02:55:49 EST
* Shrikanth Hegde <sshegde@xxxxxxxxxxxxx> wrote:
>
>
> On 3/28/24 4:37 PM, Ingo Molnar wrote:
> >
> > * Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> >
> >> Plus I've applied a patch to rename ::overload to ::overloaded. It is
> >> silly to use an ambiguous noun instead of a clear adjective when naming
> >> such a flag ...
> >
> > Plus SG_OVERLOAD should be SG_OVERLOADED as well - it now looks in line
> > with SG_OVERUTILIZED:
> >
> > /* Scheduling group status flags */
> > #define SG_OVERLOADED 0x1 /* More than one runnable task on a CPU. */
> > #define SG_OVERUTILIZED 0x2 /* One or more CPUs are over-utilized. */
> >
> > My followup question is: why are these a bitmask, why not separate
> > flags?
> >
> > AFAICS we only ever set them separately:
> >
> > thule:~/tip> git grep SG_OVER kernel/sched/
> > kernel/sched/fair.c: set_rd_overutilized_status(rq->rd, SG_OVERUTILIZED);
> > kernel/sched/fair.c: *sg_status |= SG_OVERLOADED;
> > kernel/sched/fair.c: *sg_status |= SG_OVERUTILIZED;
> > kernel/sched/fair.c: *sg_status |= SG_OVERLOADED;
> > kernel/sched/fair.c: set_rd_overloaded(env->dst_rq->rd, sg_status & SG_OVERLOADED);
> > kernel/sched/fair.c: sg_status & SG_OVERUTILIZED);
> > kernel/sched/fair.c: } else if (sg_status & SG_OVERUTILIZED) {
> > kernel/sched/fair.c: set_rd_overutilized_status(env->dst_rq->rd, SG_OVERUTILIZED);
> > kernel/sched/sched.h:#define SG_OVERLOADED 0x1 /* More than one runnable task on a CPU. */
> > kernel/sched/sched.h:#define SG_OVERUTILIZED 0x2 /* One or more CPUs are over-utilized. */
> > kernel/sched/sched.h: set_rd_overloaded(rq->rd, SG_OVERLOADED);
> >
> > In fact this results in suboptimal code:
> >
> > /* update overload indicator if we are at root domain */
> > set_rd_overloaded(env->dst_rq->rd, sg_status & SG_OVERLOADED);
> >
> > /* Update over-utilization (tipping point, U >= 0) indicator */
> > set_rd_overutilized_status(env->dst_rq->rd,
> > sg_status & SG_OVERUTILIZED);
> >
> > Note how the bits that got mixed together in sg_status now have to be
> > masked out individually.
> >
> > The sg_status bitmask appears to make no sense at all to me.
> >
> > By turning these into individual bool flags we could also do away with
> > all the extra SG_OVERLOADED/SG_OVERUTILIZED abstraction.
> >
> > Ie. something like the patch below? Untested.
>
> Looks good. I see it is merged to sched/core.
> Did a boot with that patch, and hackbench shows the same results on a 320-CPU system.
Thanks, I've added:
Acked-by: Shrikanth Hegde <sshegde@xxxxxxxxxxxxx>
Tested-by: Shrikanth Hegde <sshegde@xxxxxxxxxxxxx>
And applied the additional docbook fix below on top as well.
Thanks,
Ingo
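
For readers following the bitmask-vs-bool argument quoted above, here is a
minimal standalone sketch of the two styles. It is not the kernel code:
struct toy_root_domain, the TOY_SG_* constants and both toy_update_*()
helpers are hypothetical stand-ins, and the "overloaded" condition is
simplified to "more than one runnable task", following the comment on
SG_OVERLOADED.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two root-domain flags discussed above. */
struct toy_root_domain {
	bool overloaded;
	bool overutilized;
};

/* Hypothetical equivalents of SG_OVERLOADED / SG_OVERUTILIZED. */
#define TOY_SG_OVERLOADED	0x1
#define TOY_SG_OVERUTILIZED	0x2

/*
 * Bitmask style: both conditions are packed into one status word and
 * then have to be masked out again, one by one, at the point of use.
 */
static void toy_update_bitmask(struct toy_root_domain *rd,
			       unsigned int nr_running,
			       bool cpu_overutilized)
{
	int sg_status = 0;

	if (nr_running > 1)
		sg_status |= TOY_SG_OVERLOADED;
	if (cpu_overutilized)
		sg_status |= TOY_SG_OVERUTILIZED;

	/* Conversion to bool yields true for any non-zero masked value. */
	rd->overloaded   = sg_status & TOY_SG_OVERLOADED;
	rd->overutilized = sg_status & TOY_SG_OVERUTILIZED;
}

/*
 * Separate-bool style: each condition lives in its own variable, so no
 * constants and no masking are needed.
 */
static void toy_update_bools(struct toy_root_domain *rd,
			     unsigned int nr_running,
			     bool cpu_overutilized)
{
	bool sg_overloaded   = nr_running > 1;
	bool sg_overutilized = cpu_overutilized;

	rd->overloaded   = sg_overloaded;
	rd->overutilized = sg_overutilized;
}
```

Both helpers leave the toy root domain in the same state for the same
inputs; the second one just gets there without the pack-then-unmask step,
which is the point the quoted text makes about sg_status.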
=================>
kernel/sched/fair.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ebc8d5f855de..1dd37168da50 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9933,7 +9933,8 @@ sched_reduced_capacity(struct rq *rq, struct sched_domain *sd)
* @sds: Load-balancing data with statistics of the local group.
* @group: sched_group whose statistics are to be updated.
* @sgs: variable to hold the statistics for this group.
- * @sg_status: Holds flag indicating the status of the sched_group
+ * @sg_overloaded: sched_group is overloaded
+ * @sg_overutilized: sched_group is overutilized
*/
static inline void update_sg_lb_stats(struct lb_env *env,
struct sd_lb_stats *sds,