Re: [RFC PATCH v3 06/10] sched: Add over-utilization/tipping point indicator

From: Dietmar Eggemann
Date: Tue Jun 19 2018 - 06:26:48 EST


On 06/19/2018 09:01 AM, Pavan Kondeti wrote:
On Mon, May 21, 2018 at 03:25:01PM +0100, Quentin Perret wrote:

[...]

@@ -8152,6 +8176,9 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 		if (nr_running > 1)
 			*overload = true;
+		if (cpu_overutilized(i))
+			*overutilized = 1;
+

There is no need to check if every CPU is overutilized or not once
*overutilized is marked as true, right?

True, so you want to check *overutilized before calling cpu_overutilized() to save a little bit on compute?
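
A minimal sketch of that short-circuit, as a standalone userspace model rather than the actual patch. The struct, the `_sketch` helpers, the call counter and the 80% threshold are all assumptions made for illustration; only the `!*overutilized &&` guard reflects the suggestion above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-CPU data; not the kernel's structures. */
struct cpu_sample {
	unsigned long util;     /* current utilization */
	unsigned long capacity; /* capacity of the CPU */
};

/* Counts how often the check runs; exists only to show the saving. */
static unsigned int overutil_checks;

/* Assumed threshold: utilization above 80% of capacity. */
static bool cpu_overutilized_sketch(const struct cpu_sample *c)
{
	overutil_checks++;
	return c->util * 5 > c->capacity * 4;
}

/* Walk the CPUs of a group; stop evaluating once the flag is set. */
static void update_group_stats_sketch(const struct cpu_sample *cpus,
				      size_t nr_cpus, bool *overutilized)
{
	for (size_t i = 0; i < nr_cpus; i++) {
		/* Short-circuit: the helper is not called once *overutilized is true. */
		if (!*overutilized && cpu_overutilized_sketch(&cpus[i]))
			*overutilized = true;
	}
}
```

With four CPUs of which the second is overutilized, the helper runs twice instead of four times. The saving per CPU is small, but the loop runs on every stats update.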

[...]

@@ -8586,6 +8621,10 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
 	 * this level.
 	 */
 	update_sd_lb_stats(env, &sds);
+
+	if (sched_energy_enabled() && !READ_ONCE(env->dst_rq->rd->overutilized))
+		goto out_balanced;
+

Is there any reason for sending no-hz idle kicks but bailing out here when
system is not overutilized?

Even if the system is not overutilized, we want to update stale CPU blocked load and utilization, so NOHZ_STATS_KICK has to get through.

So calling find_busiest_group() -> update_sd_lb_stats() -> update_sg_lb_stats() to possibly execute update_nohz_stats() is IMHO the right thing to do.
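
The ordering argument can be modelled roughly as follows. This is a userspace sketch under assumptions, not kernel code: `struct lb_model`, its fields and the `_sketch` functions are hypothetical names. The only point is that the stats refresh runs before the energy-aware bail-out.

```c
#include <stdbool.h>

/* Hypothetical model of the state involved; not the kernel's lb_env. */
struct lb_model {
	bool energy_enabled;   /* models sched_energy_enabled() */
	bool overutilized;     /* models rd->overutilized */
	bool nohz_stats_kick;  /* a stats-only kick is pending */
	int stats_refreshed;   /* how many times stale stats were refreshed */
};

/* Models the stats walk: refreshes stale blocked load if a kick is pending. */
static void update_stats_sketch(struct lb_model *m)
{
	if (m->nohz_stats_kick)
		m->stats_refreshed++;
}

/* Returns true if balancing should go on to look for a busiest group. */
static bool find_busiest_sketch(struct lb_model *m)
{
	/* Stats are always updated first, so the kick gets through. */
	update_stats_sketch(m);

	/* Only afterwards bail out when the system is not overutilized. */
	if (m->energy_enabled && !m->overutilized)
		return false;

	return true;
}
```

In this model, a call with energy awareness enabled and the system not overutilized returns false (balanced) but still refreshes the stale stats once.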