Re: [PATCH v5 1/4] sched/fair: add util_est on top of PELT

From: Peter Zijlstra
Date: Tue Mar 06 2018 - 13:59:06 EST


On Thu, Feb 22, 2018 at 05:01:50PM +0000, Patrick Bellasi wrote:
> +static inline void util_est_enqueue(struct cfs_rq *cfs_rq,
> +				    struct task_struct *p)
> +{
> +	unsigned int enqueued;
> +
> +	if (!sched_feat(UTIL_EST))
> +		return;
> +
> +	/* Update root cfs_rq's estimated utilization */
> +	enqueued  = READ_ONCE(cfs_rq->avg.util_est.enqueued);
> +	enqueued += _task_util_est(p);
> +	WRITE_ONCE(cfs_rq->avg.util_est.enqueued, enqueued);
> +}

> +static inline void util_est_dequeue(struct cfs_rq *cfs_rq,
> +				    struct task_struct *p,
> +				    bool task_sleep)
> +{
> +	long last_ewma_diff;
> +	struct util_est ue;
> +
> +	if (!sched_feat(UTIL_EST))
> +		return;
> +
> +	/*
> +	 * Update root cfs_rq's estimated utilization
> +	 *
> +	 * If *p is the last task then the root cfs_rq's estimated utilization
> +	 * of a CPU is 0 by definition.
> +	 */
> +	ue.enqueued = 0;
> +	if (cfs_rq->nr_running) {
> +		ue.enqueued  = READ_ONCE(cfs_rq->avg.util_est.enqueued);
> +		ue.enqueued -= min_t(unsigned int, ue.enqueued,
> +				     _task_util_est(p));
> +	}
> +	WRITE_ONCE(cfs_rq->avg.util_est.enqueued, ue.enqueued);

It appears to me this isn't a stable situation and completely relies on
the !nr_running case to recalibrate. If we ensure that doesn't happen
for a significant while, the sum can run away, right?

Should we put a max in enqueue to avoid this?
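
Something like the below, say; a rough sketch only, with
SCHED_CAPACITY_SCALE as an assumed bound (capacity_orig_of() of the
CPU this cfs_rq belongs to would be a tighter cap):

static inline void util_est_enqueue(struct cfs_rq *cfs_rq,
				    struct task_struct *p)
{
	unsigned int enqueued;

	if (!sched_feat(UTIL_EST))
		return;

	/* Update root cfs_rq's estimated utilization */
	enqueued  = READ_ONCE(cfs_rq->avg.util_est.enqueued);
	enqueued += _task_util_est(p);
	/*
	 * Assumed bound: clamp the sum so that any error accumulated
	 * by the asymmetric (clamped) subtraction in util_est_dequeue()
	 * cannot run away while nr_running stays non-zero.
	 */
	enqueued  = min_t(unsigned int, enqueued, SCHED_CAPACITY_SCALE);
	WRITE_ONCE(cfs_rq->avg.util_est.enqueued, enqueued);
}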