Re: [PATCH v4] sched: fix incorrect wait time and wait count statistics
From: Peter Zijlstra
Date: Fri Nov 06 2015 - 08:57:57 EST
On Tue, Oct 27, 2015 at 09:46:53PM -0700, Joonwoo Park wrote:
> @@ -1272,6 +1272,15 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
> WARN_ON_ONCE(p->state != TASK_RUNNING && p->state != TASK_WAKING &&
> !p->on_rq);
>
> + /*
> + * Migrating fair class task must have p->on_rq = TASK_ON_RQ_MIGRATING,
> + * because schedstat_wait_{start,end} rebase migrating task's wait_start
> + * time relying on p->on_rq.
> + */
> + WARN_ON_ONCE(p->state == TASK_RUNNING &&
> + p->sched_class == &fair_sched_class &&
> + (p->on_rq && !task_on_rq_migrating(p)));
> +
Why do we have to test p->on_rq? Would not ->state == RUNNING imply
that?
> +++ b/kernel/sched/fair.c
> @@ -737,41 +737,69 @@ static void update_curr_fair(struct rq *rq)
> update_curr(cfs_rq_of(&rq->curr->se));
> }
>
> +#ifdef CONFIG_SCHEDSTATS
> static inline void
> update_stats_wait_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
> {
> + u64 wait_start = rq_clock(rq_of(cfs_rq));
>
> + if (entity_is_task(se) && task_on_rq_migrating(task_of(se)) &&
> + likely(wait_start > se->statistics.wait_start))
> + wait_start -= se->statistics.wait_start;
> +
> + schedstat_set(se->statistics.wait_start, wait_start);
> }
>
> static void
> update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
> {
Since this is now all under CONFIG_SCHEDSTAT, would it not make sense
to do something like:
u64 now = rq_clock(rq_of(cfs_rq));
to avoid the endless calling of that function?
Also, for that very same reason; would it not make sense to drop the
schedstat_set() usage below, that would greatly enhance readability.
> + if (entity_is_task(se) && task_on_rq_migrating(task_of(se))) {
> + /*
> + * Preserve migrating task's wait time so wait_start time stamp
> + * can be adjusted to accumulate wait time prior to migration.
> + */
> + schedstat_set(se->statistics.wait_start,
> + rq_clock(rq_of(cfs_rq)) -
> + se->statistics.wait_start);
> + return;
> + }
> +
> schedstat_set(se->statistics.wait_max, max(se->statistics.wait_max,
> rq_clock(rq_of(cfs_rq)) - se->statistics.wait_start));
> schedstat_set(se->statistics.wait_count, se->statistics.wait_count + 1);
> schedstat_set(se->statistics.wait_sum, se->statistics.wait_sum +
> rq_clock(rq_of(cfs_rq)) - se->statistics.wait_start);
> +
> if (entity_is_task(se)) {
> trace_sched_stat_wait(task_of(se),
> rq_clock(rq_of(cfs_rq)) - se->statistics.wait_start);
> }
Is there no means of collapsing the two 'entity_is_task()' branches?
> schedstat_set(se->statistics.wait_start, 0);
> }
> +#else
> +static inline void
> +update_stats_wait_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
> +{
> +}
> +
> +static inline void
> +update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
> +{
> +}
> +#endif
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/