Re: [PATCH v4] sched: fix incorrect wait time and wait count statistics
From: Joonwoo Park
Date: Fri Nov 06 2015 - 21:41:52 EST
On Fri, Nov 06, 2015 at 02:57:49PM +0100, Peter Zijlstra wrote:
> On Tue, Oct 27, 2015 at 09:46:53PM -0700, Joonwoo Park wrote:
> > @@ -1272,6 +1272,15 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
> > WARN_ON_ONCE(p->state != TASK_RUNNING && p->state != TASK_WAKING &&
> > !p->on_rq);
> >
> > + /*
> > + * Migrating fair class task must have p->on_rq = TASK_ON_RQ_MIGRATING,
> > + * because schedstat_wait_{start,end} rebase migrating task's wait_start
> > + * time relying on p->on_rq.
> > + */
> > + WARN_ON_ONCE(p->state == TASK_RUNNING &&
> > + p->sched_class == &fair_sched_class &&
> > + (p->on_rq && !task_on_rq_migrating(p)));
> > +
>
> Why do we have to test p->on_rq? Would not ->state == RUNNING imply
> that?
>
sched_fork() sets p->state = RUNNING before changing task cpu.
Please let me know if you got better idea.
> > +++ b/kernel/sched/fair.c
> > @@ -737,41 +737,69 @@ static void update_curr_fair(struct rq *rq)
> > update_curr(cfs_rq_of(&rq->curr->se));
> > }
> >
> > +#ifdef CONFIG_SCHEDSTATS
> > static inline void
> > update_stats_wait_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > {
> > + u64 wait_start = rq_clock(rq_of(cfs_rq));
> >
> > + if (entity_is_task(se) && task_on_rq_migrating(task_of(se)) &&
> > + likely(wait_start > se->statistics.wait_start))
> > + wait_start -= se->statistics.wait_start;
> > +
> > + schedstat_set(se->statistics.wait_start, wait_start);
> > }
> >
> > static void
> > update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > {
>
> Since this is now all under CONFIG_SCHEDSTAT, would it not make sense
> to do something like:
>
> u64 now = rq_clock(rq_of(cfs_rq));
>
> to avoid the endless calling of that function?
>
> Also, for that very same reason; would it not make sense to drop the
> schedstat_set() usage below, that would greatly enhance readability.
>
Agreed.
> > + if (entity_is_task(se) && task_on_rq_migrating(task_of(se))) {
> > + /*
> > + * Preserve migrating task's wait time so wait_start time stamp
> > + * can be adjusted to accumulate wait time prior to migration.
> > + */
> > + schedstat_set(se->statistics.wait_start,
> > + rq_clock(rq_of(cfs_rq)) -
> > + se->statistics.wait_start);
> > + return;
> > + }
> > +
> > schedstat_set(se->statistics.wait_max, max(se->statistics.wait_max,
> > rq_clock(rq_of(cfs_rq)) - se->statistics.wait_start));
> > schedstat_set(se->statistics.wait_count, se->statistics.wait_count + 1);
> > schedstat_set(se->statistics.wait_sum, se->statistics.wait_sum +
> > rq_clock(rq_of(cfs_rq)) - se->statistics.wait_start);
> > +
> > if (entity_is_task(se)) {
> > trace_sched_stat_wait(task_of(se),
> > rq_clock(rq_of(cfs_rq)) - se->statistics.wait_start);
> > }
>
> Is there no means of collapsing the two 'entity_is_task()' branches?
>
Agreed. Will spin v5 with these clean up.
Thanks,
Joonwoo
> > schedstat_set(se->statistics.wait_start, 0);
> > }
> > +#else
> > +static inline void
> > +update_stats_wait_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > +{
> > +}
> > +
> > +static inline void
> > +update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > +{
> > +}
> > +#endif
--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/