Re: [PATCH] sched/cputime: Mitigate performance regression in times()/clock_gettime()
From: Peter Zijlstra
Date: Wed Aug 03 2016 - 06:41:32 EST
On Wed, Aug 03, 2016 at 12:04:52AM +0200, Giovanni Gherdovich wrote:
> > > +#if defined(CONFIG_FAIR_GROUP_SCHED)
> >
> > This here wants a comment on why we're doing this. Because I'm sure
> > that if someone were to read this code in a few weeks they'd go
> > WTF!?
>
> I had that config variable set in the machine I was testing on, and
> thought that for some reason it was related to my observations. I will
> repeat the experiment without it, and if I obtain the same results I
> will drop the conditional. Otherwise I will motivate its necessity.
>
No, I meant we want a comment here explaining the reason for these
prefetches.
You'll need the #ifdef because se->cfs_rq doesn't exist otherwise.
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2998,6 +2998,11 @@ unsigned long long task_sched_runtime(struct
> task_struct *p)
> * thread, breaking clock_gettime().
> */
> if (task_current(rq, p) && task_on_rq_queued(p)) {
> +#if defined(CONFIG_FAIR_GROUP_SCHED)
> + struct sched_entity *curr = (&p->se)->cfs_rq->curr;
> + prefetch(curr);
> + prefetch(&curr->exec_start);
> +#endif
> update_rq_clock(rq);
> p->sched_class->update_curr(rq);
> }
> -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8
>
> I post below the snippets of generated code with and without CSE that
> I got running 'disassemble /m task_sched_runtime' in gdb; you'll see
> they're identical. If you prefer the explicit hint I'll include it in
> v2, but it's probably safe to say it isn't needed.
I much prefer the manual CSE, its much more readable.
Also, maybe pull the whole thing into a helper function with a
descriptive name, like:
/*
* XXX comment on why this is needed goes here...
*/
static inline void prefetch_curr_exec_start(struct task_struct *p)
{
#ifdef CONFIG_FAIR_GROUP_SCHED
struct sched_entity *curr = (&p->se)->cfs_rq->curr;
prefetch(curr);
prefetch(&curr->exec_start);
#endif
}