Re: [PATCH] sched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accounting

From: Matt Fleming
Date: Wed Feb 15 2017 - 11:16:27 EST


On Wed, 15 Feb, at 04:12:11PM, Frederic Weisbecker wrote:
> On Wed, Feb 08, 2017 at 01:29:24PM +0000, Matt Fleming wrote:
> > The calculation for the next sample window when exiting NO_HZ idle
> > does not handle the fact that we may not have reached the next sample
> > window yet
>
> That sentence is hard to parse, it took me some time to figure out that
> those two "next sample window" may not refer to the same thing.

Yeah, it's not the most lucid thing I've ever written.

> Maybe it would be clearer with something along the lines of:
>
> "The calculation for the next sample window when exiting NO_HZ
> does not handle the fact that we may not have crossed any sample
> window during the NO_HZ period."

Umm... this isn't the problem. In fact, it's the opposite.

The problem is that if we *did* cross a sample window while in NO_HZ,
then by the time we exit, the pending window may be far enough into the
future that all we need to do is update this_rq->calc_load_update.
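
To make that concrete, here is a rough userspace sketch of the two
behaviours. It is not the kernel code: the helper names, the plain
unsigned long "jiffies" arguments and the numbers are made up for
illustration, and it assumes the global window has already been
advanced by whoever did the update while we were idle.

```c
#include <stdbool.h>

/* Illustrative sample period in ticks (the kernel uses 5*HZ+1). */
#define LOAD_FREQ 5001UL

/* Wrap-safe "a is earlier than b", in the spirit of time_before(). */
static bool before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * Hypothetical model of the problematic behaviour: once we are past the
 * window that was pending when we went idle, resync to the global window
 * and unconditionally add LOAD_FREQ on top.
 */
static unsigned long next_update_naive(unsigned long now,
				       unsigned long rq_update,
				       unsigned long global_update)
{
	if (before(now, rq_update))
		return rq_update;	/* pending window not reached yet */
	return global_update + LOAD_FREQ;
}

/*
 * Hypothetical model of the intended behaviour: if the global window is
 * still in the future, copying it is all that is needed.
 */
static unsigned long next_update_resync(unsigned long now,
					unsigned long rq_update,
					unsigned long global_update)
{
	if (before(now, rq_update))
		return rq_update;	/* pending window not reached yet */
	if (before(now, global_update))
		return global_update;	/* crossed a window while idle */
	return global_update + LOAD_FREQ; /* inside or after the window */
}
```

With a window pending at 10000 when we went idle, the global window
already moved to 15001, and a wakeup at 12000, the first helper
schedules the next update at 20002 (two windows out) while the second
schedules it at 15001.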

> > If we wake from NO_HZ idle after the pending this_rq->calc_load_update
> > window time when we went idle but before the next sample window
>
> That too was hard to understand. How about:
>
> "If we enter in NO_HZ mode after a pending this_rq->calc_load_update
> and we exit from NO_HZ mode before the forthcoming sample window, ..."

You've got this backwards again. We enter NO_HZ before the pending
window, not after.