Re: [PATCH 1/4] sched/fair: Optimize sum computation with a lookup table

From: Yuyang Du
Date: Tue Apr 12 2016 - 21:50:25 EST


On Tue, Apr 12, 2016 at 11:14:13AM +0100, Juri Lelli wrote:
> On 12/04/16 03:12, Yuyang Du wrote:
> > On Mon, Apr 11, 2016 at 11:41:28AM +0100, Juri Lelli wrote:
> > > Hi,
> > >
> > > On 11/04/16 06:36, Yuyang Du wrote:
> > > > __compute_runnable_contrib() uses a loop to compute sum, whereas a
> > > > table loopup can do it faster in a constant time.
> > > >
> > > > The following python script can be used to generate the constants:
> > > >
> > > > print " #: yN_inv yN_sum"
> > > > print "-----------------------"
> > > > y = (0.5)**(1/32.0)
> > > > x = 2**32
> > > > xx = 1024
> > > > for i in range(0, 32):
> > > > if i == 0:
> > > > x = x-1
> > > > xx = xx*y
> > > > else:
> > > > x = x*y
> > > > xx = int(xx*y + 1024*y)
> > > > print "%2d: %#x %8d" % (i, int(x), int(xx))
> > > >
> > > > print " #: sum_N32"
> > > > print "------------"
> > > > xxx = xx
> > > > for i in range(0, 11):
> > > > if i == 0:
> > > > xxx = xx
> > > > else:
> > > > xxx = xxx/2 + xx
> > > > print "%2d: %8d" % (i, xxx)
> > > >
> > >
> > > Thanks for the script, really useful. Do you think there is value in
> > > making it general? Like if we want to play with/need changing LOAD_AVG_
> > > PERIOD in the future to something different than 32.
> >
> > i think a s/32/xx/ should work.
> >
> > > Also, does the following assume LOAD_AVG_PERIOD == 32? And if yes, do
> > > you think there is any value in removing that assumption?
> >
> > Like Peter said, we are heavily dependent on it already.
>
> But I think the current code should still work if we define LOAD_AVG_
> PERIOD as, say, 16 and we use Paul's program to recompute the tables.
>
> My point was about trying to keep everything related to LOAD_AVG_PERIOD
> and not start assuming it is 32. I'm not saying your changes assume
> that, I was asking if they do.

Oh, then my changes do not make more or less dependency. The entire avg thing
should only have two seeds (and all others depend on them):

(1) a period is 1024*1024ns
(2) a half-life is 32 periods

I'll check if there is anything hard-coded other than the two.