Re: [patch 13/20] timer: Switch to a non cascading wheel

From: Peter Zijlstra
Date: Tue Jun 14 2016 - 06:20:32 EST


On Tue, Jun 14, 2016 at 04:16:02AM -0400, George Spelvin wrote:
> While I like the cleanup of just limiting long-term resolution, if
> it turns out to be necessary, it's not too hard to add exact timers
> back in if a need is found in future. All you need is a second
> __internal_add_timer function that rounds down rather than up, and to
> teach expire_timers() to cascade in the unlikely situation that a timer
> does have an expiry time in the future.

That did occur to me as well; however I think it would be best to
eradicate all forms of cascading entirely -- if at all possible.

If not, then I agree that would clean things up.
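
For reference, the round-down placement George describes could be little
more than a sibling of the calc_index() helper he proposes below; the
name calc_index_down is hypothetical and this is only a sketch, not part
of the patch:

static inline unsigned calc_index_down(unsigned expires, unsigned level)
{
	/* Round down: the bucket fires no later than the expiry time. */
	return LVL_OFFS(level) + ((expires >> LVL_SHIFT(level)) & LVL_MASK);
}

expire_timers() would then have to compare each timer it pulls off a
bucket against jiffies and re-queue (cascade) the ones that are not yet
due.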

> Wouldn't this all be so much simpler as
>
> #define LVL_BITS 6 /* Renamed previous LVL_SHIFT */
> #define LVL_SIZE (1 << LVL_BITS)
> #define LVL_MASK (LVL_SIZE - 1)
> #define LVL_OFFS(n) ((n) * LVL_SIZE)
> #define LVL_SHIFT(n) ((n) * LVL_CLK_SHIFT)
> #define LVL_GRAN(n) (1 << LVL_SHIFT(n))
>
> Then you could do
> +static inline unsigned calc_index(unsigned expires, unsigned level)
> +{
> + /* Round up to next bin */
> + expires = ((expires - 1) >> LVL_SHIFT(level)) + 1;
> + return LVL_OFFS(level) + (expires & LVL_MASK);
> +}

I like.
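
A quick worked example to make the rounding concrete (the LVL_CLK_SHIFT
value of 3 and the userspace harness are assumptions, just for
illustration):

#include <stdio.h>

#define LVL_CLK_SHIFT	3		/* assumed */
#define LVL_BITS	6
#define LVL_SIZE	(1 << LVL_BITS)
#define LVL_MASK	(LVL_SIZE - 1)
#define LVL_OFFS(n)	((n) * LVL_SIZE)
#define LVL_SHIFT(n)	((n) * LVL_CLK_SHIFT)

static inline unsigned calc_index(unsigned expires, unsigned level)
{
	/* Round up to next bin */
	expires = ((expires - 1) >> LVL_SHIFT(level)) + 1;
	return LVL_OFFS(level) + (expires & LVL_MASK);
}

int main(void)
{
	/*
	 * expires = 100 at level 1 (granularity 8) rounds up to bin 13,
	 * so the array index is LVL_OFFS(1) + 13 = 64 + 13 = 77.
	 */
	printf("%u\n", calc_index(100, 1));	/* prints 77 */
	return 0;
}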

> [... allowing the if/else level selection] to be replaced with __builtin_clz or similar:

The problem is the archs that don't have that: the 5-layer branch is
trivial on all archs, while a software clz/fls is far more expensive.
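
For completeness, the fls-based level selection being discussed might
look roughly like this; it is a sketch that assumes purely power-of-two
level boundaries (LVL_SIZE << LVL_SHIFT(n)), so the exact cut-over
points would differ slightly from a threshold-based branch chain, and
the caller still has to clamp at the top level:

static inline unsigned calc_level(unsigned long delta)
{
	/* Position of the highest set bit determines the level. */
	unsigned bits = fls_long(delta | 1);

	if (bits <= LVL_BITS)
		return 0;
	return (bits - LVL_BITS + LVL_CLK_SHIFT - 1) / LVL_CLK_SHIFT;
}

On an arch without a hardware instruction, fls_long() falls back to the
generic software version, which is exactly where the extra cost comes
from.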

> > + timer = hlist_entry(head->first, struct timer_list, entry);
> > + fn = timer->function;
> > + data = timer->data;
> > +
> > + timer_stats_account_timer(timer);
> > +
> > + base->running_timer = timer;
> > + detach_expired_timer(timer);
>
> Is there some non-obvious reason that you have to fetch fn and data
> so early? It seems like a register pressure pessimization, if the
> compiler can't figure out that timer_stats code can't change them.
>
> The cache line containing this timer was already prefetched when you
> updated its entry.pprev as part of removing the previous entry from
> the list.
>
> I see why you want to fetch them with the lock held in case there's some
> freaky race, but I'd do it all after detach_timer().

Good point. Ideally the compiler could move those loads around inside the
lock, but it's unlikely to be _that_ clever. We could indeed lower those
loads manually to just before the unlock.
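
Something like the following, i.e. the quoted hunk with the two loads
lowered (untested sketch, still under the base lock):

	timer = hlist_entry(head->first, struct timer_list, entry);

	timer_stats_account_timer(timer);

	base->running_timer = timer;
	detach_expired_timer(timer);

	/*
	 * Loads lowered: the timer's cache line is already hot from the
	 * detach, and fn/data no longer need to stay live across the
	 * calls above.
	 */
	fn = timer->function;
	data = timer->data;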