Re: [PATCH] timers: Reconcile the code and the comment for the 250HZ case

From: Thomas Gleixner
Date: Mon Jan 23 2017 - 06:11:05 EST


On Sat, 21 Jan 2017, Zhihui Zhang wrote:

> Sure, I believe that comments should always match the code. In this

That's fine.

> case, using either LVL_SIZE - 1 or LVL_SIZE is fine based on my
> understanding about 20 days ago. But I could be wrong and miss some
> subtle details. Anyway, my point is about readability.

Well, readability is one thing, but correctness is more important, right?

Let's assume we have 4 buckets per level and base->clk is 0. So Level 0
has the following expiry times:

Bucket 0: base->clk + 0
Bucket 1: base->clk + 1
Bucket 2: base->clk + 2
Bucket 3: base->clk + 3

So we can accomodate 4 timers here, but there is a nifty detail. We
guarantee that expiries are not short, so a timer armed for base->clk
will expire at base->clk + 1.

The reason for this is that we have no distinction between absolute and
relative timeouts. But for relative timeouts we have to guarantee that the
timeout does not expire before the number of jiffies has elapsed.

Now a timer armed with 1 jiffie relativ to now (jiffies) cannot be queued
to bucket 0 because jiffies can be incremented immediately after queueing
the timer which would expire it early. So it's queued to bucket 1 and
that's why we need to have LVL_SIZE - 1 and not LVL_SIZE. See also
calc_index().

Your change completely breaks the wheel. Let's assume the above and a
timer expiring at base->clk + 3.

With your change the timer would fall into Level 0. So no calc_index()
does:

expires = (expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl);
return LVL_OFFS(lvl) + (expires & LVL_MASK);

Let's substitute that for the expires = base->clk + 3 case:

expires = (base->clk + 3 + 1) >> 0;

---> expires = 4;

return 0 + (4 & 0x03);

---> index = 0

So the timer gets queued into bucket 0 and expires 4 jiffies too early.

So using either LVL_SIZE - 1 or LVL_SIZE is _NOT_ fine.

Thanks,

tglx