Re: [RFC][PATCH] Dynamic Tick: Allow 32-bit machines to sleepformorethan2.15 seconds
From: John Stultz
Date: Fri May 15 2009 - 21:18:58 EST
On Fri, 2009-05-15 at 11:35 -0500, Jon Hunter wrote:
> John Stultz wrote:
> >>>> One final question, I noticed in clocksource.h that the definition of
> >>>> function cyc2ns returns a type of s64, however, in the function itself a
> >>>> variable of type u64 is used and returned. Should this function be
> >>>> modified as follows?
> >>>>
> >>>> static inline s64 cyc2ns(struct clocksource *cs, cycle_t cycles)
> >>>> {
> >>>> - u64 ret = (u64)cycles;
> >>>> + s64 ret = (s64)cycles;
> >>>> ret = (ret * cs->mult) >> cs->shift;
> >>>> return ret;
> >>>> }
> >>> Damn. So this brings up an issue I had missed prior.
> >> Any comments on whether this should be u64 versus s64?
> >
> > I'd leave it alone for now. I'm concerns that in large multiplies, if
> > its a s64 the sign might get extended down by the shift. I need to look
> > at it in more detail though.
>
> I have been thinking about this some more and I do agree that there is a
> chance that the multiply could overflow if the "cycles" and "mult" are
> large. From the perspective of the timekeeping_max_deferment() function
> this would be very likely for 64-bit clocksources when the mask will be
> equal to (2^64)-1. Therefore, how about modifying the function as
> follows in order to catch any occurrences of overflow?
>
> Let me know if this is aligned with your thinking or if I am barking up
> the wrong tree here.
So cyc2ns is a very very hot path, so I don't think we want to muck with
that much.
Instead of catching the overflows, and then trying to handle them, we
really want to prevent overflows from happening. That is the main point
of the timekeeping_max_deferrment() interface after all ;)
So thinking about it a bit more, what we really want from
timekeeping_max_deferrment() is roughly:
/* find the max cycle value that would overflow the mult */
max_cycles = -1UUL/clocksource->mult;
/* pick the smaller of max_cycles or the mask value */
max_cycles = min(max_cycles, clocksource->mask);
/* convert max_cycles into ns */
max_time = cyc2ns(clocksource, max_cycles);
/* take off 5% of the max to make sure we don't show up late */
max_time = max_time - max_time/20;
We should be able to make this reasonably fast via:
max_cycles = 1<<(64 - ilog2(clocksource->mult) - 1);
max_cycles = min(max_cycles, clocksource->mask);
max_time = cyc2ns(clocksource, max_cycles);
max_time = max_time - max_time >> 4;
Does that seem reasonable?
thanks
-john
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/