Re: [PATCH] Fix chance of sign extension to nsec after its msb is set during calculation.
From: John Stultz
Date: Tue Sep 06 2016 - 23:22:26 EST
On Fri, Sep 2, 2016 at 11:30 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Fri, 2 Sep 2016, Thomas Gleixner wrote:
>> On Thu, 1 Sep 2016, Liav Rehana wrote:
>> > From: Liav Rehana <liavr@xxxxxxxxxxxx>
>> >
>> > During the calculation of the nsec variable, "delta * tkr->mult" may cause
>> > overflow to the msb, if the suspended time is too long.
>> > In that case, we need to guarantee that the variable will not go through a
>> > sign extension during its shift, and thus it will result in a much higher
>> > value - close to the larget value of 64 bits.
>> > The following commit fixes this problem, which causes the following bug:
>> > Trying to connect through ftp to the os after a long enough suspended time
>> > will cause the nsec variable to get a much higher value after its shift
>> > because of sign extension, and thus the loop that follows some instructions
>> > afterwards, implemented in the inline function __iter_div_u64_rem, will
>> > take too long.
>> >
>> > Signed-off-by: Liav Rehana <liavr@xxxxxxxxxxxx>
>> > ---
>> > kernel/time/timekeeping.c | 2 +-
>> > 1 files changed, 1 insertions(+), 1 deletions(-)
>> >
>> > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
>> > index 479d25c..ddf56a5 100644
>> > --- a/kernel/time/timekeeping.c
>> > +++ b/kernel/time/timekeeping.c
>> > @@ -305,7 +305,7 @@ static inline s64 timekeeping_delta_to_ns(struct tk_read_base *tkr,
>> > s64 nsec;
>> >
>> > nsec = delta * tkr->mult + tkr->xtime_nsec;
>> > - nsec >>= tkr->shift;
>> > + nsec = ((u64) nsec) >> tkr->shift;
>>
>> This typecast is just a baindaid. What happens if you double the suspend time?
>> The multiplication will simply overflow. So the proper fix is to sanity check
>> delta and do multiple conversions if delta is big enough. Preferrably this
>> happens somewhere at the call site and not in this hotpath function.
>
> As a side note. John, why is that stuff unsigned at all? Shouldn't we use
> u64 for all of this?
Errrr... My memory is quite foggy here. I think we just started way
back when with s64 to catch cases where there were negative nsec
intervals. Looking through the git logs it seems its been that way
since the beginning of the generic timekeeping logic.
For most cases here u64 is fine. There may be a few cases where we do
have valid negative nanosecond intervals, but I can't think of any off
the top of my head, and those can probably be special cased.
thanks
-john