Re: [PATCH] Fix chance of sign extension to nsec after its msb is set during calculation.

From: John Stultz
Date: Wed Sep 21 2016 - 17:12:40 EST


On Sat, Sep 10, 2016 at 11:31 PM, Liav Rehana <liavr@xxxxxxxxxxxx> wrote:
>>> > > During the calculation of the nsec variable, "delta * tkr->mult"
>> >> > may cause overflow to the msb, if the suspended time is too long.
>> >> > In that case, we need to guarantee that the variable will not go
>> >> > through a sign extension during its shift, and thus it will result
>> >> > in a much higher value - close to the larget value of 64 bits.
>> >> > The following commit fixes this problem, which causes the following bug:
>> >> > Trying to connect through ftp to the os after a long enough
>> >> > suspended time will cause the nsec variable to get a much higher
>> >> > value after its shift because of sign extension, and thus the loop
>> >> > that follows some instructions afterwards, implemented in the
>> >> > inline function __iter_div_u64_rem, will take too long.
>> >> >
>> >> > Signed-off-by: Liav Rehana <liavr@xxxxxxxxxxxx>
>> >> > ---
>> >> > kernel/time/timekeeping.c | 2 +-
>> >> > 1 files changed, 1 insertions(+), 1 deletions(-)
>> >> >
>> >> > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
>> >> > index 479d25c..ddf56a5 100644
>> >> > --- a/kernel/time/timekeeping.c
>> >> > +++ b/kernel/time/timekeeping.c
>> >> > @@ -305,7 +305,7 @@ static inline s64 timekeeping_delta_to_ns(struct tk_read_base *tkr,
>> >> > s64 nsec;
>> >> >
>> >> > nsec = delta * tkr->mult + tkr->xtime_nsec;
>> >> > - nsec >>= tkr->shift;
>> >> > + nsec = ((u64) nsec) >> tkr->shift;
>> >>
>> >> This typecast is just a baindaid. What happens if you double the suspend time?
>> >> The multiplication will simply overflow. So the proper fix is to
>> >> sanity check delta and do multiple conversions if delta is big
>> >> enough. Preferrably this happens somewhere at the call site and not in this hotpath function.
>> >
>> > As a side note. John, why is that stuff unsigned at all? Shouldn't we
>> > use
>> > u64 for all of this?
>>
>>Errrr... My memory is quite foggy here. I think we just started way back when with s64 to catch cases where there were negative nsec intervals. Looking through the git logs it seems its > been that way since the beginning of the generic timekeeping logic.
>>
>>For most cases here u64 is fine. There may be a few cases where we do have valid negative nanosecond intervals, but I can't think of any off the top of my head, and those can >probably be special cased.
>
> In light of your comment for that issue, I would like to change the type of the nsec variable to u64, as it will solve the sign extension problem. For that, I intend to change the type
> of that variable in the functions that define it, and in the struct that uses it in kernel/time/timekeeping.c.
> Do you think there are other references I should change. Or do you think there is a better solution ? Please provide your opinion on this matter.

So no major objections, but when making the changes do keep an eye for
potential valid negative intervals. And if possible break it up into
smallish patches so its easier to review/audit.

But Thomas' point holds still, there are cases where if we get
intervals that are far too big, those need to be caught and called out
rather then just allowing unsigned overflow.

thanks
-john