RE: [PATCH] timekeeping: Change type of nsec variable to unsigned in its calculation.

From: Thomas Gleixner
Date: Tue Sep 27 2016 - 10:21:26 EST


On Tue, 27 Sep 2016, Liav Rehana wrote:
> -----Original Message-----
.....
>

Can you pretty please get rid of this --Original Message-- crap. There is
absolutely no value in copying the mail header into every reply.

> On Mon, 26 Sep 2016, Liav Rehana wrote:
> >> During the calculation of the nsec variable in the inline function
> >> timekeeping_delta_to_ns, it may undergo a sign extension if its msb is
> >> set just before the shift. The sign extension may, in some cases, gain
> >> it a value near the maximum value of the 64-bit range. This is bad
> >> when it is later used in a division function, such as
> >> __iter_div_u64_rem, where the amount of loops it will go through to
> >> calculate the division will be too large.
> >> The following commit fixes that chance of sign extension,
> >
> > Again. This does not fix anything, it papers over the underlying problem
> > that the calling code hands in a delta which is big enough to overflow
> > the multiplication into the negative space. >You just extend the range of
> > deltas which are handled gracefully, but that does not fix the underlying
> > problem as we still can run into the multiplication overflow. It won't
> > cause the result to be negative, but it will be crap nevertheless.
> >>
> >> while maintaining the type of the nsec variable as signed for other
> >> functions that use this variable, for possible legit negative time
> >> intervals.
>
> > What is this maintaining? The type of this variable changes to u64 and
> > other functions cannot use this variable at all because it's local to
> > that function. This sentence makes no sense at >all.
>
> About your first note,

Please answer inline right below my answer and not at some remote place
which makes the conversation hard to follow.

> I understand that it doesn't fix the overflow chance, but I'm not quite
> sure what can be done about that. If there was a code in the calling
> function that detects such event, what do you think can be done about it
> ? Would you just print a warning and lower delta as to not overflow after
> the multiplication ? If not, how do you think the problem I ran into can
> be fixed, if not by eliminating the possibility for sign extension the
> way I did ?

The problem you ran into is part of the overall problem. You got lucky that
it just expands into the negative space which can be "fixed" by making the
computation unsigned. I think I explained what needs to be done in some of
the previous mails and it's really not rocket science to deal with it.

> About the other note, I understood from you and John that there are some
> cases where negative time intervals are valid. What I meant by
> 'maintaining the type of the nsec variable as signed' was, that for the
> other functions that call the function I've changed, they do define a
> variable named nsec, and they define it as signed. In their
> implementation they assign him a value that is returned from the function
> I've changed. While the nsec variable is unsigned now in the function
> that calculates it, it can still return a value with an MSB that is set,
> which will be handled as negative in the caller function, where it was
> defined as signed. In that case, the change I added just removes the
> possibility of a sign extension, but the nsec variable will still be
> viewed as negative on the caller functions where it was supposed to
> return a negative value in the function I've changed.

This is complete and utter bullshit. How on earth would you end up with a
sensible result when delta is negative?

<Basic C course>

With nsec being unsigend the result after the shift is guaranteed to have
the N topmost bits to be 0, where N = tkr->shift. And that does not get
magically get the MSB set just because you assign the u64 to a s64 at the
call site.

</Basic C course>

Thanks,

tglx