Re: [PATCH] time: Avoid signed overflow in timekeeping_delta_to_ns()

From: John Stultz
Date: Tue Nov 15 2016 - 17:03:43 EST


On Tue, Nov 15, 2016 at 1:53 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Mon, 14 Nov 2016, John Stultz wrote:
>
>> On Mon, Nov 14, 2016 at 11:42 AM, Chris Metcalf <cmetcalf@xxxxxxxxxxxx> wrote:
>> > This bugfix was originally made in commit 35a4933a8959 ("time:
>> > Avoid signed overflow in timekeeping_get_ns()"). When the code was
>> > refactored in commit 6bd58f09e1d8 ("time: Add cycles to nanoseconds
>> > translation") the signed overflow fix was lost. Re-introduce it.
>> >
>> > Signed-off-by: Chris Metcalf <cmetcalf@xxxxxxxxxxxx>
>> > ---
>> > I happened to be looking for an unrelated fix, found this code,
>> > realized the tip code didn't match the fixed code, and
>> > backtracked to where it had gone away.
>> >
>> > kernel/time/timekeeping.c | 3 +--
>> > 1 file changed, 1 insertion(+), 2 deletions(-)
>> >
>> > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
>> > index 37dec7e3db43..57926bc7b7f3 100644
>> > --- a/kernel/time/timekeeping.c
>> > +++ b/kernel/time/timekeeping.c
>> > @@ -304,8 +304,7 @@ static inline s64 timekeeping_delta_to_ns(struct tk_read_base *tkr,
>> > {
>> > s64 nsec;
>> >
>> > - nsec = delta * tkr->mult + tkr->xtime_nsec;
>> > - nsec >>= tkr->shift;
>> > + nsec = (delta * tkr->mult + tkr->xtime_nsec) >> tkr->shift;
>>
>> Ugh.
>>
>> So... I think this proves the original fix was *far* too subtle to
>> maintain. So I think reintroducing it as-is doesn't protect us from
>> undoing it. If the problem is really using the signed intermediate
>> nsec value, we should get rid of that.
>
> As I told the other guy who submitted something similar: This is not really
> helpful. It merily drags the overflow case out by a factor of 2.

Well... So lost time (where a VM/gdb caused stall runs past the
clocksource or causes an mult overflow) is a bit less problematic then
getting a huge negative nsec value.

> So we really need to figure out under which circumstances this can happen
> and fixup either the callsites or detect the condition right there, which
> I'd like to avoid for the hotpath.

I get that catching the (delta > TOOBIG) case, but even then I'm not
sure how we deal that condition in a way that results in anything
meaningfully different from the less-problematic unsigned overflow
(ie, capping it).

thanks
-john