Re: Ensuring wall_to_monotonic is not positive breaks use case

From: Thomas Gleixner
Date: Sat Sep 15 2018 - 11:06:49 EST

On Thu, 6 Sep 2018, Thomas Gleixner wrote:
> On Wed, 5 Sep 2018, Rick Ratzel wrote:
> > We're looking for suggestions on how best to proceed with a new change
> > that ideally both supports the use case described above, as well as
> > addresses the symptoms brought up in the initial commit (negative boot
> > time causes get_expiry() to overflow time_t, and show_stat() uses
> > "unsigned long" to print negative btime). Any thoughts on this would be
> > greatly appreciated.
> Those symptoms are just the tip of the iceberg. For sure it screws up
> everything around boot time and a lot of things use boottime nowadays.

I had a second look and it's actually not that bad. My brain latched onto
"boot time", but we have two variants of it. One is the monotonic time
since boot, aka CLOCK_BOOTTIME, and the other is the wall clock time at
which the system was booted.
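
To illustrate the difference, a rough user space sketch, not kernel code.
The helper names are made up for this mail, and the fallback value for
CLOCK_BOOTTIME assumes Linux:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

#ifndef CLOCK_BOOTTIME
#define CLOCK_BOOTTIME 7	/* Linux value; assumption for older libc headers */
#endif

/* Variant 1: monotonic seconds since boot, including suspended time. */
static int64_t secs_since_boot(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_BOOTTIME, &ts) != 0)
		return -1;	/* clock not available */
	return (int64_t)ts.tv_sec;
}

/*
 * Variant 2: wall clock time at which the system booted, in seconds since
 * the epoch. Both arguments must be sampled at the same instant. The result
 * is negative when CLOCK_REALTIME is smaller than the time since boot.
 */
static int64_t wall_time_at_boot(int64_t realtime_sec, int64_t boottime_sec)
{
	return realtime_sec - boottime_sec;
}
```

With the wall clock set to 100 seconds after the epoch on a machine which
has been up for 400 seconds, wall_time_at_boot(100, 400) yields -300, while
secs_since_boot() keeps counting up from boot as if nothing happened.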

> So reverting this is not really an option.

Maybe we can :)

We have 3 users in tree:

1) /proc/stat btime

   Clamping that value to 0 is trivial enough, though it might surprise
   some users because uptime will tell something different.

2) sunrpc

I think that can be converted to actually use CLOCK_BOOTTIME. This needs
a bit of trickery to preserve the user space interfaces which are
CLOCK_REALTIME based, but it should be doable.

3) x86/kvm

   That's the trickiest part of all, and I haven't yet fully analyzed
   whether there is an issue with time going negative.

   Actually there shouldn't be one, as this is about wall time, which
   cannot go before the epoch. It needs some close inspection, though:
   while it might be trivial to adapt the code in question, it could be
   tricky or even impossible to fix it without breaking existing guests.
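
For 1) and 2), the idea can be sketched roughly as below. This is plain C
for illustration, completely untested against the real code, and all
function names are hypothetical. It only shows the arithmetic, not how
show_stat() or get_expiry() would actually be changed:

```c
#include <stdint.h>

/* For 1): never report a boot wall time before the epoch. */
static uint64_t btime_clamped(int64_t wall_boot_sec)
{
	return wall_boot_sec < 0 ? 0 : (uint64_t)wall_boot_sec;
}

/*
 * For 2): keep the expiry internally as seconds since boot and translate
 * only at the user space boundary, which stays CLOCK_REALTIME based.
 * now_real and now_boot are CLOCK_REALTIME and CLOCK_BOOTTIME seconds
 * sampled at the same instant.
 */
static int64_t expiry_real_to_boot(int64_t expiry_real, int64_t now_real,
				   int64_t now_boot)
{
	return expiry_real - now_real + now_boot;
}

static int64_t expiry_boot_to_real(int64_t expiry_boot, int64_t now_real,
				   int64_t now_boot)
{
	return expiry_boot - now_boot + now_real;
}
```

With the wall clock at 100 and 400 seconds of uptime, an expiry of 160 in
wall time becomes 460 in boot time and converts back to 160. The internal
value never depends on the sign of the boot wall time.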