Re: [patch 00/11] x86/vdso: Cleanups, simplifications and CLOCK_TAI support

From: Andy Lutomirski
Date: Thu Oct 04 2018 - 10:00:51 EST




> On Oct 4, 2018, at 1:11 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
>> On Thu, Oct 04, 2018 at 09:54:45AM +0200, Vitaly Kuznetsov wrote:
>> I was hoping to hear this from you :-) If I am to suggest how we can
>> move forward I'd propose:
>> - Check if pure TSC can be used on SkyLake+ systems (where TSC scaling
>> is supported).
>> - Check if non-masterclock mode is still needed. E.g. HyperV's TSC page
>> clocksource is a single page for the whole VM, not a per-cpu thing. Can
>> we think that all the buggy hardware is already gone?
>
> No, and it is not the hardware you have to worry about (mostly), it is
> the frigging PoS firmware people put on it.
>
> Ever since Nehalem TSC is stable (unless you get to >4 socket systems,
> after which it still can be, but bets are off). But even relatively
> recent systems fail the TSC sync test because firmware messes it up by
> writing to either MSR_TSC or MSR_TSC_ADJUST.
>
> But the thing is, if the TSC is not synced, you cannot use it for
> timekeeping, full stop. So having a single page is fine, it either
> contains a mult/shift that is valid, or it indicates TSC is messed up
> and you fall back to something else.
>
> There is no inbetween there.
>
> For sched_clock we can still use the global page, because the rate will
> still be the same for each cpu, it's just offset between CPUs and the
> code compensates for that.

But if we're in a KVM guest, then the clock will jump around on the same *vCPU* when the vCPU migrates.

But I don't see how kvmclock helps here, since I don't think it's used for sched_clock.
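
To make the "single page is either valid or tells you to fall back" point quoted above concrete, here is a rough sketch of what a reader of such a page could look like. This is illustration only: struct tsc_page, its fields, and clock_read_ns() are hypothetical names, not the actual pvclock, Hyper-V TSC page, or vDSO data layout, and the raw TSC value is passed in as a parameter so the example does not depend on RDTSC. It also leaves out the sequence-count retry loop a real reader would need.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical VM-wide clock page; not a real kernel or hypervisor struct. */
struct tsc_page {
    bool     tsc_valid;    /* false: TSC is not synced, do not use it */
    uint32_t mult;         /* cycles -> ns multiplier */
    uint32_t shift;        /* cycles -> ns right shift */
    uint64_t base_cycles;  /* TSC value at the last update */
    uint64_t base_ns;      /* time in ns at the last update */
};

/*
 * Return time in ns derived from the raw TSC value, or fallback_ns
 * (assumed to come from some slower clocksource) when the page says
 * the TSC is unusable. There is no in-between state.
 */
static uint64_t clock_read_ns(const struct tsc_page *p, uint64_t tsc,
                              uint64_t fallback_ns)
{
    if (!p->tsc_valid)
        return fallback_ns;

    /* Assumes delta * mult fits in 64 bits; real code must bound delta. */
    uint64_t delta = tsc - p->base_cycles;

    return p->base_ns + ((delta * p->mult) >> p->shift);
}
```

With mult = 5 and shift = 1 (2.5 ns per cycle), a reading 1000 cycles past base_cycles lands 2500 ns past base_ns. The sketch says nothing about the per-vCPU jump on migration raised above; it only covers the valid-or-fallback decision.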