Re: [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >= zero

From: Thomas Gleixner
Date: Fri Dec 16 2016 - 08:37:01 EST


On Fri, 16 Dec 2016, Thomas Gleixner wrote:
> On Tue, 13 Dec 2016, Thomas Gleixner wrote:
> > Roland reported that his DELL T5810 sports a value-add BIOS which
> > completely wrecks the TSC. The squirmware [(TM) Ingo Molnar] boots with
> > random negative TSC_ADJUST values, different on all CPUs. That renders the
> > TSC useless because the synchronization check fails.
>
> While everyone assumed that this is the usual DELL squirmware problem, I
> have to say it's not.
>
> Just got my hands on a Skylake based Lenovo S510 box and it shows the same
> feature:
>
> TSC ADJUST: CPU0: -10123656703215
>             CPU1: -10123656796701
>             CPU2: -10123656797460
>             CPU3: -10123656798366
>
> Which causes the TSC to be out of sync on a stock upstream kernel and the
> TSC deadline timer wreckage is happening on that machine as well.
>
> I'm pretty sure that this well thought out feature to 'hide power on time'
> from TSC has not been independently 'invented' by DELL and Lenovo BIOS
> tinkerers.
>
> I rather have the impression that this is an advisory or feature kit from
> some other entity. Whoever came up with this misfeature at Intel and/or
> Microsoft (sorry, I could not come up with any other suspects) should be
> promoted to run the 'Linux on feature-plagued systems' hot line.

Just to add another data point here.

On cold boot the TSC_ADJUST value on that LENOVO machine is: -24534293,
which is about 9ms.
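
As a rough sanity check of that conversion, here is a minimal sketch of the
tick-to-time arithmetic. The TSC frequency is NOT stated in this mail; the
2.7 GHz used below is an assumption chosen only because it is consistent with
the "about 9ms" figure, and the helper name is made up for illustration.

```python
# ASSUMPTION: the mail does not give the TSC frequency. 2.7 GHz is a guess
# that happens to match the "about 9ms" figure quoted above.
ASSUMED_TSC_HZ = 2_700_000_000


def ticks_to_seconds(ticks, tsc_hz=ASSUMED_TSC_HZ):
    """Convert a (possibly negative) TSC tick count to seconds."""
    return abs(ticks) / tsc_hz


# TSC_ADJUST value observed on cold boot (from the mail).
cold_boot_adjust = -24534293

# Per-CPU TSC_ADJUST values from the earlier report (from the mail).
reported_adjust = {
    0: -10123656703215,
    1: -10123656796701,
    2: -10123656797460,
    3: -10123656798366,
}

cold_ms = ticks_to_seconds(cold_boot_adjust) * 1e3

# Difference between the largest and smallest per-CPU value, in ticks.
spread_ticks = max(reported_adjust.values()) - min(reported_adjust.values())
spread_us = ticks_to_seconds(spread_ticks) * 1e6

print(f"cold boot offset: {cold_ms:.2f} ms")
print(f"per-CPU spread:   {spread_ticks} ticks ({spread_us:.1f} us)")
```

With a different TSC frequency the millisecond and microsecond values scale
accordingly; the spread in ticks does not depend on the assumed frequency.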

So assuming that the SDM is correct on this point and the counter starts at
0 after power on, 9ms later might well be right inside that magic blob which
does the low level bringup of CPUs. That comes from the CPU vendor and runs
_BEFORE_ the system vendor BIOS can create havoc.

Dealing with timers on x86 feels like a Sisyphean task.

Thanks,

tglx