Re: [BUG] Linux 5.3-rc1: timer problem on x86-64 (Pentium D)

From: Thomas Gleixner
Date: Thu Jul 25 2019 - 14:04:33 EST


Rui,

On Thu, 25 Jul 2019, Rui Salvaterra wrote:
> On Thu, 25 Jul 2019 at 07:28, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >
> > The only reason I can think of is that the HPET on that machine has a weird
> > register state (it's not advertised by the BIOS ... )
> >
> > But that does not explain the boot failure completely. If the HPET is not
> > available then the kernel should automatically do the right thing and fall
> > back to something else.
>
> This may be a useful data point, the relevant part of the dmesg on a
> pristine 5.3-rc1 with clocksource=jiffies:

Duh. Yes, this explains it nicely.

> [ 1.123548] clocksource: timekeeping watchdog on CPU1: Marking clocksource 'tsc-early' as unstable because the skew is too large:
> [ 1.123552] clocksource: 'hpet' wd_now: 33 wd_last: 33 mask: ffffffff

The HPET counter check succeeded, but the early enable and the following
reconfiguration confused it completely. So the HPET is not counting:

'hpet' wd_now: 33 wd_last: 33 mask: ffffffff

That fully explains the boot failure: if the counter is not running, the
HPET timer never expires, and early boot waits forever for the HPET to
fire.
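
To spell out why identical readings mean a dead counter, here is a minimal
sketch in Python. It is not the kernel's watchdog code; the helper names are
made up for illustration, and it assumes the values in the dmesg line are
printed in hex. The elapsed cycles are the masked difference of the two
reads, so wd_now == wd_last gives a delta of zero:

```python
# Illustrative sketch only; this is NOT the kernel's watchdog code.
# masked_delta() and counter_is_stalled() are hypothetical helper names.


def masked_delta(now: int, last: int, mask: int) -> int:
    """Cycles elapsed between two counter reads, modulo the counter width."""
    return (now - last) & mask


def counter_is_stalled(now: int, last: int, mask: int) -> bool:
    """True if the counter did not advance between the two reads."""
    return masked_delta(now, last, mask) == 0


# Values taken from the dmesg line above (assumed to be hexadecimal).
WD_NOW = 0x33
WD_LAST = 0x33
MASK = 0xFFFFFFFF

if counter_is_stalled(WD_NOW, WD_LAST, MASK):
    print("hpet: counter did not advance between watchdog reads")
```

A zero delta could in principle also come from an exact wrap of the 32-bit
counter between two reads, but with a value as small as 0x33 on both reads
a stopped counter is the plausible reading.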

> > Then boot these kernels with 'hpet=disable' on the command line and see
> > whether they come up. If so please provide the same output.
>
> Fortunately (as I'm doing this remotely) they did come up.
> With hpet=disabled...
>
> Linux 5.2:
> available_clocksource: tsc acpi_pm
> current_clocksource: tsc
>
> Linux 5.3-rc1 patched:
> available_clocksource: tsc acpi_pm
> current_clocksource: tsc

That's consistent with the above. 5.3-rc1 unpatched would of course boot as
well with hpet=disable now that we know the root cause.
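
For anyone who wants to collect the same data on another machine, a small
sketch follows. It assumes the output quoted above comes from the usual
sysfs files under /sys/devices/system/clocksource/clocksource0/; the
function names are hypothetical and only serve the example:

```python
# Illustrative sketch; function names are hypothetical.
from pathlib import Path

# Usual sysfs location of the clocksource attributes on Linux (assumption:
# this is where the quoted output was read from).
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def parse_clocksources(available_text: str, current_text: str) -> dict:
    """Turn the raw contents of the two sysfs files into a small summary."""
    available = available_text.split()
    current = current_text.strip()
    return {
        "available": available,
        "current": current,
        "hpet_available": "hpet" in available,
    }


def read_clocksources(base: Path = SYSFS_DIR) -> dict:
    """Read both attributes from sysfs and summarize them."""
    return parse_clocksources(
        (base / "available_clocksource").read_text(),
        (base / "current_clocksource").read_text(),
    )


if __name__ == "__main__":
    # Only touch sysfs if the directory exists on this system.
    if SYSFS_DIR.is_dir():
        print(read_clocksources())
```

With hpet=disable on the command line, "hpet" should be absent from the
available list, which matches the output quoted above.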

I'll write a changelog and route it to Linus for -rc2.

Thanks a lot for debugging this and providing all the information!

tglx