Re: [PATCH 0/3] arm64: Allow early timestamping of kernel log

From: Pavel Tatashin
Date: Mon Sep 23 2019 - 15:13:27 EST


On Tue, Jul 23, 2019 at 3:17 AM Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
>
> On Mon, 22 Jul 2019 21:52:42 +0100,
> Pavel Tatashin <pasha.tatashin@xxxxxxxxxx> wrote:
> >
> > On Mon, Jul 22, 2019 at 3:33 AM Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
> > >
> > > So far, we've let the arm64 kernel start its meaningful time stamping
> > > of the kernel log pretty late, which is caused by sched_clock() being
> > > initialised rather late compared to other architectures.
> > >
> > > Pavel Tatashin proposed[1] to move the initialisation of sched_clock
> > > much earlier, which I had objections to. The reason for initialising
> > > sched_clock late is that a number of systems have broken counters, and
> > > we need to apply all kind of terrifying workarounds to avoid time
> > > going backward on the affected platforms. Being able to identify the
> > > right workaround comes pretty late in the kernel boot, and providing
> > > an unreliable sched_clock, even for a short period of time, isn't an
> > > appealing prospect.
> > >
> > > To address this, I'm proposing that we allow an architecture to chose
> > > to (1) divorce time stamping and sched_clock during the early phase of
> > > booting, and (2) inherit the time stamping clock as the new epoch the
> > > first time a sched_sched clock gets registered.

Hi Marc,

I know we briefly discussed this at plumbers, but I want to bring it
up again, because I am still puzzled why it is not possible to
stabilize unstable clock early in boot.

Here is an example where clock is stabilized:
https://soleen.com/source/xref/linux/kernel/sched/clock.c?r=457c8996#265

It uses a value that is read at last ticks to normalize clock, and
because ticks are not available early in boot instead we can make sure
that early in boot sched_clock() never returns value smaller than
previously returned value, and if we want to be extra careful, we can
also make sure that sched_clock() early in boot does not jump ahead by
more than some fixed amount of time i.e. more than one hour.

If sched_clock() is available early we will get the benefit of having
other tracers that use it to debug early boot information.

Pasha