RE: [PATCH v4 1/1] printk: fix zero-valued printk timestamps in early boot

From: Bird, Tim

Date: Tue Apr 28 2026 - 01:39:18 EST


> -----Original Message-----
> From: Andrew Murray <amurray@xxxxxxxxxxxxxxxxxxxx>
> Hi Tim,
>
> On Fri, 10 Apr 2026 at 21:41, Tim Bird <tim.bird@xxxxxxxx> wrote:
> >
> > During early boot, printk timestamps are reported as zero before
> > kernel timekeeping starts (i.e. before time_init()). This
> > hinders boot-time optimization efforts. This period ranges from
> > 17 to 1700 milliseconds on different embedded machines running Linux.
> >
> > Add support for early timestamps based on processor cycle-generators
> > that need no kernel initialization. Use a statically-configured
> > frequency value for converting cycles to nanoseconds. This means
> > that this feature cannot be turned on in generic distro kernels.
> > It is intended for temporary use during kernel development and
> > boot-time research and optimization only.
> >
> > This yields non-zero timestamps for printks from the very start
> > of kernel execution. The timestamps are relative to the start of
> > an architecture-specific counter (e.g. tsc on x86_64 and cntvct_el0
> > on arm64). Affected timestamps reflect time from cycle counter
> > init (usually machine power-on or virtual machine start) instead of
> > time from the kernel's timekeeping initialization. This results in a
> > discontinuity in the printk timestamp values, one time, when
> > kernel timekeeping starts.
> >
> > Signed-off-by: Tim Bird <tim.bird@xxxxxxxx>
> > ---
>
> The limitation of this series is that while it narrows the window
> where delays may be hiding, it doesn't remove the window completely.
> For example, there may be a delay prior to the first printk (perhaps
> unlikely). It would instead be useful to understand how much time has
> passed since execution was handed over to the kernel.

Hey Andrew,

My patch allows one to add a printk as the very first line in start_kernel(), which
I believe is the earliest you can do so in C code in the kernel proper.

You are correct that this patch doesn’t measure time before
that first line of C code accessible to instrument with printks. So there's
more blind spots in the total system boot than this patch addresses.

>
> Also, in my view, the goal of this work is really to identify
> regressions in boot time, though this can be achieved via other means.
> Replacing zero timestamps in printk messages with real timestamps is
> only necessary to make it easier to bisect where those delays are
> hiding, i.e. between two printk messages. But that benefit it limited
> given how small the window is. (And if a regression is spotted,
> someone would likely add debug to find it anyway).

In my experience, although the kernel's blind spot is relatively small on x86, it
is quite a bit larger on other platforms: 183 milliseconds on a raspberry pi
and 1700 milliseconds on a FPGA-based riscv board. When I've used the
patch in its current form, along with sprinkling printks and measurements
in other place in init/main.c and other relevant early functions in the kernel,
it has yielded valuable data about the timings in this blind spot, that I can
easily read using the console and dmesg.

>
> Given the risks identified by others with this series and the above, I
> thought I'd suggest an alternative approach.
>
> Perhaps we can rely on a prior boot stage (e.g. U-Boot) to capture a
> timestamp (perhaps cntvct on arm) immediately prior to kernel
> handover, this could be stored in the device tree and it could include
> architecture information on where that timestamp came from. During
> boot, once timekeeping is initialised, the architecture specific
> timekeeping initialisation code can print relevant timestamp values to
> the console, alongside the data captured in the device tree.
>
> Userspace tools (systemd-analyze) or log parsing tools, can then make
> sense of the values. The time between kernel handover and the end of
> timekeeping initalisation printk can be calculated. You could also
> subtract the earliest non-zero printk timestamp from this figure to
> measure the time between kernel handover and first non-zero timestamp.
>
> Besides logging values from hardware registers to the console, this
> approach pushes out all the maths to userspace or log parsing tools.
> It also doesn't impact existing printk timestamps and can be
> structured to not introduce new get_cycles types of functions.

It is certainly of value to get a measure of the time between
the bootloader and the kernel start (ie, the kernel self-decompressor).
But that's a different problem than the one I originally intended
to solve with this patch, which was the zero-valued printks in the kernel,
before kernel timekeeping start.

Because my code is using the same cycle counter as U-boot, it will operate
with the same time-base, making it easier to correlate the times that U-boot
records with the times reported in the early printks, using an external tool.
One could certainly output cycles in a few places and set up a correspondence
using external tooling.
That's what this patch was striving for:
https://lore.kernel.org/all/20250823044034.189939-1-v-singh1@xxxxxx/

That was posted as part of the effort described in [1].

That would be a valuable thing to do, and possibly there is some coordination
we could do to harmonize these efforts or have them use each other's facilities.
But again, that's solving a different problem than the one my patch solves.

>
> If this type of approach is taken, it may be worth investigating other
> initiatives in this area, for example [1], [2], [3] which may be
> relevant.
>
> [1] https://static.sched.com/hosted_files/osseu2025/a2/EOSS_2025_Unified%20Boot%20Time%20log%20based%20measurement.pdf
> [2] https://issues.chromium.org/issues/40986147
> [3] https://github.com/FirmwareHandoff/firmware_handoff

I worry that if I change the problem I'm trying to solve, or the scope
of the required external tools, it will be harder to get a patch accepted
to solve the thing I'm interested in.
-- Tim


> Thanks,
>
> Andrew Murray
>
> > V3 -> V4
> > Replace config vars with single one: CONFIG_EARLY_CYCLES_KHZ
> > Replace runtime calibration with static config variable
> > Remove reference to get_cycles()
> > Add support for RISCV platforms
> > V2 -> V3
> > Default CONFIG option to 'n'
> > Move more code into early_times.h (reduce ifdefs in init/main.c)
> > Use match64 helper routines
> > Use cycles_t instead of u64 type
> > Add #defines for EARLY_CYCLES_BIT and EARLY_CYCLES_MASK
> > Invert if logic in adjust_early_ts()
> > V1 -> V2
> > Remove calibration CONFIG vars
> > Add 'depends on' to restrict arches (to handle ppc bug)
> > Add early_ts_offset to avoid discontinuity
> > Save cycles in ts_nsec, and convert on output
> > Move conditional code to include file (early_times.h>
> > ---
> > include/linux/early_times.h | 55 +++++++++++++++++++++++++++++++++++++
> > init/Kconfig | 30 ++++++++++++++++++++
> > kernel/printk/printk.c | 3 ++
> > 3 files changed, 88 insertions(+)
> > create mode 100644 include/linux/early_times.h
> >
> > diff --git a/include/linux/early_times.h b/include/linux/early_times.h
> > new file mode 100644
> > index 000000000000..82bacfd0e26b
> > --- /dev/null
> > +++ b/include/linux/early_times.h
> > @@ -0,0 +1,55 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +
> > +#ifndef _EARLY_TIMES_H
> > +#define _EARLY_TIMES_H
> > +
> > +#include <linux/timekeeping.h>
> > +#ifdef CONFIG_ARM64
> > +#include <asm/sysreg.h>
> > +#endif
> > +
> > +#ifdef CONFIG_EARLY_CYCLES_KHZ
> > +static inline u64 early_unsafe_cycles(void)
> > +{
> > +#if defined(CONFIG_X86_64)
> > + /*
> > + * This rdtsc may happen before secure TSC is initialized, and
> > + * it is unordered. So please don't use this value for cryptography
> > + * or after SMP is initialized.
> > + */
> > + return rdtsc();
> > +#elif defined(CONFIG_ARM64)
> > + return read_sysreg(cntvct_el0);
> > +#elif defined(CONFIG_RISCV_TIMER)
> > + u64 val;
> > +
> > + asm volatile("rdtime %0" : "=r"(val));
> > + return val;
> > +#else
> > + return 0;
> > +#endif
> > +}
> > +
> > +#define NS_PER_KHZ 1000000UL
> > +
> > +/* returns a nanosecond value based on early cycles */
> > +static inline u64 early_times_ns(void)
> > +{
> > + if (CONFIG_EARLY_CYCLES_KHZ)
> > + /*
> > + * Note: the multiply must precede the division to avoid
> > + * truncation and loss of resolution
> > + * Don't use fancier MULT/SHIFT math here. Since this is
> > + * static, the compiler can optimize the math operations.
> > + */
> > + return (early_unsafe_cycles() * NS_PER_KHZ) / CONFIG_EARLY_CYCLES_KHZ;
> > + return 0;
> > +}
> > +#else
> > +static inline u64 early_times_ns(void)
> > +{
> > + return 0;
> > +}
> > +#endif
> > +
> > +#endif /* _EARLY_TIMES_H */
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 7484cd703bc1..40c3123c2c27 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -792,6 +792,36 @@ config IKHEADERS
> > or similar programs. If you build the headers as a module, a module called
> > kheaders.ko is built which can be loaded on-demand to get access to headers.
> >
> > +config EARLY_CYCLES_KHZ
> > + int "Frequency of the early times cycle counter, in KHz"
> > + default 0
> > + depends on PRINTK
> > + depends on ARM64 || X86_64 || RISCV_TIMER
> > + help
> > + Specifies the frequency value (in KHz) of the cycle counter
> > + used for early times information.
> > +
> > + Set this to provide timing information for printks in early boot
> > + (before the start of kernel timekeeping), that would otherwise
> > + show as 0.
> > +
> > + To calculate the value for your system, boot your system, and
> > + use the value expressed on the tsc or cntvct_el0 calibration line
> > + in your kernel message log:
> > + ex x86_64: tsc: Refined TSC clocksource calibration: 2095.057 MHz
> > + In that case, use the value 2095057.
> > + ex arm64: arch_timer: cp15 timer running at 54.00MHz (phys).
> > + In that case, use the value of 54000.
> > +
> > + Note that this causes the kernel to show, for some early printks,
> > + times that are relative to processor power on, instead of
> > + relative to the start of kernel timekeeping. When kernel
> > + timekeeping starts, the timestamps values reset, causing
> > + a discontinuity in the timestamp values. If you see
> > + non-monotonic printk timestamps, please don't freak out.
> > +
> > + Set to 0 to disable this feature.
> > +
> > config LOG_BUF_SHIFT
> > int "Kernel log buffer size (16 => 64KB, 17 => 128KB)"
> > range 12 25
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index 0323149548f6..ddeab81f04ef 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -46,6 +46,7 @@
> > #include <linux/ctype.h>
> > #include <linux/uio.h>
> > #include <linux/sched/clock.h>
> > +#include <linux/early_times.h>
> > #include <linux/sched/debug.h>
> > #include <linux/sched/task_stack.h>
> > #include <linux/panic.h>
> > @@ -2294,6 +2295,8 @@ int vprintk_store(int facility, int level,
> > * timestamp with respect to the caller.
> > */
> > ts_nsec = local_clock();
> > + if (unlikely(!ts_nsec))
> > + ts_nsec = early_times_ns();
> >
> > caller_id = printk_caller_id();
> >
> > --
> > 2.43.0
> >
> >