Re: [PATCH v4 1/1] printk: fix zero-valued printk timestamps in early boot

From: Andrew Murray

Date: Mon Apr 27 2026 - 19:33:38 EST


Hi Tim,

On Fri, 10 Apr 2026 at 21:41, Tim Bird <tim.bird@xxxxxxxx> wrote:
>
> During early boot, printk timestamps are reported as zero before
> kernel timekeeping starts (i.e. before time_init()). This
> hinders boot-time optimization efforts. This period ranges from
> 17 to 1700 milliseconds on different embedded machines running Linux.
>
> Add support for early timestamps based on processor cycle-generators
> that need no kernel initialization. Use a statically-configured
> frequency value for converting cycles to nanoseconds. This means
> that this feature cannot be turned on in generic distro kernels.
> It is intended for temporary use during kernel development and
> boot-time research and optimization only.
>
> This yields non-zero timestamps for printks from the very start
> of kernel execution. The timestamps are relative to the start of
> an architecture-specific counter (e.g. tsc on x86_64 and cntvct_el0
> on arm64). Affected timestamps reflect time from cycle counter
> init (usually machine power-on or virtual machine start) instead of
> time from the kernel's timekeeping initialization. This results in a
> discontinuity in the printk timestamp values, one time, when
> kernel timekeeping starts.
>
> Signed-off-by: Tim Bird <tim.bird@xxxxxxxx>
> ---

The limitation of this series is that while it narrows the window
where delays may be hiding, it doesn't remove the window completely.
For example, there may be a delay prior to the first printk (perhaps
unlikely). It would instead be useful to understand how much time has
passed since execution was handed over to the kernel.

Also, in my view, the goal of this work is really to identify
regressions in boot time, though this can be achieved via other means.
Replacing zero timestamps in printk messages with real timestamps is
only necessary to make it easier to bisect where those delays are
hiding, i.e. between two printk messages. But that benefit it limited
given how small the window is. (And if a regression is spotted,
someone would likely add debug to find it anyway).

Given the risks identified by others with this series and the above, I
thought I'd suggest an alternative approach.

Perhaps we can rely on a prior boot stage (e.g. U-Boot) to capture a
timestamp (perhaps cntvct on arm) immediately prior to kernel
handover, this could be stored in the device tree and it could include
architecture information on where that timestamp came from. During
boot, once timekeeping is initialised, the architecture specific
timekeeping initialisation code can print relevant timestamp values to
the console, alongside the data captured in the device tree.

Userspace tools (systemd-analyze) or log parsing tools, can then make
sense of the values. The time between kernel handover and the end of
timekeeping initalisation printk can be calculated. You could also
subtract the earliest non-zero printk timestamp from this figure to
measure the time between kernel handover and first non-zero timestamp.

Besides logging values from hardware registers to the console, this
approach pushes out all the maths to userspace or log parsing tools.
It also doesn't impact existing printk timestamps and can be
structured to not introduce new get_cycles types of functions.

If this type of approach is taken, it may be worth investigating other
initiatives in this area, for example [1], [2], [3] which may be
relevant.

[1] https://static.sched.com/hosted_files/osseu2025/a2/EOSS_2025_Unified%20Boot%20Time%20log%20based%20measurement.pdf
[2] https://issues.chromium.org/issues/40986147
[3] https://github.com/FirmwareHandoff/firmware_handoff

Thanks,

Andrew Murray

> V3 -> V4
> Replace config vars with single one: CONFIG_EARLY_CYCLES_KHZ
> Replace runtime calibration with static config variable
> Remove reference to get_cycles()
> Add support for RISCV platforms
> V2 -> V3
> Default CONFIG option to 'n'
> Move more code into early_times.h (reduce ifdefs in init/main.c)
> Use match64 helper routines
> Use cycles_t instead of u64 type
> Add #defines for EARLY_CYCLES_BIT and EARLY_CYCLES_MASK
> Invert if logic in adjust_early_ts()
> V1 -> V2
> Remove calibration CONFIG vars
> Add 'depends on' to restrict arches (to handle ppc bug)
> Add early_ts_offset to avoid discontinuity
> Save cycles in ts_nsec, and convert on output
> Move conditional code to include file (early_times.h>
> ---
> include/linux/early_times.h | 55 +++++++++++++++++++++++++++++++++++++
> init/Kconfig | 30 ++++++++++++++++++++
> kernel/printk/printk.c | 3 ++
> 3 files changed, 88 insertions(+)
> create mode 100644 include/linux/early_times.h
>
> diff --git a/include/linux/early_times.h b/include/linux/early_times.h
> new file mode 100644
> index 000000000000..82bacfd0e26b
> --- /dev/null
> +++ b/include/linux/early_times.h
> @@ -0,0 +1,55 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef _EARLY_TIMES_H
> +#define _EARLY_TIMES_H
> +
> +#include <linux/timekeeping.h>
> +#ifdef CONFIG_ARM64
> +#include <asm/sysreg.h>
> +#endif
> +
> +#ifdef CONFIG_EARLY_CYCLES_KHZ
> +static inline u64 early_unsafe_cycles(void)
> +{
> +#if defined(CONFIG_X86_64)
> + /*
> + * This rdtsc may happen before secure TSC is initialized, and
> + * it is unordered. So please don't use this value for cryptography
> + * or after SMP is initialized.
> + */
> + return rdtsc();
> +#elif defined(CONFIG_ARM64)
> + return read_sysreg(cntvct_el0);
> +#elif defined(CONFIG_RISCV_TIMER)
> + u64 val;
> +
> + asm volatile("rdtime %0" : "=r"(val));
> + return val;
> +#else
> + return 0;
> +#endif
> +}
> +
> +#define NS_PER_KHZ 1000000UL
> +
> +/* returns a nanosecond value based on early cycles */
> +static inline u64 early_times_ns(void)
> +{
> + if (CONFIG_EARLY_CYCLES_KHZ)
> + /*
> + * Note: the multiply must precede the division to avoid
> + * truncation and loss of resolution
> + * Don't use fancier MULT/SHIFT math here. Since this is
> + * static, the compiler can optimize the math operations.
> + */
> + return (early_unsafe_cycles() * NS_PER_KHZ) / CONFIG_EARLY_CYCLES_KHZ;
> + return 0;
> +}
> +#else
> +static inline u64 early_times_ns(void)
> +{
> + return 0;
> +}
> +#endif
> +
> +#endif /* _EARLY_TIMES_H */
> diff --git a/init/Kconfig b/init/Kconfig
> index 7484cd703bc1..40c3123c2c27 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -792,6 +792,36 @@ config IKHEADERS
> or similar programs. If you build the headers as a module, a module called
> kheaders.ko is built which can be loaded on-demand to get access to headers.
>
> +config EARLY_CYCLES_KHZ
> + int "Frequency of the early times cycle counter, in KHz"
> + default 0
> + depends on PRINTK
> + depends on ARM64 || X86_64 || RISCV_TIMER
> + help
> + Specifies the frequency value (in KHz) of the cycle counter
> + used for early times information.
> +
> + Set this to provide timing information for printks in early boot
> + (before the start of kernel timekeeping), that would otherwise
> + show as 0.
> +
> + To calculate the value for your system, boot your system, and
> + use the value expressed on the tsc or cntvct_el0 calibration line
> + in your kernel message log:
> + ex x86_64: tsc: Refined TSC clocksource calibration: 2095.057 MHz
> + In that case, use the value 2095057.
> + ex arm64: arch_timer: cp15 timer running at 54.00MHz (phys).
> + In that case, use the value of 54000.
> +
> + Note that this causes the kernel to show, for some early printks,
> + times that are relative to processor power on, instead of
> + relative to the start of kernel timekeeping. When kernel
> + timekeeping starts, the timestamps values reset, causing
> + a discontinuity in the timestamp values. If you see
> + non-monotonic printk timestamps, please don't freak out.
> +
> + Set to 0 to disable this feature.
> +
> config LOG_BUF_SHIFT
> int "Kernel log buffer size (16 => 64KB, 17 => 128KB)"
> range 12 25
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 0323149548f6..ddeab81f04ef 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -46,6 +46,7 @@
> #include <linux/ctype.h>
> #include <linux/uio.h>
> #include <linux/sched/clock.h>
> +#include <linux/early_times.h>
> #include <linux/sched/debug.h>
> #include <linux/sched/task_stack.h>
> #include <linux/panic.h>
> @@ -2294,6 +2295,8 @@ int vprintk_store(int facility, int level,
> * timestamp with respect to the caller.
> */
> ts_nsec = local_clock();
> + if (unlikely(!ts_nsec))
> + ts_nsec = early_times_ns();
>
> caller_id = printk_caller_id();
>
> --
> 2.43.0
>
>