RE: [PATCH v3] printk: fix zero-valued printk timestamps in early boot
From: Bird, Tim
Date: Fri Mar 13 2026 - 00:53:33 EST
Hey Micheal,
This report is very interesting.
Thanks very much for trying it out!
> -----Original Message-----
> From: Michael Kelley <mhklinux@xxxxxxxxxxx>
> Sent: Wednesday, March 11, 2026 9:47 AM
> From: Tim Bird <tim.bird@xxxxxxxx> Sent: Tuesday, February 10, 2026 3:48 PM
> >
> > During early boot, printk timestamps are reported as zero before
> > kernel timekeeping starts (e.g. before time_init()). This
> > hinders boot-time optimization efforts. This period is about 400
> > milliseconds for many current desktop and embedded machines
> > running Linux.
> >
> > Add support to save cycles during early boot, and output correct
> > timestamp values after timekeeping is initialized. get_cycles()
> > is operational on arm64 and x86_64 from kernel start. Add code
> > and variables to save calibration values used to later convert
> > cycle counts to time values in the early printks. Add a config
> > to control the feature.
> >
> > This yields non-zero timestamps for printks from the very start
> > of kernel execution. The timestamps are relative to the start of
> > the architecture-specified counter used in get_cycles
> > (e.g. the TSC on x86_64 and cntvct_el0 on arm64).
> >
> > All timestamps reflect time from processor power-on instead of
> > time from the kernel's timekeeping initialization.
>
> I tried this patch in linux-next20260302 kernel running as a guest VM
> on a Hyper-V host. Two things:
>
> 1) In the dmesg output, I'm seeing a place where the timestamps briefly go
> backwards -- i.e., they are not monotonically increasing. Here's a snippet,
> where there's a smaller timestamp immediately after the tsc is detected:
>
> [ 27.994891] SMBIOS 3.1.0 present.
> [ 27.994893] DMI: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 09/25/2025
> [ 27.994898] DMI: Memory slots populated: 2/2
> [ 27.995202] Hypervisor detected: Microsoft Hyper-V
> [ 27.995205] Hyper-V: privilege flags low 0xae7f, high 0x3b8030, ext 0x62, hints 0xa0e24, misc 0xe0bed7b2
> [ 27.995208] Hyper-V: Nested features: 0x0
> [ 27.995209] Hyper-V: LAPIC Timer Frequency: 0xc3500
> [ 27.995210] Hyper-V: Using hypercall for remote TLB flush
> [ 27.995216] clocksource: hyperv_clocksource_tsc_page: mask: 0xffffffffffffffff max_cycles: 0x24e6a1710, max_idle_ns: 440795202120 ns
> [ 27.995218] clocksource: hyperv_clocksource_msr: mask: 0xffffffffffffffff max_cycles: 0x24e6a1710, max_idle_ns: 440795202120 ns
> [ 27.995220] tsc: Detected 2918.401 MHz processor
I wonder if the tsc is getting fiddled with or virtualized somewhere in here, as part of clocksource initialization.
I believe each clocksource in the kernel maintains it's own internal offset, and maybe the offset that is
being used ends up being slightly different from the cycle-counter offset that the early_times feature uses.
I'm just throwing out guesses. It's about a 4ms delta, which is pretty big.
> [ 27.991060] e820: update [mem 0x00000000-0x00000fff] System RAM ==> device reserved
> [ 27.991062] e820: remove [mem 0x000a0000-0x000fffff] System RAM
> [ 27.991064] last_pfn = 0x210000 max_arch_pfn = 0x400000000
> [ 27.991065] x86/PAT: PAT support disabled because CONFIG_X86_PAT is disabled in the kernel.
> [ 27.991066] x86/PAT: Configuration [0-7]: WB WT UC- UC WB WT UC- UC
>
> Hyper-V provides a synthetic clocksource (two actually), and perhaps they
> are the cause of the problem, though I haven't spent any time debugging.
>
> 2) A Linux VM running in the Azure cloud is also running on Hyper-V. Such a
> VM typically uses cloud-init to set everything up at boot time, and cloud-init
> is outputting lines to the serial console with a timestamp that looks like the
> printk() timestamp, but apparently is not adjusted for the early timestamping
> that this patch adds. Again, I haven't debugged what's going on -- I'm not
> immediately sure of the mechanism that cloud-init uses to do output to the
> serial console. The use of the Hyper-V synthetic clock source might the cause
> of the problem here as well. Here's an output snippet from the serial console:
>
> [ 20.330414] systemd[1]: Condition check resulted in OpenVSwitch configuration for cleanup being skipped.
> [ 20.332911] systemd[1]: Condition check resulted in Set Up Additional Binary Formats being skipped.
> [ 20.333257] pstore: Registered efi_pstore as persistent store backend
> [ 20.334360] systemd[1]: Condition check resulted in File System Check on Root Device being skipped.
> [ 20.338319] systemd[1]: Starting Load Kernel Modules...
> [ 20.341094] systemd[1]: Starting Remount Root and Kernel File Systems...
> [ 20.350993] systemd[1]: Starting udev Coldplug all Devices...
> [ 20.356255] systemd[1]: Starting Uncomplicated firewall...
> [ 20.361536] systemd[1]: Started Journal Service.
> [ 20.386902] EXT4-fs (sda1): re-mounted c02dce0c-0c40-4e6e-88af-c5a0987b0adb r/w.
> [ 22.532033] /dev/sr0: Can't lookup blockdev
> [ 7.955973] cloud-init[783]: Cloud-init v. 24.3.1-0ubuntu0~20.04.1 running 'init-local' at Wed, 11 Mar 2026 15:27:06 +0000. Up 7.48
> seconds.
> [ 9.933120] cloud-init[822]: Cloud-init v. 24.3.1-0ubuntu0~20.04.1 running 'init' at Wed, 11 Mar 2026 15:27:08 +0000. Up 9.82 seconds.
> [ 9.935483] cloud-init[822]: ci-info: ++++++++++++++++++++++++++++++++++++++Net device
> info+++++++++++++++++++++++++++++++++++++++
> [ 9.937726] cloud-init[822]: ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
> [ 9.939905] cloud-init[822]: ci-info: | Device | Up | Address | Mask | Scope | Hw-Address |
> [ 9.942059] cloud-init[822]: ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
>
> The cloud-init lines don't show up in dmesg, so there's no problem there.
Are the timestamp values for cloud-init usually consistent with the ones from the kernel?
That is, without the early_times patch, do they usually match kernel's printk timestamps?
It sounds like these cloud-init messages are emitted directly to the serial console, and do
not go through the printk system. Is that right? How do they avoid having the printk
messages and cloud-init messages intermingle? I assume cloud-init is a systemd service?
Can you share the calibration lines from this boot? I'm curious if the timestamps
just relative to local_clock().
> I will look into both issues further, but probably not today.
Thanks for trying the code out. Let me know if you find out anything.
-- Tim
> Michael
>
> >
> > Signed-off-by: Tim Bird <tim.bird@xxxxxxxx>
> > ---
> > V2->V3
> > Default CONFIG option to 'n'
> > Move more code from into early_times.h
> > (reducing ifdefs in init/main.c)
> > Use math64 helper routines
> > Use cycles_t instead of u64 type
> > Add #defines for EARLY_CYCLES_BIT and MASK
> > Invert if logic in adjust_early_ts()
> > (note: no change to 'depends on' in Kconfig entry)
> >
> > V1->V2
> > Remove calibration CONFIG vars
> > Add 'depends on' to restrict arches (to handle ppc bug)
> > Add early_ts_offset to avoid discontinuity
> > Save cycles in ts_nsec, and convert on output
> > Move conditional code to include file (early_times.h)
> >
> > include/linux/early_times.h | 85 +++++++++++++++++++++++++++++++++++++
> > init/Kconfig | 14 ++++++
> > init/main.c | 6 +++
> > kernel/printk/printk.c | 18 +++++++-
> > 4 files changed, 121 insertions(+), 2 deletions(-)
> > create mode 100644 include/linux/early_times.h
> >
> > diff --git a/include/linux/early_times.h b/include/linux/early_times.h
> > new file mode 100644
> > index 000000000000..05388dcb8573
> > --- /dev/null
> > +++ b/include/linux/early_times.h
> > @@ -0,0 +1,85 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +
> > +#ifndef _KERNEL_PRINTK_EARLY_TIMES_H
> > +#define _KERNEL_PRINTK_EARLY_TIMES_H
> > +
> > +#include <linux/timex.h>
> > +#include <linux/clocksource.h>
> > +
> > +/* use high bit of a u64 to indicate cycles instead of a timestamp */
> > +#define EARLY_CYCLES_BIT BIT_ULL(63)
> > +#define EARLY_CYCLES_MASK ~(BIT_ULL(63))
> > +
> > +#if defined(CONFIG_EARLY_PRINTK_TIMES)
> > +extern cycles_t start_cycles;
> > +extern u64 start_ns;
> > +extern u32 early_mult, early_shift;
> > +extern u64 early_ts_offset;
> > +
> > +static inline void early_times_start_calibration(void)
> > +{
> > + start_cycles = get_cycles();
> > + start_ns = local_clock();
> > +}
> > +
> > +static inline void early_times_finish_calibration(void)
> > +{
> > + cycles_t end_cycles;
> > + u64 end_ns;
> > +
> > + /* set calibration data for early_printk_times */
> > + end_cycles = get_cycles();
> > + end_ns = local_clock();
> > + clocks_calc_mult_shift(&early_mult, &early_shift,
> > + mul_u64_u64_div_u64(end_cycles - start_cycles,
> > + NSEC_PER_SEC, end_ns - start_ns),
> > + NSEC_PER_SEC, 100);
> > + early_ts_offset = mul_u64_u32_shr(start_cycles, early_mult, early_shift) -
> > start_ns;
> > +
> > + pr_debug("Early printk times: mult=%u, shift=%u, offset=%llu ns\n",
> > + early_mult, early_shift, early_ts_offset);
> > +}
> > +
> > +static inline u64 early_cycles(void)
> > +{
> > + return (get_cycles() | EARLY_CYCLES_BIT);
> > +}
> > +
> > +/*
> > + * adjust_early_ts detects whether ts in is cycles or nanoseconds
> > + * and converts it or adjusts it, taking into account the offset
> > + * from cycle-counter start.
> > + *
> > + * Note that early_mult may be 0, but that's OK because
> > + * we'll just multiply by 0 and return 0. This will
> > + * only occur if we're outputting a printk message
> > + * before the calibration of the early timestamp.
> > + * Any output after user space start (eg. from dmesg or
> > + * journalctl) will show correct values.
> > + */
> > +static inline u64 adjust_early_ts(u64 ts)
> > +{
> > + if (likely(!(ts & EARLY_CYCLES_BIT)))
> > + /* if timestamp is not in cycles, just add offset */
> > + return ts + early_ts_offset;
> > +
> > + /* mask high bit and convert to nanoseconds */
> > + return mul_u64_u32_shr(ts & EARLY_CYCLES_MASK, early_mult, early_shift);
> > +}
> > +
> > +#else
> > +# define early_times_start_calibration() do { } while (0)
> > +# define early_times_finish_calibration() do { } while (0)
> > +
> > +static inline u64 early_cycles(void)
> > +{
> > + return 0;
> > +}
> > +
> > +static inline u64 adjust_early_ts(u64 ts)
> > +{
> > + return ts;
> > +}
> > +#endif /* CONFIG_EARLY_PRINTK_TIMES */
> > +
> > +#endif /* _KERNEL_PRINTK_EARLY_TIMES_H */
> > diff --git a/init/Kconfig b/init/Kconfig
> > index fa79feb8fe57..a928c1efb09d 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -777,6 +777,20 @@ config IKHEADERS
> > or similar programs. If you build the headers as a module, a module called
> > kheaders.ko is built which can be loaded on-demand to get access to headers.
> >
> > +config EARLY_PRINTK_TIMES
> > + bool "Show non-zero printk timestamps early in boot"
> > + default n
> > + depends on PRINTK
> > + depends on ARM64 || X86_64
> > + help
> > + Use a cycle-counter to provide printk timestamps during
> > + early boot. This allows seeing timestamps for printks that
> > + would otherwise show as 0. Note that this will shift the
> > + printk timestamps to be relative to processor power on, instead
> > + of relative to the start of kernel timekeeping. This should be
> > + closer to machine power on, giving a better indication of
> > + overall boot time.
> > +
> > config LOG_BUF_SHIFT
> > int "Kernel log buffer size (16 => 64KB, 17 => 128KB)"
> > range 12 25
> > diff --git a/init/main.c b/init/main.c
> > index b84818ad9685..d5774aec1aff 100644
> > --- a/init/main.c
> > +++ b/init/main.c
> > @@ -104,6 +104,7 @@
> > #include <linux/pidfs.h>
> > #include <linux/ptdump.h>
> > #include <linux/time_namespace.h>
> > +#include <linux/early_times.h>
> > #include <net/net_namespace.h>
> >
> > #include <asm/io.h>
> > @@ -1118,6 +1119,9 @@ void start_kernel(void)
> > timekeeping_init();
> > time_init();
> >
> > + /* This must be after timekeeping is initialized */
> > + early_times_start_calibration();
> > +
> > /* This must be after timekeeping is initialized */
> > random_init();
> >
> > @@ -1600,6 +1604,8 @@ static int __ref kernel_init(void *unused)
> >
> > do_sysctl_args();
> >
> > + early_times_finish_calibration();
> > +
> > if (ramdisk_execute_command) {
> > ret = run_init_process(ramdisk_execute_command);
> > if (!ret)
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index 1d765ad242b8..5afd31c3345c 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -46,6 +46,7 @@
> > #include <linux/ctype.h>
> > #include <linux/uio.h>
> > #include <linux/sched/clock.h>
> > +#include <linux/early_times.h>
> > #include <linux/sched/debug.h>
> > #include <linux/sched/task_stack.h>
> > #include <linux/panic.h>
> > @@ -75,6 +76,13 @@ EXPORT_SYMBOL(ignore_console_lock_warning);
> >
> > EXPORT_TRACEPOINT_SYMBOL_GPL(console);
> >
> > +#ifdef CONFIG_EARLY_PRINTK_TIMES
> > +cycles_t start_cycles;
> > +u64 start_ns;
> > +u32 early_mult, early_shift;
> > +u64 early_ts_offset;
> > +#endif
> > +
> > /*
> > * Low level drivers may need that to know if they can schedule in
> > * their unblank() callback or not. So let's export it.
> > @@ -639,7 +647,7 @@ static void append_char(char **pp, char *e, char c)
> > static ssize_t info_print_ext_header(char *buf, size_t size,
> > struct printk_info *info)
> > {
> > - u64 ts_usec = info->ts_nsec;
> > + u64 ts_usec = adjust_early_ts(info->ts_nsec);
> > char caller[20];
> > #ifdef CONFIG_PRINTK_CALLER
> > u32 id = info->caller_id;
> > @@ -1352,7 +1360,11 @@ static size_t print_syslog(unsigned int level, char *buf)
> >
> > static size_t print_time(u64 ts, char *buf)
> > {
> > - unsigned long rem_nsec = do_div(ts, 1000000000);
> > + unsigned long rem_nsec;
> > +
> > + ts = adjust_early_ts(ts);
> > +
> > + rem_nsec = do_div(ts, 1000000000);
> >
> > return sprintf(buf, "[%5lu.%06lu]",
> > (unsigned long)ts, rem_nsec / 1000);
> > @@ -2242,6 +2254,8 @@ int vprintk_store(int facility, int level,
> > * timestamp with respect to the caller.
> > */
> > ts_nsec = local_clock();
> > + if (!ts_nsec)
> > + ts_nsec = early_cycles();
> >
> > caller_id = printk_caller_id();
> >
> > --
> > 2.43.0
> >