RE: [PATCH v3] printk: fix zero-valued printk timestamps in early boot

From: Bird, Tim

Date: Tue Mar 10 2026 - 14:58:18 EST


Thank you both Petr and Shashank for the review and feedback.

See my responses inline below.

> -----Original Message-----
> From: Petr Mladek <pmladek@xxxxxxxx>
>
> On Tue 2026-03-10 04: 25: 33, Shashank Balaji wrote: > Hi again, > > On Tue, Feb 10, 2026 at 04: 47: 41PM -0700, Tim Bird wrote: > > During
> early boot, printk timestamps are reported as zero before > > kernel timekeeping starts
> On Tue 2026-03-10 04:25:33, Shashank Balaji wrote:
> > Hi again,
> >
> > On Tue, Feb 10, 2026 at 04:47:41PM -0700, Tim Bird wrote:
> > > During early boot, printk timestamps are reported as zero before
> > > kernel timekeeping starts (e.g. before time_init()). This
> > > hinders boot-time optimization efforts. This period is about 400
> > > milliseconds for many current desktop and embedded machines
> > > running Linux.
> > >
> > > Add support to save cycles during early boot, and output correct
> > > timestamp values after timekeeping is initialized. get_cycles()
> > > is operational on arm64 and x86_64 from kernel start. Add code
> > > and variables to save calibration values used to later convert
> > > cycle counts to time values in the early printks. Add a config
> > > to control the feature.
> > >
> > > This yields non-zero timestamps for printks from the very start
> > > of kernel execution. The timestamps are relative to the start of
> > > the architecture-specified counter used in get_cycles
> > > (e.g. the TSC on x86_64 and cntvct_el0 on arm64).
> > >
> > > All timestamps reflect time from processor power-on instead of
> > > time from the kernel's timekeeping initialization.
> > >
> > > Signed-off-by: Tim Bird <tim.bird@xxxxxxxx>
> >
> > So if a console is read before the cycles -> timestamp conversion can
> > happen, then they'll see 0. But reading from userspace will give the
> > right timestamps.
> >
> > Based on the previous discussions, to address this possible confusion,
> > if changing the timestamp format, like adding '?', is a no-go because
> > of concerns of breaking existing monitoring tools, what about appending
> > something to the printk string after the timestamp? Hmm, no, that'll
> > affect grep'ability _and_ may break monitoring tools. Or what about a
> > pr_warn() early in boot to warn about the possible timestamp difference?
>
> Or we could make it more obvious from the message in
> early_times_finish_calibration(), see below.
>
> > At the very least the possibility of this difference should be
> > documented in the Kconfig description.
>
> Yeah, it would be nice to mention this in the Kconfig description.

I'll work on a v4 version of the patch, and add some text
to mention this.

>
> > > --- /dev/null
> > > +++ b/include/linux/early_times.h
> > > @@ -0,0 +1,85 @@
> > > +static inline void early_times_finish_calibration(void)
> > > +{
> > > + cycles_t end_cycles;
> > > + u64 end_ns;
> > > +
> > > + /* set calibration data for early_printk_times */
> > > + end_cycles = get_cycles();
> > > + end_ns = local_clock();
> > > + clocks_calc_mult_shift(&early_mult, &early_shift,
> > > + mul_u64_u64_div_u64(end_cycles - start_cycles,
> > > + NSEC_PER_SEC, end_ns - start_ns),
> > > + NSEC_PER_SEC, 100);
> > > + early_ts_offset = mul_u64_u32_shr(start_cycles, early_mult, early_shift) - start_ns;
> > > +
> > > + pr_debug("Early printk times: mult=%u, shift=%u, offset=%llu ns\n",
> > > + early_mult, early_shift, early_ts_offset);
>
> We might make it more obvious that an offset will get added to the
> existing timestamp since this point.
>
> Also it has a "surprising" user visible effect so that it should
> be pr_info() instead of pr_debug(). Note pr_debug() messages might be hidden.
>
> A minimalist change would be:
>
> pr_info("Calibrated offset for early printk times: mult=%u, shift=%u, offset=%llu ns\n",
> early_mult, early_shift, early_ts_offset);

This sounds like a good idea. Will do.


> And/Or we might add one more line:
>
> pr_info("The time offset is added for existing and newly added printk messages since now!");
I think adding a message is a possibility, to explain the offset.

Let me consider the wordings for this. Using 'now' is a bit ambiguous, because the offset is
always used in post-boot output, such as from dmesg or journalctl, even for messages earlier
in the log.

>
> > > +}
> > > +
> > > --- a/kernel/printk/printk.c
> > > +++ b/kernel/printk/printk.c
> > > @@ -639,7 +647,7 @@ static void append_char(char **pp, char *e, char c)
> > > static ssize_t info_print_ext_header(char *buf, size_t size,
> > > struct printk_info *info)
> > > {
> > > - u64 ts_usec = info->ts_nsec;
> > > + u64 ts_usec = adjust_early_ts(info->ts_nsec);
> >
> > printk_get_next_message() calls info_print_ext_header() for an extended
> > console (/dev/kmsg and netcon_ext use this), whereas for
> > non-extended consoles, record_print_text() -> info_print_prefix() ->
> > print_time() is called. So, this adjustment should be made in
> > print_time() too, otherwise non-extended console users are gonna be
> > spooked with insane timestamps.
>
> The v3 patch already modifies print_time().
>
> > This may explain the non-zero early
> > timestamps Petr saw in his serial console output [1].
>
> I am a bit confused now. There are three stages:
>
> 1. Early messages where the cycles are stored.
>
> The serial console shows zero time stamp because
> it reads the messages _before the calibration_, e.g.
>
> [ 0.000000] Linux version 6.19.0-rc7-default+ (pmladek@pathway) (gcc (SUSE Linux) 15.2.1 20251006, GNU ld (GNU Binutils;
> openSUSE Tumbleweed) 2.45.0.20251103-2) #521 SMP PREEMPT_DYNAMIC Mon Feb 2 16:36:53 CET 2026
> [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.19.0-rc7-default+ root=UUID=587ae802-e330-4059-9b48-
> d5b845e1075a resume=/dev/disk/by-uuid/369c7453-3d16-409d-88b2-5de027891a12 mitigations=auto nosplash
> earlycon=uart8250,io,0x3f8,115200 console=ttyS0,115200 console=ttynull console=tty0 debug_non_panic_cpus=1 panic=10 ignore_loglevel
> log_buf_len=1M
> [ 0.000000] BIOS-provided physical RAM map:
> [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
>
> But "dmesg" shows some numbers because it reads the messages
> _after the calibration_:
>
> [ 8.853613] Linux version 6.19.0-rc7-default+ (pmladek@pathway) (gcc (SUSE Linux) 15.2.1 20251006, GNU ld (GNU Binutils;
> openSUSE Tumbleweed) 2.45.0.20251103-2) #521 SMP PREEMPT_DYNAMIC Mon Feb 2 16:36:53 CET 2026
> [ 8.853617] Command line: BOOT_IMAGE=/boot/vmlinuz-6.19.0-rc7-default+ root=UUID=587ae802-e330-4059-9b48-
> d5b845e1075a resume=/dev/disk/by-uuid/369c7453-3d16-409d-88b2-5de027891a12 mitigations=auto nosplash
> earlycon=uart8250,io,0x3f8,115200 console=ttyS0,115200 console=ttynull console=tty0 debug_non_panic_cpus=1 panic=10 ignore_loglevel
> log_buf_len=1M
> [ 8.865086] BIOS-provided physical RAM map:
> [ 8.865087] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> [ 8.865089] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
>
>
> 2. Early messages added _after the timekeeping_ is initialized
> but _before the early cycles calibration_.
>
> They serial console prints them _without the offset_ because
> it reads them _before the calibration_, e.g.
>
> [ 3.288049][ T1] Write protecting the kernel read-only data: 36864k
> [ 3.298554][ T1] Freeing unused kernel image (text/rodata gap) memory: 1656K
> [ 3.318942][ T1] Freeing unused kernel image (rodata/data gap) memory: 1540K
>
> But "dmesg" prints them _with the offset_ because it reads them
> _after the calibration_, e.g.
>
> [ 12.179999] [ T1] Write protecting the kernel read-only data: 36864k
> [ 12.190505] [ T1] Freeing unused kernel image (text/rodata gap) memory: 1656K
> [ 12.210893] [ T1] Freeing unused kernel image (rodata/data gap) memory: 1540K
>
>
> 3. Messages added after the calibration of the early cycles.
>
> They are printed with the offset by both serial console and
> dmesg, e.g.
>
> [ 12.230014][ T1] Early printk times: mult=38775352, shift=27, offset=8891950261 ns
> [ 12.246008][ T1] Run /init as init process
> [ 12.254944][ T1] with arguments:
> [ 12.264341][ T1] /init

This is correct. I don't want to overwhelm users of this, but there are three time-gathering periods, and roughly
2 output times (before calibration and after calibration).

early boot = before time init, before cycles calibration and offset calculation
mid boot = after time init, before cycles calibration and offset calculation
late boot = after time init, after cycles calibration and offset calculation
All of these are before the start of user space processes.

time of printk output time timestamp type stored timestamp output
------------------ ---------------- ----------------------------- ------------------------
early boot early boot cycles 0
mid boot mid-boot nanosecs seconds, with offset from time_init
late boot late-boot nanosecs seconds with offset from cycle counter start
-----
early boot post-boot (cycles already stored) seconds with offset from cycle counter start
mid boot post-boot (ns already stored) seconds with offset from cycle counter start
late boot post-boot (ns already stored) seconds with offset from cycle counter start
all others post-boot nanosecs seconds with offset from cycle counter start

The confusing thing is messages that are output to the console before calibration and offset calculation.
Reports from all user space tools (ie from dmesg or journalctl) should be correct and consistent.

>
> > An accessor can be implemented for (struct printk_info).ts_nsec, say
> > get_timestamp(), which can be called from both the places.
>
> Yeah, a helper function for reading the timestamp might be a cleaner solution.
>

I consider adjust_early_ts() to be such an accessor function. It's supposed
to hide the details of the type of the timestamp (cycles or ns) and the offset.
Maybe this could be renamed to something better, like: get_adjusted_ts()?
Let me know what you think.

> > [1] https://lore.kernel.org/all/aYDPn2EJgJIWGDhM@pathway/
>
> Everything seems to be as expected there. The non-zero timestamps
> on the serial console are from messages added after the timekeeping
> was initialized.

I'll rebase this patch to resolve the #include conflicts, and address
this feedback, and hopefully get a new version out this week.

Thanks again for the discussion and ideas!
-- Tim