Re: [RFC PATCH v2 0/8] timekeeping: Fix drift tracking precision and add feed-forward discipline via vmclock

From: David Woodhouse

Date: Mon May 25 2026 - 05:14:32 EST


On Mon, 2026-05-25 at 10:08 +0200, Miroslav Lichvar wrote:
> On Thu, May 21, 2026 at 10:54:41AM +0100, David Woodhouse wrote:
> > On Thu, 2026-05-21 at 08:35 +0200, Miroslav Lichvar wrote:
> > > Ok, but I don't see why the phase corrections of the guest need to be
> > > in the kernel.
> >
> > I'm not sure I understand. 
>
> <..clarification...>
>
> /* Compute phase offset at cycle_last and set time_offset to slew */
> ...
> ntp_set_time_offset(tk->id, ref_err >> tk->tkr_mono.shift);
>

Ah, I see. Thanks.

But that's just using ->time_offset which has *always* been in the
kernel.

It's the same mechanism to apply phase offset that everything else
(adjtime(), adjtimex(ADJ_SETOFFSET)) already uses.

The only thing that's different here is the calculation I elided
between the comment and the ntp_set_time_offset() call shown there,
which computes *precisely* the offset to set in order to match the
desired reference.

There's nothing fundamental in the actual *timekeeping* here that
hasn't already been in the guest kernel for decades; I'm just fixing a
few arithmetic errors in the core code, and then *driving* it more
precisely using its existing parameters (tick_length, time_offset).
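
For readers less familiar with those two parameters, here is a minimal
sketch of how a pending phase offset gets folded into the tick length
once per second. It is a simplified userspace model, loosely following
what mainline's second_overflow() does in the non-PPS case, and is not
code from this series. The struct, the function names and the MODEL_
prefix are hypothetical, and the units are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: same value as mainline's SHIFT_PLL. */
#define MODEL_SHIFT_PLL 2

/* Hypothetical model state; units are arbitrary fixed-point nanoseconds. */
struct slew_model {
	int64_t time_offset;       /* phase offset still to be slewed out */
	int64_t tick_length_base;  /* nominal length of one second */
	int64_t tick_length;       /* length actually used for the next second */
	int     time_constant;     /* PLL time constant, must be >= 0 */
};

/* Sign-preserving right shift, so negative offsets also decay towards zero. */
static int64_t model_shift_right(int64_t x, int s)
{
	return x < 0 ? -(-x >> s) : x >> s;
}

/* Once per second: move a chunk of the offset into the tick length. */
static void model_second_overflow(struct slew_model *m)
{
	int64_t delta;

	assert(m->time_constant >= 0);
	delta = model_shift_right(m->time_offset,
				  MODEL_SHIFT_PLL + m->time_constant);

	m->tick_length  = m->tick_length_base;
	m->time_offset -= delta;
	m->tick_length += delta;
}
```

In this model whatever is removed from time_offset shows up in
tick_length for the following second, so the offset decays
geometrically rather than being applied as a step.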

> There might be a disagreement on terminology.

That will be entirely my fault.

> > It seems that when Julien et al lamented that, "Until now, however,
> > there has been a serious practical issue inhibiting feed-forward
> > approaches: a lack of kernel support", the basics were actually there
> > in the kernel's core timekeeping all along.
>
> From my point of view, the only missing piece is software timestamping
> of packets using other clocks than CLOCK_REALTIME.

For literal NTP, you mean? Yes, that makes sense. And having the NIC
timestamp the packets using PTM would be great too.

> > > > And TSC scaling is pretty much x86-specific; other architectures have a
> > > > *defined* counter frequency and don't need to support scaling.
> > >
> > > There can be a software fallback if hardware scaling and/or offset is
> > > not supported.
> >
> > Right. This *is* the software fallback, because the hardware scaling
> > and offset aren't sufficient even if we only care about x86 where the
> > former is supported.
>
> IMHO it's a solution done at a wrong layer.

Understood. What do you believe is the better solution?

Aside from the case of actually using NTP or a PHC to discipline the
kernel's CLOCK_REALTIME, the use cases I'm trying to enable are:

• (Micro)VM guest is *given* the TSC→realtime relationship in a virt
  enlightenment, gets an interrupt whenever it changes. Can react to
  that interrupt and steer the kernel's timekeeping as quickly as any
  userspace dæmon could do anything.

• Dedicated virtual hosting environment needs to discipline the *TSC*
  directly against external references (PHC, 1PPS) in order to provide
  said virt enlightenment directly to guests and allow for accurate
  migration. This environment does not care about the host's actual
  CLOCK_REALTIME; that's basically cosmetic for logging purposes.

• Multi-purpose environment has a standard ntpd/chrony setup, wants
  QEMU to be able to provide the same virt enlightenment based on
  the kernel's own timekeeping.
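
All three cases revolve around the same linear counter-to-realtime
relationship. A minimal sketch of evaluating it, and of the phase error
of a clock reading against it, might look like the following. The
struct layout and all names are hypothetical and deliberately simpler
than any real enlightenment's ABI; unsigned __int128 is a GCC/Clang
extension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a counter->realtime relationship. */
struct time_ref {
	uint64_t counter;      /* counter value at the reference point */
	uint64_t time_ns;      /* realtime in ns at that counter value */
	uint64_t period_frac;  /* ns per counter tick, scaled by 2^shift */
	unsigned int shift;    /* fixed-point shift, must be < 64 */
};

/* Realtime in ns for a given counter reading, per the linear relationship. */
static uint64_t ref_time_ns(const struct time_ref *ref, uint64_t counter)
{
	uint64_t delta = counter - ref->counter;
	unsigned __int128 ns;

	assert(ref->shift < 64);
	/* 128-bit intermediate so that delta * period_frac cannot overflow. */
	ns = (unsigned __int128)delta * ref->period_frac;
	return ref->time_ns + (uint64_t)(ns >> ref->shift);
}

/* Signed phase error in ns: positive means the clock is behind the reference. */
static int64_t phase_error_ns(const struct time_ref *ref, uint64_t counter,
			      uint64_t clock_ns)
{
	return (int64_t)(ref_time_ns(ref, counter) - clock_ns);
}
```

A consumer would pair one counter read with one clock read and feed the
result of phase_error_ns() into whatever mechanism it uses to steer the
clock.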

Thomas and I seemed to be agreeing on a clock_[sg]et_time_reference()
API which would allow for all of the above, with basically no change to
the kernel's actual timekeeping: again, it's just exposing the
*existing* parameters allowing for more precise control and visibility.

Especially as userspace currently has no way to see what the kernel
*thinks* the time should be at any given moment. It can only see the
actual output of CLOCK_REALTIME, which is sawtoothing *around* the
'intended' value tracked by ntp_error, tick by tick.
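
A toy model of that sawtooth, assuming the only steering available per
tick is an integer multiplier that is either a base value or the base
value plus one. This is a simplification of the mainline behaviour, and
the struct and function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model; all times are in the same fixed-point ns units. */
struct sawtooth_model {
	int64_t  ntp_error;        /* intended time minus actual clock time */
	uint64_t cycles_per_tick;  /* counter cycles in one tick */
	uint64_t base_mult;        /* integer multiplier, rounded down */
	uint64_t ideal_tick;       /* intended clock advance per tick */
};

/* Advance the model by one tick and return the updated error. */
static int64_t model_tick(struct sawtooth_model *m)
{
	uint64_t mult;

	assert(m->cycles_per_tick > 0);
	/* If the clock is behind the intended time, run one step faster. */
	mult = m->base_mult + (m->ntp_error > 0 ? 1 : 0);
	m->ntp_error += (int64_t)m->ideal_tick -
			(int64_t)(m->cycles_per_tick * mult);
	return m->ntp_error;
}
```

With cycles_per_tick = 1000, base_mult = 4000 and ideal_tick = 4000300,
successive calls return 300, -400, -100, 200 and so on: the error never
settles, it just oscillates around zero, which is all an observer of
the clock's output gets to see.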

I was about to knock up a prototype of that, probably based on ioctls
or read/write on a miscdev for now, just for the proof of concept. All
the boilerplate of actual system call stuff can come later, if we like
it.

If you have a better suggestion, I'm more than happy to entertain it.
