Re: [PATCH net-next v8 3/3] gve: implement PTP gettimex64
From: Jordan Rhee
Date: Thu May 28 2026 - 12:47:39 EST
On Wed, May 27, 2026 at 1:03 PM David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote:
>
> On Wed, 2026-05-27 at 11:46 -0700, Jordan Rhee wrote:
> > Hello,
> >
> > We would like to request the community's guidance on how to proceed
> > with adding
> > cross-timestamp functionality to GVE. The key challenge is that the
> > NIC clock can only be sampled in the hypervisor because the NIC time
> > registers are only present on the PF, and a hypercall takes 60-80us.
> > To minimize the uncertainty window, the hypervisor takes a TSC
> > sandwich around the reading with an uncertainty window of ~1.2us,
> > converts it to guest TSC, and returns it to the guest.
> >
> > The goals which led to this implementation were to minimize the
> > uncertainty window and communicate the uncertainty window up the
> > stack. I've reviewed
> > https://lore.kernel.org/netdev/20260526165826.392227559@xxxxxxxxxx/
> > but I'm not sure how we would use it, so if it is intended to support
> > our use case I would appreciate if someone could explain how we
> > should use it.
> >
> > Here are some options on how to proceed, but other ideas are welcome.
>
> So, your hypercall gives you the ABA timestamps of
> PTP_SYS_OFFSET_EXTENDED where the 'A' timestamps are in the form of
> guest TSC readings? It does not give you *simultaneous* capture and
> thus you can not, and should attempt to, implement
> PTP_SYS_OFFSET_PRECISE? Don't even think about that "midpoint" crap?
>
> And your driver brackets the hypercalls with *actual* guest TSC reads,
> to keep the hypervisor honest?
>
> Based on Thomas's branch, I'd probably do something like:
>
> • ptp_read_system_prets()
> • Make the hypercall
> • ptp_read_system_postts()
>
> Then do your sanity check that the guest TSC values provided by the
> hypervisor are reasonable (inside the raw_cycles¹ values in the guest
> pre/post snapshots). If not, return immediately using the wider
> uncertainty window.
>
> If you *do* believe the hypervisor-provided TSC readings, can't you
> convert *each* of them to the system clock using
> get_device_system_crosststamp(), a history from your captured prets,
> and a get_time_fn callback that just creates xtstamp->sys_counter from
> the hypervisor-provided TSC value for pre/post respectively, each time
> you call it?
Thank you David, that's a great idea. We will respin with this
approach after https://lore.kernel.org/netdev/20260526165826.392227559@xxxxxxxxxx/
lands. In the meantime, we have some other GVE patches that we need to
send up.
>
> Take the code sample below with a *massive* pinch of salt; I only
> shouted at the AI a few times when asking it to translate the
> suggestion above into sample code, and I haven't gone over it with a
> fine-tooth comb. But hopefully it helps enlighten what I'm
> suggesting...
>
> /*
> * Suggested approach for GVE PTP gettimex64 with hypervisor-provided
> * TSC values converted via get_device_system_crosststamp().
> *
> * This gives PTP_SYS_OFFSET_EXTENDED with ~1.2µs uncertainty
> * (from the hypervisor's TSC sandwich) rather than ~60-80µs
> * (from the guest-side hypercall latency).
> */
>
> struct gve_xtstamp_ctx {
> u64 cycles;
> };
>
> static int gve_get_time_fn(ktime_t *device_time,
> struct system_counterval_t *sys_counter,
> void *_ctx)
> {
> struct gve_xtstamp_ctx *ctx = _ctx;
>
> *device_time = 0;
> sys_counter->cycles = ctx->cycles;
> sys_counter->cs_id = CSID_X86_TSC;
> return 0;
> }
>
> static int gve_ptp_gettimex64(struct ptp_clock_info *info,
> struct timespec64 *ts,
> struct ptp_system_timestamp *sts)
> {
> struct gve_ptp *ptp = container_of(info, struct gve_ptp, info);
> struct gve_sysclock_sample sysclock;
> u64 nic_ts;
> int err;
>
> mutex_lock(&ptp->nic_ts_read_lock);
> ptp_read_system_prets(sts);
> err = gve_ptp_read_timestamp(ptp, &sysclock);
> ptp_read_system_postts(sts);
> mutex_unlock(&ptp->nic_ts_read_lock);
> if (err)
> return err;
>
> nic_ts = sysclock.nic_timestamp;
> *ts = ns_to_timespec64(nic_ts);
>
> if (!sts)
> return 0;
>
> /*
> * Sanity check: hypervisor-provided TSC values must fall
> * within our guest-side bracket (using raw_cycles from the
> * snapshots taken by ptp_read_system_prets/postts).
> */
> if (sysclock.nic_pre_cycles >= sts->pre_sts.raw_cycles &&
> sysclock.nic_post_cycles <= sts->post_sts.raw_cycles) {
> struct system_device_crosststamp xtstamp;
> struct gve_xtstamp_ctx ctx;
>
> /*
> * Hypervisor values are trustworthy. Convert each TSC
> * value to system time, narrowing the pre/post window
> * from ~60-80µs to ~1.2µs.
> */
> ctx.cycles = sysclock.nic_pre_cycles;
> err = get_device_system_crosststamp(gve_get_time_fn, &ctx,
> &sts->pre_sts, &xtstamp);
> if (!err) {
> sts->pre_sts.sys = xtstamp.sys_realtime;
> sts->pre_sts.raw = xtstamp.sys_monoraw;
> sts->pre_sts.cycles = sysclock.nic_pre_cycles;
> sts->pre_sts.cs_id = CSID_X86_TSC;
> }
>
> ctx.cycles = sysclock.nic_post_cycles;
> err = get_device_system_crosststamp(gve_get_time_fn, &ctx,
> &sts->pre_sts, &xtstamp);
> if (!err) {
> sts->post_sts.sys = xtstamp.sys_realtime;
> sts->post_sts.raw = xtstamp.sys_monoraw;
> sts->post_sts.cycles = sysclock.nic_post_cycles;
> sts->post_sts.cs_id = CSID_X86_TSC;
> }
> }
> /* else: keep the wider pre/post from ptp_read_system_prets/postts */
>
> return 0;
> }
>
> ¹ https://lore.kernel.org/all/20260526230635.136914-1-dwmw2@xxxxxxxxxxxxx/