Re: [PATCH net-next v8 3/3] gve: implement PTP gettimex64

From: David Woodhouse

Date: Wed May 27 2026 - 16:03:41 EST


On Wed, 2026-05-27 at 11:46 -0700, Jordan Rhee wrote:
> Hello,
>
> We would like to request the community's guidance on how to proceed
> with adding
> cross-timestamp functionality to GVE. The key challenge is that the
> NIC clock can only be sampled in the hypervisor because the NIC time
> registers are only present on the PF, and a hypercall takes 60-80us.
> To minimize the uncertainty window, the hypervisor takes a TSC
> sandwich around the reading with an uncertainty window of ~1.2us,
> converts it to guest TSC, and returns it to the guest.
>
> The goals which led to this implementation were to minimize the
> uncertainty window and communicate the uncertainty window up the
> stack. I've reviewed
> https://lore.kernel.org/netdev/20260526165826.392227559@xxxxxxxxxx/
> but I'm not sure how we would use it, so if it is intended to support
> our use case I would appreciate if someone could explain how we
> should use it.
>
> Here are some options on how to proceed, but other ideas are welcome.

So, your hypercall gives you the ABA timestamps of
PTP_SYS_OFFSET_EXTENDED where the 'A' timestamps are in the form of
guest TSC readings? It does not give you *simultaneous* capture and
thus you can not, and should attempt to, implement
PTP_SYS_OFFSET_PRECISE? Don't even think about that "midpoint" crap?

And your driver brackets the hypercalls with *actual* guest TSC reads,
to keep the hypervisor honest?

Based on Thomas's branch, I'd probably do something like:

• ptp_read_system_prets()
• Make the hypercall
• ptp_read_system_postts()

Then do your sanity check that the guest TSC values provided by the
hypervisor are reasonable (inside the raw_cycles¹ values in the guest
pre/post snapshots). If not, return immediately using the wider
uncertainty window.

If you *do* believe the hypervisor-provided TSC readings, can't you
convert *each* of them to the system clock using
get_device_system_crosststamp(), a history from your captured prets,
and a get_time_fn callback that just creates xtstamp->sys_counter from
the hypervisor-provided TSC value for pre/post respectively, each time
you call it?

Take the code sample below with a *massive* pinch of salt; I only
shouted at the AI a few times when asking it to translate the
suggestion above into sample code, and I haven't gone over it with a
fine-tooth comb. But hopefully it helps enlighten what I'm
suggesting...

/*
* Suggested approach for GVE PTP gettimex64 with hypervisor-provided
* TSC values converted via get_device_system_crosststamp().
*
* This gives PTP_SYS_OFFSET_EXTENDED with ~1.2µs uncertainty
* (from the hypervisor's TSC sandwich) rather than ~60-80µs
* (from the guest-side hypercall latency).
*/

struct gve_xtstamp_ctx {
u64 cycles;
};

static int gve_get_time_fn(ktime_t *device_time,
struct system_counterval_t *sys_counter,
void *_ctx)
{
struct gve_xtstamp_ctx *ctx = _ctx;

*device_time = 0;
sys_counter->cycles = ctx->cycles;
sys_counter->cs_id = CSID_X86_TSC;
return 0;
}

static int gve_ptp_gettimex64(struct ptp_clock_info *info,
struct timespec64 *ts,
struct ptp_system_timestamp *sts)
{
struct gve_ptp *ptp = container_of(info, struct gve_ptp, info);
struct gve_sysclock_sample sysclock;
u64 nic_ts;
int err;

mutex_lock(&ptp->nic_ts_read_lock);
ptp_read_system_prets(sts);
err = gve_ptp_read_timestamp(ptp, &sysclock);
ptp_read_system_postts(sts);
mutex_unlock(&ptp->nic_ts_read_lock);
if (err)
return err;

nic_ts = sysclock.nic_timestamp;
*ts = ns_to_timespec64(nic_ts);

if (!sts)
return 0;

/*
* Sanity check: hypervisor-provided TSC values must fall
* within our guest-side bracket (using raw_cycles from the
* snapshots taken by ptp_read_system_prets/postts).
*/
if (sysclock.nic_pre_cycles >= sts->pre_sts.raw_cycles &&
sysclock.nic_post_cycles <= sts->post_sts.raw_cycles) {
struct system_device_crosststamp xtstamp;
struct gve_xtstamp_ctx ctx;

/*
* Hypervisor values are trustworthy. Convert each TSC
* value to system time, narrowing the pre/post window
* from ~60-80µs to ~1.2µs.
*/
ctx.cycles = sysclock.nic_pre_cycles;
err = get_device_system_crosststamp(gve_get_time_fn, &ctx,
&sts->pre_sts, &xtstamp);
if (!err) {
sts->pre_sts.sys = xtstamp.sys_realtime;
sts->pre_sts.raw = xtstamp.sys_monoraw;
sts->pre_sts.cycles = sysclock.nic_pre_cycles;
sts->pre_sts.cs_id = CSID_X86_TSC;
}

ctx.cycles = sysclock.nic_post_cycles;
err = get_device_system_crosststamp(gve_get_time_fn, &ctx,
&sts->pre_sts, &xtstamp);
if (!err) {
sts->post_sts.sys = xtstamp.sys_realtime;
sts->post_sts.raw = xtstamp.sys_monoraw;
sts->post_sts.cycles = sysclock.nic_post_cycles;
sts->post_sts.cs_id = CSID_X86_TSC;
}
}
/* else: keep the wider pre/post from ptp_read_system_prets/postts */

return 0;
}

¹ https://lore.kernel.org/all/20260526230635.136914-1-dwmw2@xxxxxxxxxxxxx/

Attachment: smime.p7s
Description: S/MIME cryptographic signature