Re: [PATCH 09/13] KVM: arm64: Add clock for hyp tracefs

From: John Stultz
Date: Fri Sep 13 2024 - 19:21:59 EST


On Wed, Sep 11, 2024 at 2:31 AM Vincent Donnefort <vdonnefort@xxxxxxxxxx> wrote:
>
> Configure the hypervisor tracing clock before starting tracing. For
> tracing purpose, the boot clock is interesting as it doesn't stop on
> suspend. However, it is corrected on a regular basis, which implies we
> need to re-evaluate it every once in a while.
>
> Cc: John Stultz <jstultz@xxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Stephen Boyd <sboyd@xxxxxxxxxx>
> Cc: Christopher S. Hall <christopher.s.hall@xxxxxxxxx>
> Cc: Richard Cochran <richardcochran@xxxxxxxxx>
> Cc: Lakshmi Sowjanya D <lakshmi.sowjanya.d@xxxxxxxxx>
> Signed-off-by: Vincent Donnefort <vdonnefort@xxxxxxxxxx>
>
...
> +static void __hyp_clock_work(struct work_struct *work)
> +{
> + struct delayed_work *dwork = to_delayed_work(work);
> + struct hyp_trace_buffer *hyp_buffer;
> + struct hyp_trace_clock *hyp_clock;
> + struct system_time_snapshot snap;
> + u64 rate, delta_cycles;
> + u64 boot, delta_boot;
> + u64 err = 0;
> +
> + hyp_clock = container_of(dwork, struct hyp_trace_clock, work);
> + hyp_buffer = container_of(hyp_clock, struct hyp_trace_buffer, clock);
> +
> + ktime_get_snapshot(&snap);
> + boot = ktime_to_ns(snap.boot);
> +
> + delta_boot = boot - hyp_clock->boot;
> + delta_cycles = snap.cycles - hyp_clock->cycles;
> +
> + /* Compare hyp clock with the kernel boot clock */
> + if (hyp_clock->mult) {
> + u64 cur = delta_cycles;
> +
> + cur *= hyp_clock->mult;

This multiplication probably needs overflow protection (I see you
already have a max_delta value that could be used for it).
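
To illustrate the kind of guard I have in mind, here is a rough
userspace sketch, not a drop-in for the patch. The struct and the
helper names (clock_sketch, max_delta_for, cycles_to_boot_ns) are
made up for the example; only the max_delta formula is taken from
the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the hyp_trace_clock fields used here. */
struct clock_sketch {
	uint64_t boot;		/* epoch, in ns */
	uint64_t cycles;	/* epoch, in counter cycles */
	uint64_t max_delta;	/* largest delta that is safe to multiply */
	uint32_t mult;
	uint32_t shift;
};

/* Same formula as the patch: keep delta * mult well inside 64 bits. */
static uint64_t max_delta_for(uint32_t mult)
{
	return mult ? (UINT64_MAX / mult) >> 1 : 0;
}

/*
 * Convert a cycle count to boot-clock ns relative to the stored epoch.
 * Returns 0 on success, -1 if the multiplication could overflow and
 * the caller should refresh the epoch instead of trusting the result.
 */
static int cycles_to_boot_ns(const struct clock_sketch *c,
			     uint64_t cycles, uint64_t *ns)
{
	uint64_t delta = cycles - c->cycles;

	if (!c->mult || delta > c->max_delta)
		return -1;

	*ns = ((delta * c->mult) >> c->shift) + c->boot;
	return 0;
}
```

In the work function that would mean checking delta_cycles against
max_delta before the multiply, rather than only in the !err path.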

> + cur >>= hyp_clock->shift;
> + cur += hyp_clock->boot;
> +
> + err = abs_diff(cur, boot);
> +
> + /* No deviation, only update epoch if necessary */
> + if (!err) {
> + if (delta_cycles >= hyp_clock->max_delta)
> + goto update_hyp;
> +
> + goto resched;
> + }
> +
> + /* Warn if the error is above tracing precision (1us) */
> + if (hyp_buffer->tracing_on && err > NSEC_PER_USEC)
> + pr_warn_ratelimited("hyp trace clock off by %lluus\n",
> + err / NSEC_PER_USEC);

I'm curious: in practice, does this warning come up often? If so, does
the error converge down nicely? Have you done much disruption testing
using adjtimex?

> + }
> +
> + if (delta_boot > U32_MAX) {
> + do_div(delta_boot, NSEC_PER_SEC);
> + rate = delta_cycles;
> + } else {
> + rate = delta_cycles * NSEC_PER_SEC;
> + }
> +
> + do_div(rate, delta_boot);
> +
> + clocks_calc_mult_shift(&hyp_clock->mult, &hyp_clock->shift,
> + rate, NSEC_PER_SEC, CLOCK_MAX_CONVERSION_S);
> +
> +update_hyp:
> + hyp_clock->max_delta = (U64_MAX / hyp_clock->mult) >> 1;
> + hyp_clock->cycles = snap.cycles;
> + hyp_clock->boot = boot;
> + kvm_call_hyp_nvhe(__pkvm_update_clock_tracing, hyp_clock->mult,
> + hyp_clock->shift, hyp_clock->boot, hyp_clock->cycles);
> + complete(&hyp_clock->ready);

I'm very forgetful, so maybe it's unnecessary, but for future-you or
just others like me, it might be worth adding some extra comments to
clarify the assumptions in these calculations.
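
As an example of the sort of comments I mean, here is the rate
estimate pulled out into a standalone userspace sketch. The function
name estimate_rate_hz is hypothetical, plain division stands in for
do_div(), and the comments are my reading of the math, so please
correct them if I have the assumptions wrong.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Estimate the counter rate (cycles per second) from the cycles and
 * boot-clock ns elapsed since the last epoch.
 */
static uint64_t estimate_rate_hz(uint64_t delta_cycles,
				 uint64_t delta_boot_ns)
{
	if (!delta_boot_ns)
		return 0;	/* no time elapsed, nothing to estimate */

	if (delta_boot_ns > UINT32_MAX) {
		/*
		 * Long interval (more than ~4.29s): scale the divisor
		 * down to whole seconds so it stays small. The
		 * sub-second remainder is dropped, so the estimate is
		 * coarse for intervals only slightly above the
		 * threshold.
		 */
		uint64_t secs = delta_boot_ns / NSEC_PER_SEC;

		return delta_cycles / secs;
	}

	/*
	 * Short interval: scale the cycles up instead. This assumes
	 * delta_cycles * NSEC_PER_SEC fits in 64 bits, which holds as
	 * long as the counter runs below roughly 4.3GHz.
	 */
	return (delta_cycles * NSEC_PER_SEC) / delta_boot_ns;
}
```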


thanks
-john