Re: [RFC PATCH 2/2] KVM: arm64: export cntvoff in debugfs

From: Marcelo Tosatti
Date: Fri Nov 19 2021 - 07:59:59 EST


On Fri, Nov 19, 2021 at 12:17:00PM +0000, Marc Zyngier wrote:
> On Fri, 19 Nov 2021 10:21:18 +0000,
> Nicolas Saenz Julienne <nsaenzju@xxxxxxxxxx> wrote:
> >
> > While using cntvct as the raw clock for tracing, it's possible to
> > synchronize host/guest traces just by knowing the virtual offset applied
> > to the guest's virtual counter.
> >
> > This is also the case on x86 when TSC is available. The offset is
> > exposed in debugfs as 'tsc-offset' on a per vcpu basis. So let's
> > implement the same for arm64.
>
> How does this work with NV, where the guest hypervisor is in control
> of the virtual offset? How does userspace knows which vcpu to pick so
> that it gets the right offset?

On x86, the offsets for different vcpus are the same due to the logic in
the kvm_synchronize_tsc() function:

During guest vcpu creation, when the TSC values are written within
a short window of time (or the value written is zero), the code uses
the same TSC offset for those vcpus.
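
As a rough illustration only (a simplified model, not the kernel code;
the function and parameter names are hypothetical), the check amounts to
asking whether the written value is zero or falls within about one
second's worth of cycles of the value expected from the previous write:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified model of the synchronization heuristic.
 * data:           TSC value being written by userspace
 * last_write:     TSC value from the previous write
 * elapsed_cycles: cycles elapsed since that previous write
 * tsc_hz:         guest TSC frequency, i.e. cycles in one second
 */
static bool tsc_write_is_sync_attempt(uint64_t data, uint64_t last_write,
				      uint64_t elapsed_cycles, uint64_t tsc_hz)
{
	uint64_t expected = last_write + elapsed_cycles;

	/* A write of zero is always treated as a synchronization attempt. */
	if (data == 0)
		return true;

	/* Otherwise the value must be within one second of the expected one. */
	return data < expected + tsc_hz && data + tsc_hz > expected;
}
```

A write that lands just outside that window is treated as a new,
unrelated TSC value, which is how vcpus can end up with different
offsets.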

This logic is problematic (since "short window of time" is a heuristic
which can fail), and is being replaced by writing the same offset
for each vCPU:

commit 828ca89628bfcb1b8f27535025f69dd00eb55207
Author: Oliver Upton <oupton@xxxxxxxxxx>
Date: Thu Sep 16 18:15:38 2021 +0000

KVM: x86: Expose TSC offset controls to userspace

To date, VMM-directed TSC synchronization and migration has been a bit
messy. KVM has some baked-in heuristics around TSC writes to infer if
the VMM is attempting to synchronize. This is problematic, as it depends
on host userspace writing to the guest's TSC within 1 second of the last
write.

A much cleaner approach to configuring the guest's views of the TSC is to
simply migrate the TSC offset for every vCPU. Offsets are idempotent,
and thus not subject to change depending on when the VMM actually
reads/writes values from/to KVM. The VMM can then read the TSC once with
KVM_GET_CLOCK to capture a (realtime, host_tsc) pair at the instant when
the guest is paused.

So with that in place, the answer to

> How does userspace knows which vcpu to pick so
> that it gets the right offset?

is any vcpu, since the offsets are the same.

> I also wonder why we need this when userspace already has direct
> access to that information without any extra kernel support (read the
> CNTVCT view of the vcpu using the ONEREG API, subtract it from the
> host view of the counter, job done).

If the guest has access to the clock offset (between guest and host),
then in the guest:

clockval = hostclockval - clockoffset

Adding "clockoffset" to "clockval" recovers the host clock value.
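
A minimal sketch of that arithmetic, using the sign convention of the
formula above (the helper names are hypothetical, and 64-bit unsigned
counters are assumed so that wraparound cancels out):

```c
#include <stdint.h>

/* Guest view of the counter: host counter minus the offset. */
static inline uint64_t guest_clock(uint64_t host_clock, uint64_t clock_offset)
{
	return host_clock - clock_offset;
}

/* Recover the host counter from a guest reading by adding the offset back. */
static inline uint64_t host_clock_from_guest(uint64_t guest_clock_val,
					     uint64_t clock_offset)
{
	return guest_clock_val + clock_offset;
}
```

With a host counter of 1000 and an offset of 300, the guest reads 700,
and adding 300 to that reading gives back 1000.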

Is that what you mean?