Re: [PATCH 4/4] kvm: x86: export TSC offset to user-space
From: Luiz Capitulino
Date: Fri Sep 02 2016 - 21:29:36 EST
On Fri, 2 Sep 2016 20:49:37 -0300
Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
> On Fri, Sep 02, 2016 at 09:43:01AM -0400, Stefan Hajnoczi wrote:
> > On Wed, Aug 31, 2016 at 01:05:45PM -0400, Luiz Capitulino wrote:
> > > We need to retrieve a VM's TSC offset in order to use
> > > the host's TSC to merge host and guest traces. This is
> > > explained in detail in this thread:
> > >
> > > [Qemu-devel] [RFC] host and guest kernel trace merging
> > > https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg00887.html
> > >
> > > Today, the only way to retrieve a VM's TSC offset is
> > > by using the kvm_write_tsc_offset tracepoint. This has
> > > a few problems. First, the tracepoint is only emitted
> > > when the VM boots, which requires a reboot to get it if
> > > the VM is already running. Second, tracepoints are not
> > > supposed to be ABIs in case they need to be consumed by
> > > user-space tools.
> > >
> > > This commit exports a VM's TSC offset to user-space via
> > > debugfs. A new file called "tsc-offset" is created in
> > > the VM's debugfs directory. For example:
> > >
> > > /sys/kernel/debug/kvm/51696-10/tsc-offset
> > >
> > > This file contains one TSC offset per line, for each
> > > vCPU. For example:
> > >
> > > vcpu0: 18446742405270834952
> > > vcpu1: 18446742405270834952
> > > vcpu2: 18446742405270834952
> > > vcpu3: 18446742405270834952
> > >
> > > There are some important observations about this
> > > solution:
> > >
> > > - While all vCPUs' TSC offsets should be equal for the
> > > cases we care about (i.e. stable TSC and no write to
> > > the TSC MSR), I chose to follow the spec and export
> > > each vCPU's TSC offset (might also be helpful for
> > > debugging)
> > >
> > > - The TSC offset is only useful after the VM has booted
> > >
> > > - We'll probably need to export the TSC multiplier too.
> > > However, I've been using only the TSC offset for now.
> > > So, let's get this merged first and do the TSC multiplier
> > > as a second step
> >
> > Can TSC offset changes occur at runtime?
> >
> > One example is vcpu hotplug where the tracing tool would need to fetch
> > the new vcpu's TSC offset after tracing has already started.
> >
> > Another example is if QEMU or the guest changes the TSC offset while
> > running. If the tracing tool doesn't notice this then trace events will have
> > incorrect timestamps.
> >
> > Stefan
>
> Yes they can, and the interface should mention that "the user is
> responsible for handling races of execution" (IMO).
>
> So the workflow is:
>
> 1) User boots VM and knows the state of the VM.
> 2) User runs trace-cmd on the host.
>
> Is there a need to automate gathering of traces? (that is to know the
> state of reboots and so forth). I don't see one. In that case, the above
> workflow is functional.
>
> Can you add such comments to the interface, Luiz (that the value
> read is potentially stale)?
Sure, no problem.
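
For readers following the thread, here is a minimal user-space sketch of how a tracing tool might consume the proposed file. It assumes the "vcpuN: <offset>" format shown in the patch description, that each value is the offset printed as an unsigned 64-bit integer, and that no TSC scaling is in use, so guest_tsc = host_tsc + offset (mod 2^64). The function names are hypothetical and are not part of the patch or of trace-cmd. Since the value read can be stale, the sketch also compares a snapshot taken before tracing with one taken after.

```python
import re

U64 = 1 << 64  # TSC values and offsets wrap at 64 bits


def parse_tsc_offsets(text):
    """Parse 'vcpuN: <offset>' lines into a {vcpu_id: offset} dict.

    Assumes the format shown in the patch description; lines that
    do not match are ignored.
    """
    offsets = {}
    for line in text.splitlines():
        m = re.fullmatch(r"\s*vcpu(\d+):\s*(\d+)\s*", line)
        if m:
            offsets[int(m.group(1))] = int(m.group(2))
    return offsets


def as_signed(offset):
    """Interpret an unsigned 64-bit offset as a signed value."""
    return offset - U64 if offset >= (1 << 63) else offset


def guest_to_host_tsc(guest_tsc, offset):
    """Map a guest TSC reading onto the host TSC timeline.

    Assumes guest_tsc = host_tsc + offset (mod 2^64), i.e. no TSC
    multiplier is involved.
    """
    return (guest_tsc - offset) % U64


def offsets_changed(before, after):
    """Return True if any vCPU offset differs or a vCPU was added/removed."""
    return before != after


# Sample contents copied from the patch description.
SAMPLE = """\
vcpu0: 18446742405270834952
vcpu1: 18446742405270834952
vcpu2: 18446742405270834952
vcpu3: 18446742405270834952
"""

if __name__ == "__main__":
    before = parse_tsc_offsets(SAMPLE)
    # ... tracing would run here; re-read the file afterwards ...
    after = parse_tsc_offsets(SAMPLE)
    if offsets_changed(before, after):
        print("TSC offsets changed during tracing; timestamps may be wrong")
    for vcpu, off in sorted(before.items()):
        print(f"vcpu{vcpu}: signed offset {as_signed(off)}")
```

In a real tool the text would come from reading the debugfs file (for example /sys/kernel/debug/kvm/51696-10/tsc-offset) instead of SAMPLE. Comparing snapshots only detects a change that persists until the second read; it cannot tell when the change happened, so the user still has to know the state of the VM, as discussed above.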