On Tue, Apr 16, 2024 at 8:08 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
On Thu, Feb 15, 2024, Alejandro Jimenez wrote:
The goal of this RFC is to agree on a mechanism for querying the state (and
related stats) of APICv/AVIC. I clearly have an AVIC bias when approaching this
topic since that is the side that I have mostly looked at, and has the greater
number of possible inhibits, but I believe the argument applies for both
vendors' technologies.
Currently, a user or monitoring app trying to determine if APICv is actually
being used needs implementation-specific knowledge in order to look for specific
types of #VMEXIT (e.g. AVIC_INCOMPLETE_IPI/AVIC_NOACCEL), check for GALog events
by watching /proc/interrupts for AMD-Vi*-GA, etc. There are existing tracepoints
(e.g. kvm_apicv_accept_irq, kvm_avic_ga_log) that make this task easier, but
tracefs is not viable in some scenarios. Adding kvm debugfs entries has similar
downsides. Suravee has previously proposed a new IOCTL interface[0] to expose
this information, but there has not been any development in that direction.
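To illustrate the kind of implementation-specific check described above, here
is a rough sketch of the /proc/interrupts approach. It assumes the GA log
interrupt shows up under a name matching AMD-Vi*-GA, as mentioned above; the
helper name, the regex and the sample text are made up for illustration and are
not taken from any existing tool.

```python
import re

# Assumption: GA log IRQs are named like "AMD-Vi0-GA" (pattern AMD-Vi*-GA).
GA_NAME = re.compile(r"AMD-Vi\d*-GA")

def ga_log_interrupts(text):
    """Sum the per-CPU counts of every row whose description names a GA log IRQ."""
    lines = text.splitlines()
    if not lines:
        return 0
    ncpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    total = 0
    for line in lines[1:]:
        _label, sep, rest = line.partition(":")
        if not sep or not GA_NAME.search(rest):
            continue
        counts = rest.split()[:ncpus]      # first ncpus fields are the counters
        total += sum(int(c) for c in counts if c.isdigit())
    return total

# Synthetic sample in the general shape of /proc/interrupts (not real data).
SAMPLE = """\
           CPU0       CPU1
 25:          0          0   PCI-MSI 4096-edge      AMD-Vi0-Evt
 26:         12          5   PCI-MSI 4097-edge      AMD-Vi0-GA
NMI:          3          4   Non-maskable interrupts
"""

if __name__ == "__main__":
    print(ga_log_interrupts(SAMPLE))
    # On a real host one would pass open("/proc/interrupts").read() instead.
```

A nonzero and growing count only hints that GA log events are being delivered;
it says nothing about why AVIC might be inhibited, which is the gap this RFC is
about.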
Sean has mentioned a preference for using BPF to extract info from the current
tracepoints, which would require reworking existing structs to access some
desired data, but as far as I know there isn't any work done on that approach
yet.
Recently Joao mentioned another alternative: the binary stats framework that is
already supported by the kernel[1] and QEMU[2].
The hiccup with stats is that they are ABI, e.g. we can't (easily) ditch stats
once they're added, and KVM needs to maintain the exact behavior.
Stats are not ABI---why would they be? They have an established
meaning and it's not a good idea to change it, but it's not an
absolute no-no(*); and removing them is not a problem at all.
For example, if stats were ABI, there would be no need for the
introspection mechanism, you could just use a struct like ethtool
stats (which *are* ABI).
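To make the introspection point concrete, here is a minimal sketch of how a
consumer might walk a binary stats blob: it reads the header, then each
descriptor (name, size, offset), then the values, so it never hardcodes which
stats exist. The field order is meant to follow struct kvm_stats_header and
struct kvm_stats_desc from the KVM uapi; the little-endian layout, the helper
names and the two stat names are assumptions for the example (no such stats
exist today), and the blob is built in memory rather than read from a stats fd.

```python
import struct

# flags, name_size, num_desc, id_offset, desc_offset, data_offset
HEADER = struct.Struct("<IIIIII")
# flags, exponent, size, offset, bucket_size (followed by name[name_size])
DESC = struct.Struct("<IhHII")

def read_stats(blob):
    """Return {name: [values]} by following the descriptors, not a fixed struct."""
    _flags, name_size, num_desc, _id_off, desc_off, data_off = HEADER.unpack_from(blob, 0)
    stats = {}
    for i in range(num_desc):
        base = desc_off + i * (DESC.size + name_size)
        _f, _exp, size, offset, _bucket = DESC.unpack_from(blob, base)
        raw = blob[base + DESC.size: base + DESC.size + name_size]
        name = raw.split(b"\0", 1)[0].decode()
        # Each stat is `size` consecutive u64 values at data_offset + offset.
        stats[name] = list(struct.unpack_from("<%dQ" % size, blob, data_off + offset))
    return stats

def build_demo_blob():
    """Build a synthetic blob with two hypothetical stats, for illustration only."""
    name_size = 32
    names = [b"apicv_active_demo", b"exits_demo"]   # hypothetical names
    values = [1, 42]
    id_off = HEADER.size
    desc_off = id_off + name_size
    data_off = desc_off + len(names) * (DESC.size + name_size)
    blob = HEADER.pack(0, name_size, len(names), id_off, desc_off, data_off)
    blob += b"demo-vm".ljust(name_size, b"\0")
    for i, n in enumerate(names):
        blob += DESC.pack(0, 0, 1, 8 * i, 0) + n.ljust(name_size, b"\0")
    for v in values:
        blob += struct.pack("<Q", v)
    return blob

if __name__ == "__main__":
    print(read_stats(build_demo_blob()))
```

A reader written this way keeps working when a stat disappears or a new one
shows up, which is the reason removal is not a problem.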
Not everything makes a good stat but, if in doubt and it's cheap
enough to collect it, go ahead and add it. Cheap collection is the
important point, because tracepoints in a hot path can be so expensive
as to slow down the guest substantially, at least in microbenchmarks.
In this case I'm not sure _all_ inhibits make sense and I certainly
wouldn't want a bitmask,
makes sense, and perhaps another for a weirdly-configured local APIC.
Paolo
(*) you have to draw a line somewhere. New processor models may
virtualize parts of the CPU in such a way that some stats become
meaningless or just stay at zero. Should KVM not support those
features because it is not possible anymore to introspect the guest
through stats?
Tracepoints are explicitly not ABI, and so we can be much more permissive when it
comes to adding/expanding tracepoints, specifically because there are no guarantees
provided to userspace.