Re: Use CPUID to communicate with the hypervisor.

From: Jeremy Fitzhardinge
Date: Fri Sep 26 2008 - 21:02:19 EST


Alok Kataria wrote:
> From: Alok N Kataria <akataria@xxxxxxxxxx>
>
> This patch proposes to use a cpuid interface to detect if we are running on a
> hypervisor.
> The presence of a hypervisor is indicated by bit 31 of CPUID#1_ECX, which is
> defined to be the "hypervisor present bit". In a VM the bit is 1; otherwise it
> is 0. This bit is not yet officially documented by either Intel or AMD, but
> they plan to do so soon, and in the meantime they have promised to keep it
> reserved for virtualization.
>
> Also, Intel & AMD have reserved the cpuid levels 0x40000000 - 0x400000FF for
> software use. Hypervisors can use these levels to provide an interface to pass
> information from the hypervisor to the guest. This is similar to how we extract
> information about a physical cpu by using cpuid.
> XEN/KVM are already using the info leaf to get the hypervisor signature.
>
> VMware hardware version 7 defines some of these cpuid levels; a brief
> description follows. Other hypervisors can implement these levels too, so
> that Linux has a standard way of communicating with any hypervisor.
>
> Leaf 0x40000000, Hypervisor CPUID information
> # EAX: The maximum input value for hypervisor CPUID info (0x40000010).
> # EBX, ECX, EDX: Hypervisor vendor ID signature. E.g. "VMwareVMware"
>
> Leaf 0x40000010, Timing information.
> # EAX: (Virtual) TSC frequency in kHz.
> # EBX: (Virtual) Bus (local apic timer) frequency in kHz.
> # ECX, EDX: RESERVED
>
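
For reference, the detection described above can be sketched as two small
helpers that work on already-captured CPUID register values. This is only a
sketch under assumptions: the names cpuid_regs, hv_present and hv_signature
are illustrative and not taken from the patch, and the byte order assumes a
little-endian (x86) host.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Register values returned by one CPUID invocation (illustrative type). */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

#define CPUID_1_ECX_HYPERVISOR (1u << 31)   /* "hypervisor present" bit */
#define HV_CPUID_INFO_LEAF     0x40000000u  /* hypervisor CPUID information */

/* True if the result of CPUID leaf 1 has the hypervisor-present bit set. */
static bool hv_present(const struct cpuid_regs *leaf1)
{
    return (leaf1->ecx & CPUID_1_ECX_HYPERVISOR) != 0;
}

/*
 * Copy the 12-byte vendor signature out of EBX, ECX, EDX (in that order)
 * of leaf 0x40000000. 'sig' must have room for 13 bytes; the result is
 * NUL-terminated.
 */
static void hv_signature(const struct cpuid_regs *info, char sig[13])
{
    memcpy(sig + 0, &info->ebx, 4);
    memcpy(sig + 4, &info->ecx, 4);
    memcpy(sig + 8, &info->edx, 4);
    sig[12] = '\0';
}
```

A caller would run CPUID leaf 1, check hv_present(), and only then query
leaf 0x40000000 and decode the signature.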

I'm sympathetic to the idea, but it seems a bit under-defined.

Why are you leaving a gap between 0x40000000 and 0x40000010? Future
extension? Avoiding existing hypervisor-specific leaves?

I think there's a move towards doing a scan for a signature, such as
checking every 16 leaves after 0x40000000 for "a while" looking for
interesting signatures, so that a hypervisor can support multiple ABIs
at once. Given this, it would be better to define a "Generic Hypervisor
ABI" signature, and put all the related leaves together.
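
Roughly, the scan could look like the sketch below. Everything named here is
an assumption for illustration: the "GenericHVABI" signature is hypothetical,
the stride and the upper bound (the end of the 0x40000000 - 0x400000FF range)
are just one reading of "every 16 leaves ... for a while", and fake_cpuid() is
a stand-in for the real instruction so the logic can be tried without a
hypervisor. It assumes a little-endian host.

```c
#include <stdint.h>
#include <string.h>

struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* Callback that returns the CPUID result for one leaf. */
typedef struct cpuid_regs (*cpuid_fn)(uint32_t leaf);

#define HV_CPUID_BASE   0x40000000u
#define HV_CPUID_LIMIT  0x40000100u  /* one past the software-reserved range */
#define HV_SCAN_STRIDE  0x10u        /* check every 16 leaves */

/*
 * Return the base leaf whose EBX/ECX/EDX spell the 12-byte signature 'sig',
 * or 0 if no such leaf is found in the scanned range.
 */
static uint32_t hv_find_abi(cpuid_fn cpuid, const char *sig)
{
    uint32_t base;

    for (base = HV_CPUID_BASE; base < HV_CPUID_LIMIT; base += HV_SCAN_STRIDE) {
        struct cpuid_regs r = cpuid(base);
        char found[12];

        memcpy(found + 0, &r.ebx, 4);
        memcpy(found + 4, &r.ecx, 4);
        memcpy(found + 8, &r.edx, 4);
        if (memcmp(found, sig, 12) == 0)
            return base;
    }
    return 0;
}

/* Pack a max-leaf value and a 12-byte signature into a register set. */
static struct cpuid_regs sig_regs(uint32_t max_leaf, const char *sig)
{
    struct cpuid_regs r = { .eax = max_leaf };

    memcpy(&r.ebx, sig + 0, 4);
    memcpy(&r.ecx, sig + 4, 4);
    memcpy(&r.edx, sig + 8, 4);
    return r;
}

/*
 * Simulated hypervisor exposing two ABIs at once: a vendor-specific one at
 * the base and the hypothetical generic one 0x20 leaves later.
 */
static struct cpuid_regs fake_cpuid(uint32_t leaf)
{
    struct cpuid_regs none = { 0, 0, 0, 0 };

    if (leaf == 0x40000000u)
        return sig_regs(0x40000010u, "VMwareVMware");
    if (leaf == 0x40000020u)
        return sig_regs(0x40000021u, "GenericHVABI");
    return none;
}
```

With this, hv_find_abi(fake_cpuid, "GenericHVABI") finds the generic block
regardless of what vendor-specific leaves sit in front of it, and all leaves
belonging to that ABI can be addressed relative to the returned base.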

And then, rather than having a simple "maximum leaf", it would be better
to have cap bits for each specific feature. For example, how would the
"RESERVED" registers in "Timing information" ever get used? How would
you know that they were no longer reserved, but now meaningful?
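
To make the difference concrete, here is a sketch contrasting the two checks.
The feature bits and the idea of a separate "features" word are hypothetical,
invented only for this example; nothing like them is defined in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits, e.g. reported in EAX of a features leaf. */
#define HV_FEAT_TSC_KHZ     (1u << 0)  /* timing leaf EAX is valid */
#define HV_FEAT_BUS_KHZ     (1u << 1)  /* timing leaf EBX is valid */
#define HV_FEAT_TIMING_ECX  (1u << 2)  /* formerly reserved ECX now has meaning */

/*
 * "Maximum leaf" style check: tells the guest only that the leaf exists,
 * not which of its registers carry meaningful data.
 */
static bool hv_leaf_exists(uint32_t base, uint32_t max_leaf, uint32_t leaf)
{
    return leaf >= base && leaf <= max_leaf;
}

/* Capability-bit style check: each field is advertised individually. */
static bool hv_has_feature(uint32_t features, uint32_t bit)
{
    return (features & bit) != 0;
}
```

With a maximum leaf of 0x40000010, hv_leaf_exists() is true for the timing
leaf both before and after ECX gains a meaning, so the guest cannot tell the
two cases apart; with cap bits it would simply test HV_FEAT_TIMING_ECX.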

That said, I'm a bit worried about the whole idea of having these kinds
of timing parameters. It does assume that they're constant for the
whole life of the VM. What if they change due to power management or
migration?

J