Re: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC

From: Avi Kivity
Date: Sun Dec 25 2011 - 07:38:32 EST

On 12/22/2011 05:41 PM, Liu, Jinsong wrote:
> Avi Kivity wrote:
> > On 12/21/2011 12:25 PM, Jan Kiszka wrote:
> >> We must not report the TSC deadline timer feature on our own when
> >> user space provides the APIC as we have no clue about its features.
> >
> > We must not report the TSC deadline timer feature on our own, period.
> > We should just update the timer mode mask there. Don't know how this
> > slipped through review.
> >
> > I think your original idea was correct. Add a new KVM_CAP for the tsc
> > deadline timer. Userspace can add the bit to cpuid if either it
> > implements the feature in a userspace apic, or if it finds the new
> > capability and uses the kernel apic.
> Is it necessary to use KVM_CAP? If I didn't misunderstand, the KVM_CAP solution would be:
> 1. qemu get kvm tsc deadline timer capability by KVM_CAP_...;
> 2. qemu add cpuid bit
> if ((guest use qemu apic && qemu emulate tsc deadline timer) ||
> (guest use kvm apic && kvm emulate tsc deadline timer (KVM_CAP)))
> 3. qemu ioctl KVM_SET_CPUID2
> 4. kvm expose the feature to guest by saving it at vcpu->arch.cpuid_entries,


> seems it's logically redundant.

What's logically redundant?

> Jan's patch v2 is a straightforward and simple fix. In the patch
> if (apic) { ... }
> means apic (and then its sub-logic tsc deadline timer) emulated by kvm, that's enough:
> if guest uses kvm apic, it's OK to add cpuid bit and expose to guest;
> if guest doesn't use kvm apic, it will not touch cpuid bit;

It breaks live migration: if you start a guest on a TSC-deadline capable
host kernel, and migrate it to a TSC-deadline incapable host kernel, you
end up with a broken guest.
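
As a minimal sketch of the userspace-side decision discussed above, and of why
it matters for migration: the struct, field and function names here are
hypothetical, made up for illustration, and are not taken from KVM or qemu.
The "kernel has the capability" flag stands in for whatever the proposed new
KVM_CAP query would return.

```c
#include <stdbool.h>

/* Hypothetical description of one host; not a real KVM or qemu structure. */
struct host_caps {
    /* Assumed: result of querying the proposed new KVM_CAP on this host. */
    bool kernel_has_tsc_deadline_cap;
    /* Assumed: whether the userspace APIC model implements the timer. */
    bool userspace_apic_has_tsc_deadline;
};

/* Userspace decides whether to set the TSC-deadline CPUID bit. */
static bool want_tsc_deadline_bit(const struct host_caps *h,
                                  bool use_kernel_apic)
{
    if (use_kernel_apic)
        return h->kernel_has_tsc_deadline_cap;
    return h->userspace_apic_has_tsc_deadline;
}

/*
 * A migration is safe only if the destination can honour a bit the guest
 * has already seen. If the bit was never exposed, any destination is fine.
 */
static bool migration_is_safe(bool guest_saw_bit,
                              const struct host_caps *dst,
                              bool use_kernel_apic)
{
    return !guest_saw_bit || want_tsc_deadline_bit(dst, use_kernel_apic);
}
```

With the bit under userspace control, management software can run a check
like migration_is_safe() before moving the guest, or leave the bit clear
from the start if older hosts are in the migration pool.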

More broadly, kvm never exposes features transparently to the guest; it
always passes them to userspace first, so userspace controls the ABI
exposed to the guest. This prevents the following scenario:

- a guest is started on some hardware, which doesn't support some cpuid
feature (say AVX for example)
- the guest or one of its applications is broken wrt AVX, but because
the feature is not exposed, it works correctly
- the host hardware is upgraded to one which supports AVX
- the guest is now broken

(the downside is that guests run slower than they would with automatic
feature exposure, but that's better than breaking them)
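
The rule can be sketched roughly as follows. The bit positions and the
function name are hypothetical and do not match the real CPUID layout; the
point is only that the guest-visible set is what userspace requested, limited
by what the host supports, and never grows by itself.

```c
#include <stdint.h>

/* Hypothetical feature bits for illustration only; not the CPUID layout. */
#define FEAT_AVX          (1u << 0)
#define FEAT_TSC_DEADLINE (1u << 1)

/*
 * The guest sees only features that userspace asked for AND that the host
 * can provide. A host upgrade alone does not change the result.
 */
static uint32_t guest_features(uint32_t host_supported,
                               uint32_t userspace_requested)
{
    return host_supported & userspace_requested;
}
```

In the AVX scenario above, userspace_requested stays the same across the
hardware upgrade, so guest_features() returns the same value before and
after, and the guest keeps working.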

error compiling committee.c: too many arguments to function
