Re: kvmclock - the host part.

From: Avi Kivity
Date: Wed Nov 07 2007 - 09:00:54 EST


Glauber de Oliveira Costa wrote:
void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu)
{
+ vcpu->time_needs_update = 1;
Why here and not in __vcpu_run()? It isn't timer irq related.
Because my plan was exactly, updating it at each timer interrupt.

I think kvm_inject_pending_timer_irqs() is called every __vcpu_run(), so your cunning plan has been foiled.

Did you mean each guest interrupt of host interrupt?

There's a trade off between
updating every run (hopefully more precision, but more overhead), versus
updating at timer irqs, or other events.

What would you prefer?

I think that we should update it every time a heavyweight exit has been taken. That takes care of the tradeoff quite nicely -- heavyweight exits are already dog slow.



+ /* Updating the tsc count is the first thing we do */
+ kvm_get_msr(vcpu, MSR_IA32_TIME_STAMP_COUNTER, &vcpu->hv_clock.last_tsc);
+ ktime_get_ts(&ts);
+ vcpu->hv_clock.now_ns = ts.tv_nsec + (NSEC_PER_SEC * (u64)ts.tv_sec);
+ vcpu->hv_clock.wc_sec = get_seconds();
+ vcpu->hv_clock.version++;
+
+ clock_addr = vcpu->clock_addr;
+ memcpy(clock_addr, &vcpu->hv_clock, sizeof(vcpu->hv_clock));
+ mark_page_dirty(vcpu->kvm, vcpu->clock_gfn);
Just use kvm_write_guest().
Too slow. Updating guest time, even only in timer interrupts, was a too
frequent operation, and the kmap / kunmap (atomic) at every iteration
deemed the whole thing
unusable.

kvm_write_guest() will eventually be a copy_to_user(), so you need not fear the overhead.


+ ret = 0;
switch (nr) {
+ case KVM_HCALL_REGISTER_CLOCK: {
+ struct kvm_vcpu *dst_vcpu;
+
+ if (!((a1 < KVM_MAX_VCPUS) && (vcpu->kvm->vcpus[a1]))) {
+ ret = -KVM_EINVAL;
+ break;
+ }
+
+ dst_vcpu = vcpu->kvm->vcpus[a1];
What if !dst_vcpu? What about locking?

Suggest simply using vcpu. Every guest cpu can register its own
Earlier version had a check for !dst_vcpu, you are absolutely right.

Locking was not a problem in practice, because these operations are done
serialized, by the same cpu.

Think evil guest that cares not for the well-being of the host.

This hypercall is called by cpu_up, which, at least in the beginning,
it's called by cpu0. And that's why each vcpu cannot register its own.
(And why we don't need locking).

Well, theorectically each vcpu do can register its own clocksource, it
will just be a little bit more complicated, we have to fire out an IPI,
and have the other cpu to catch it, and call the hypercall.


Can it not be done via the processor startup sequence? Then there's no need for ipis and locking.

I imagine a normal guest initializes the apic in the same way.

But I honestly don't like it.
Usually, the cpu leaves start_secondary with a clock already registered,
so the kernel relies on it.

+ dst_vcpu->clock_page = gfn_to_page(vcpu->kvm, a0 >> PAGE_SHIFT);
Shift right? Why?
a0 is not a gfn, but a physical address.

What if the guest wants to place it in address 5GB? That's unlikely for Linux and Windows, but let's do it right anyway.

--
error compiling committee.c: too many arguments to function

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/