On Fri, Oct 12, 2018 at 08:20:17PM +0800, Wei Wang wrote:
> Guest changing MSR_CORE_PERF_GLOBAL_CTRL causes KVM to reprogram pmc
> counters, which re-allocates a host perf event. This process is

Yea gawds, that's horrific. Why does it do that? We have
PERF_EVENT_IOC_PERIOD which does that much better. Still, what you're
proposing is faster still -- if it is correct.
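
For reference, a minimal userspace sketch of what PERF_EVENT_IOC_PERIOD
offers: it changes the sample period of an already-open event in place,
with no close and re-open of the event. The helper name
set_sample_period() is hypothetical, and this shows only the userspace
view of the interface; how KVM would reach the same code from inside the
kernel is not shown and is not something this sketch claims to know.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>

/*
 * Hypothetical helper: change the sample period of an existing perf
 * event fd in place. PERF_EVENT_IOC_PERIOD takes a pointer to a 64-bit
 * value holding the new period. Returns 0 on success, or -1 with errno
 * set on failure (for example EBADF when perf_fd is not a valid fd).
 */
static int set_sample_period(int perf_fd, uint64_t new_period)
{
	return ioctl(perf_fd, PERF_EVENT_IOC_PERIOD, &new_period);
}
```

The fd would come from perf_event_open(2); the existing event and its
counter assignment stay as they are, only the period changes.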

> This patch implements a fast path to handle the guest change of
> MSR_CORE_PERF_GLOBAL_CTRL for the guest pmi case. Guest change of the
> msr will be applied to the hardware when entering the guest, and the
> old perf event will continue to be used. The guest setting of the
> perf counter for the next irq period in pmi will also be written
> directly to the hardware counter when entering the guest.

What you're failing to explain here is why exactly it is ok to write to
the MSR directly without updating the perf_event state. I didn't take
the time to go through all that, but it certainly needs documenting.
This is something that can certainly get broken by accident.

Is there any documentation/comment that explains how this virtual PMU
crud works in general?

> +u64 intel_pmu_disable_guest_counters(void)
> +{
> +	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +	u64 mask = cpuc->intel_ctrl_host_mask;
> +
> +	cpuc->intel_ctrl_host_mask = ULONG_MAX;
> +
> +	return mask;
> +}
> +EXPORT_SYMBOL_GPL(intel_pmu_disable_guest_counters);

OK, this then gets the MSR written when we re-enter the guest, after the
WRMSR trap, right?

> +		/*
> +		 * The guest PMI handler is asking for enabling the perf
> +		 * counters. This happens at the end of the guest PMI handler,
> +		 * so clear in_pmi.
> +		 */
> +		intel_pmu_enable_guest_counters(pmu->counter_mask);
> +		pmu->in_pmi = false;
> +	}
> +}

The v4 PMI handler does not in fact do that I think.

> @@ -237,9 +267,23 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	default:
>  		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
>  		    (pmc = get_fixed_pmc(pmu, msr))) {
> -			if (!msr_info->host_initiated)
> -				data = (s64)(s32)data;
> -			pmc->counter += data - pmc_read_counter(pmc);
> +			if (pmu->in_pmi) {
> +				/*
> +				 * Since we are not re-allocating a perf event
> +				 * to reconfigure the sampling time when the
> +				 * guest pmu is in PMI, just set the value to
> +				 * the hardware perf counter. Counting will
> +				 * continue after the guest enables the
> +				 * counter bit in MSR_CORE_PERF_GLOBAL_CTRL.
> +				 */
> +				struct hw_perf_event *hwc =
> +						&pmc->perf_event->hw;
> +				wrmsrl(hwc->event_base, data);

But all this relies on the event calling the overflow handler; how does
this not corrupt the event state such that x86_perf_event_set_period()
might decide that the generated PMI is a spurious one?
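
To make the question above concrete, here is a simplified userspace
model of the period bookkeeping. It is a sketch under stated
assumptions, not kernel code: every model_* name is hypothetical, the
counter is assumed to be 48 bits wide, clamping to a maximum period is
left out, and the update/set_period logic is only an approximation of
what x86_perf_event_update() and x86_perf_event_set_period() do. It
shows a direct counter write that leaves the software state untouched,
followed by the overflow.

```c
#include <stdint.h>

#define CNT_BITS 48                       /* assumed counter width */
#define CNT_MASK ((1ULL << CNT_BITS) - 1)

/* Hypothetical stand-in for the per-event state the host keeps. */
struct model_event {
	uint64_t hw_counter;    /* the counter MSR */
	uint64_t prev_count;    /* last raw value the host accounted for */
	int64_t  period_left;   /* events left until the period expires */
	int64_t  sample_period; /* period the host event was created with */
	int64_t  count;         /* accumulated event count */
};

/* Fold the counter movement since prev_count into the software state. */
static void model_update(struct model_event *ev)
{
	const int shift = 64 - CNT_BITS;
	uint64_t now = ev->hw_counter;
	/* Sign-extend the CNT_BITS-wide difference (arithmetic shift assumed). */
	int64_t delta = (int64_t)((now << shift) - (ev->prev_count << shift)) >> shift;

	ev->prev_count = now;
	ev->count += delta;
	ev->period_left -= delta;
}

/* Re-arm the counter; returns 1 if the period expired, 0 otherwise. */
static int model_set_period(struct model_event *ev)
{
	int64_t left = ev->period_left;
	int expired = 0;

	if (left <= -ev->sample_period) {
		left = ev->sample_period;
		ev->period_left = left;
		expired = 1;
	}
	if (left <= 0) {
		left += ev->sample_period;
		ev->period_left = left;
		expired = 1;
	}
	if (left < 2)
		left = 2;

	ev->prev_count = (uint64_t)-left;
	ev->hw_counter = (uint64_t)-left & CNT_MASK;
	return expired;
}

/* What the quoted hunk does: write the counter, touch nothing else. */
static void model_direct_write(struct model_event *ev, int64_t value)
{
	ev->hw_counter = (uint64_t)value & CNT_MASK;
}

/* Let the modelled hardware count n events. */
static void model_run(struct model_event *ev, uint64_t n)
{
	ev->hw_counter = (ev->hw_counter + n) & CNT_MASK;
}

/*
 * Host arms host_period, the guest writes -guest_period straight into
 * the counter, and the counter runs until it wraps to 0. Returns what
 * set_period reports at the PMI; *counted receives the number of events
 * the host believes it observed.
 */
static int model_pmi_after_direct_write(int64_t host_period,
					int64_t guest_period,
					int64_t *counted)
{
	struct model_event ev = {
		.period_left = host_period,
		.sample_period = host_period,
	};

	model_set_period(&ev);
	model_direct_write(&ev, -guest_period);
	model_run(&ev, (uint64_t)guest_period);
	model_update(&ev);
	*counted = ev.count;
	return model_set_period(&ev);
}
```

In this model, a host period of 1000000 followed by a guest write of
-100 still reports the period as expired at the overflow, but the host
accounts 1000000 events where only 100 occurred. That is a property of
the model only. Whether the real code behaves the same way, including
when the event is stopped and restarted between the write and the PMI,
is the part that needs explaining and documenting.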