One thing that's missing is documentation of the guest/host ABI. ItI will add such document. It will includes:
will be a requirement for inclusion, but it will also be a great help
for review, so please provide it ASAP.
1) Data struct perf_event definition. Guest os and host os have to share the same
data structure as host kernel will sync data (counte/overflows and others if needed)
changes back to guest os.
2) A list of all hypercalls
3) Guest need have NMI handler which checks all guest events.
Disabling the watchdog is unfortunate. Why is it necessary?perf always uses NMI, so we disable the nmi_watchdog when a perf_event is
set up in case they might have impact.
I use perf_event->shadow at both host and guest side.+This is suspicious. Passing virtual addresses to the host is always
+static int kvm_pmu_enable(struct perf_event *event)
+{
+ int ret;
+ unsigned long addr = __pa(event);
+
+ if (kvm_add_event(event))
+ return -1;
+
+ ret = kvm_hypercall3(KVM_PERF_OP, KVM_PERF_OP_ENABLE,
+ addr, (unsigned long) event->shadow);
problematic. Or event->shadow only used as an opaque cookie?
1) At host side, perf_event->shadow points to an area including the page
mapping information about guest perf_event. Host kernel uses it to sync data
changes back to guest;
2) At guest side, perf_event->shadow save the virtual address of host
perf_event at host side. At guest side,
kvm_hypercall3(KVM_PERF_OP, KVM_PERF_OP_OPEN, ...) return the virtual address.
Guest os shouldn't use it but using it to pass it back to host kernel in following
hypercalls. It might be a security issue for host kernel. Originally, I planed guest
os not to use perf_event->shadow. Host kernel maintains a per-vcpu event-related
list whose key is addr of guest perf_event. But considering the vcpu thread migration
on logical cpu, such list needs lock and implementation becomes a little complicated.
I will double-check the list method as the security issue is there.
Need to detect the kvm pmu via its own cpuid bit.Ok. I will add a feature, KVM_FEATURE_PARA_PERF, something like
bit KVM_FEATURE_CLOCKSOURCE.
Ok, so event->shadow is never dereferenced. In that case better notHost kernel also uses it.
make it a pointer at all, keep it unsigned long.
Right. But it shouldn't be a problem as vcpu thread stops when vmexit to+This copy will not be atomic on 32-bit hosts.
+static void kvm_sync_event_to_guest(struct perf_event *event, int overflows)
+{
+ struct hw_perf_event *hwc =&event->hw;
+ struct kvmperf_event *kvmevent;
+ int offset, len, data_len, copied, page_offset;
+ struct page *event_page;
+ void *shared_kaddr;
+
+ kvmevent = event->shadow;
+ offset = kvmevent->event_offset;
+
+ /* Copy perf_event->count firstly */
+ offset += offsetof(struct perf_event, count);
+ if (offset< PAGE_SIZE) {
+ event_page = kvmevent->event_page;
+ page_offset = offset;
+ } else {
+ event_page = kvmevent->event_page2;
+ page_offset = offset - PAGE_SIZE;
+ }
+ shared_kaddr = kmap_atomic(event_page, KM_USER0);
+ *((atomic64_t *)(shared_kaddr + page_offset)) = event->count;
host to process events. In addition, only current cpu in guest accesses
perf_events linked to current cpu.
Above codes just run at host side. Is it possible that host kernel is 32 bit and+}Might truncate on 32-bit hosts. PAGE_MASK is 32-bit while addr is 64 bit.
+
+static struct perf_event *
+kvm_pv_perf_op_open(struct kvm_vcpu *vcpu, gpa_t addr)
+{
+ int ret;
+ struct perf_event *event;
+ struct perf_event *host_event = NULL;
+ struct kvmperf_event *shadow = NULL;
+
+ event = kzalloc(sizeof(*event), GFP_KERNEL);
+ if (!event)
+ goto out;
+
+ shadow = kzalloc(sizeof(*shadow), GFP_KERNEL);
+ if (!shadow)
+ goto out;
+
+ shadow->event_page = gfn_to_page(vcpu->kvm, addr>> PAGE_SHIFT);
+ shadow->event_offset = addr& ~PAGE_MASK;
guest kernel is 64bits?
Here calling kvm_read_guest is to get a copy of guest perf_event as it has+I assume this is to check existence?
+ ret = kvm_read_guest(vcpu->kvm, addr, event, sizeof(*event));
+ if (ret)
+ goto out;
perf_event.attr. Host need the attr to create host perf_event.
It doesn't work well with memoryThat's an issue. But host kernel couldn't go to sleep when processing event
hotplug. In general it's preferred to use
kvm_read_guest()/kvm_write_guest() instead of gfn_to_page() during
initialization to prevent pinning and allow for hotplug.
overflows under NMI context.
It may be better to have the guest create an id to the host, and thePerhaps the address of guest perf_event is the best id.
host can simply look up the id in a table: