Re: [RFC,PATCH] VMWARE faults on accessing disabled counters

From: Jiri Olsa
Date: Wed Aug 31 2016 - 09:19:30 EST


On Wed, Aug 31, 2016 at 03:11:04PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 31, 2016 at 02:03:58PM +0200, Jiri Olsa wrote:
> > hi,
> > when booting under VMWARE we get the following dmesg lines:
> >
> > [ 0.051567] perf_event_intel: CPUID marked event: 'cpu cycles' unavailable
> > [ 0.051567] perf_event_intel: CPUID marked event: 'instructions' unavailable
> > [ 0.051568] perf_event_intel: CPUID marked event: 'bus cycles' unavailable
> > [ 0.051568] perf_event_intel: CPUID marked event: 'cache references' unavailable
> > [ 0.051569] perf_event_intel: CPUID marked event: 'cache misses' unavailable
> > [ 0.051570] perf_event_intel: CPUID marked event: 'branch instructions' unavailable
> > [ 0.051570] perf_event_intel: CPUID marked event: 'branch misses' unavailable
> >
> > that means all the architectural events are disabled by CPUID(0xa)
> >
> > The kernel code sets intel_perfmon_event_map to prevent
> > those events from being configured through the
> > PERF_TYPE_HARDWARE pmu type. However, they can still be
> > configured via the PERF_TYPE_RAW type.
> >
> > We're getting a GP fault on VMWARE when reading the cycles PMC
> > configured through the PERF_TYPE_RAW interface:
> >
> > #4 [ffff88007c603e10] do_general_protection at ffffffff8163da9e
> > #5 [ffff88007c603e40] general_protection at ffffffff8163d3a8
> > [exception RIP: native_read_pmc+6]
> > RIP: ffffffff81058d66 RSP: ffff88007c603ef0 RFLAGS: 00010083
> > RAX: ffffffff81957ee0 RBX: 0000000000000000 RCX: 0000000040000002
> > RDX: 000000000ff8f719 RSI: ffff88007c617fa8 RDI: 0000000040000002
> > RBP: ffff88007c603ef0 R8: 00007ffde5053150 R9: 0000000000000000
> > R10: 00007ffde5052530 R11: 00007fbb22aedc70 R12: ffffffff80000001
> > R13: ffff880079b74400 R14: ffff880079b74578 R15: 0000000000000010
> > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
> > #6 [ffff88007c603ef8] x86_perf_event_update at ffffffff81029e03
> > #7 [ffff88007c603f30] x86_pmu_read at ffffffff8102a079
> > #8 [ffff88007c603f40] __perf_event_read at ffffffff811590de
> >
> > I couldn't find what real HW rdpmc does in this situation,
> > so I'm not sure if we actually want to prevent this.. the patch
> > below tries to catch this case.
>
> Typically real hardware allows you to program any old crap. The results,
> as in what the counter does, are undefined. Some actually count, some do
> not.
>
> I'm not exactly thrilled by this patch, it adds a lot of code for a
> weird case. What happens when you stuff another non-existing event in? GP
> again?

I guess if real HW does not fault on this we don't need to bother,
and can treat it as a VMWARE issue.. but I couldn't find this info
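
For reference, a rough userspace sketch of the kind of check being
discussed: given the CPUID(0xa) EAX/EBX values, decide whether a raw
config selects an architectural event that CPUID marks unavailable.
This is not the patch from the original mail. The helper name and the
table layout are made up for illustration, and it assumes that only the
event select and umask bits (config & 0xffff) matter, which ignores
cmask/inv/edge modifiers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Architectural events in CPUID(0xa):EBX bit order.
 * A set bit in EBX means the event is NOT available.
 */
struct arch_event {
	uint16_t    config;	/* (umask << 8) | event select */
	const char *name;
};

static const struct arch_event arch_events[] = {
	{ 0x003c, "cpu cycles" },		/* bit 0 */
	{ 0x00c0, "instructions" },		/* bit 1 */
	{ 0x013c, "bus cycles" },		/* bit 2 */
	{ 0x4f2e, "cache references" },		/* bit 3 */
	{ 0x412e, "cache misses" },		/* bit 4 */
	{ 0x00c4, "branch instructions" },	/* bit 5 */
	{ 0x00c5, "branch misses" },		/* bit 6 */
};

/*
 * Hypothetical helper (not a kernel function): return the name of the
 * architectural event selected by raw_config if CPUID marks it as
 * unavailable, NULL otherwise.
 */
static const char *raw_event_marked_unavailable(uint64_t raw_config,
						uint32_t cpuid_a_eax,
						uint32_t cpuid_a_ebx)
{
	/* EAX[31:24] is the number of valid bits in the EBX vector */
	uint32_t mask_len = (cpuid_a_eax >> 24) & 0xff;
	/* Assumption: compare event select + umask only */
	uint16_t sel = raw_config & 0xffff;
	size_t n = sizeof(arch_events) / sizeof(arch_events[0]);
	size_t i;

	for (i = 0; i < n && i < mask_len; i++) {
		if (arch_events[i].config == sel &&
		    (cpuid_a_ebx & (1u << i)))
			return arch_events[i].name;
	}
	return NULL;
}
```

With all seven bits set in EBX, as in the dmesg above, a raw config of
0x003c would be reported as "cpu cycles". Whether rejecting such events
is the right thing to do is exactly the open question here.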

jirka