Re: [PATCH 1/2] perf/x86/intel: enable CPU ref_cycles for GP counter

From: Stephane Eranian
Date: Tue May 23 2017 - 02:42:19 EST


On Mon, May 22, 2017 at 11:39 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Mon, May 22, 2017 at 12:28:26PM -0700, Stephane Eranian wrote:
>> On Mon, May 22, 2017 at 12:23 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> > On Mon, May 22, 2017 at 04:55:47PM +0000, Liang, Kan wrote:
>> >>
>> >>
>> >> > On Fri, May 19, 2017 at 10:06:21AM -0700, kan.liang@xxxxxxxxx wrote:
>> >> > > diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index
>> >> > > 580b60f..e8b2326 100644
>> >> > > --- a/arch/x86/events/core.c
>> >> > > +++ b/arch/x86/events/core.c
>> >> > > @@ -101,6 +101,10 @@ u64 x86_perf_event_update(struct perf_event
>> >> > *event)
>> >> > > delta = (new_raw_count << shift) - (prev_raw_count << shift);
>> >> > > delta >>= shift;
>> >> > >
>> >> > > + /* Correct the count number if applying ref_cycles replacement */
>> >> > > + if (!is_sampling_event(event) &&
>> >> > > + (hwc->flags & PERF_X86_EVENT_REF_CYCLES_REP))
>> >> > > + delta *= x86_pmu.ref_cycles_factor;
>> >> >
>> >> > That condition seems wrong, why only correct for !sampling events?
>> >> >
>> >>
>> >> For sampling, it's either fixed freq mode or fixed period mode.
>> >> - In the fixed freq mode, we should do nothing, because the adaptive
>> >> frequency algorithm will handle it.
>> >> - In the fixed period mode, we have already adjusted the period in
>> >> ref_cycles_rep().
>> >> Therefore, we should only handle !sampling events here.
>> >
>> > How so? For sampling events the actual event count should also be
>> > accurate.
>>
>> Yes, it must be. Because you can reconstruct the total number of
>> occurrences of the event by adding
>> all the periods recorded in each sample. So the period in each sample
>> must reflect user event and not
>> kernel event.
>
> Well, that, but you can equally use read() or the mmap()'ed rdpmc stuff
> on a sampling event. The fact that is also generates samples does not
> mean it should not also function as a non-sampling event.

Right, I did not even consider the rdpmc, but yeah, you will get a count that
is not relevant to the user visible event. Unless you fake it using the time
scaling fields there but that's ugly.