RE: [PATCH 0/2] x86/intel_rdt and perf/x86: Fix lack of coordination with perf
From: Luck, Tony
Date: Tue Aug 07 2018 - 21:28:42 EST
Would it help to call the routines that read the "before" values of the
counters twice? The first call would preload the cache with anything needed
to execute the perf code path.
>> In an attempt to improve the accuracy of the above I modified it to the
>> following:
>>
>> /* create the two events as before in "enabled" state */
>> l2_hit_pmcnum = l2_hit_event->hw.event_base_rdpmc;
>> l2_miss_pmcnum = l2_miss_event->hw.event_base_rdpmc;
>> local_irq_disable();
>> /* disable hw prefetchers */
>> /* init local vars to loop through pseudo-locked mem
* may take some misses in the perf code
*/
l2_hits_before = native_read_pmc(l2_hit_pmcnum);
l2_miss_before = native_read_pmc(l2_miss_pmcnum);
/* Read counters again, hope no new misses here */
>> l2_hits_before = native_read_pmc(l2_hit_pmcnum);
>> l2_miss_before = native_read_pmc(l2_miss_pmcnum);
>> /* loop through pseudo-locked mem */
>> l2_hits_after = native_read_pmc(l2_hit_pmcnum);
>> l2_miss_after = native_read_pmc(l2_miss_pmcnum);
>> /* enable hw prefetchers */
>> local_irq_enable();
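
To illustrate the idea outside the kernel, here is a rough userspace sketch
of the "read twice" pattern. It is not the resctrl code. read_counter_fn,
measure_region(), the fake_* helpers and the counter numbers are all
hypothetical stand-ins (the callback plays the role of native_read_pmc()),
and the fake counter only assumes that the first read pulls in cold code
that costs a few misses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback standing in for native_read_pmc(). */
typedef uint64_t (*read_counter_fn)(unsigned int counter);

struct l2_sample {
	uint64_t hits;
	uint64_t misses;
};

/*
 * Read both counters twice before the workload. The first pair of reads
 * only warms whatever the read path needs; the second pair is the real
 * baseline that the "after" values are compared against.
 */
static struct l2_sample measure_region(read_counter_fn read_counter,
				       unsigned int hit_ctr,
				       unsigned int miss_ctr,
				       void (*workload)(void *), void *arg)
{
	uint64_t hits_before, miss_before, hits_after, miss_after;
	struct l2_sample delta;

	/* Warm-up reads: may take some misses, values are overwritten. */
	hits_before = read_counter(hit_ctr);
	miss_before = read_counter(miss_ctr);

	/* Baseline reads: hope no new misses here. */
	hits_before = read_counter(hit_ctr);
	miss_before = read_counter(miss_ctr);

	workload(arg);

	hits_after = read_counter(hit_ctr);
	miss_after = read_counter(miss_ctr);

	delta.hits = hits_after - hits_before;
	delta.misses = miss_after - miss_before;
	return delta;
}

/* Fake counters (assumption: the first read costs 3 cold misses). */
enum { FAKE_HIT_CTR = 0, FAKE_MISS_CTR = 1 };

static uint64_t fake_counters[2];
static int fake_read_path_is_cold = 1;

static uint64_t fake_read_counter(unsigned int counter)
{
	uint64_t value = fake_counters[counter];

	/* The cold misses land after the value has been sampled. */
	if (fake_read_path_is_cold) {
		fake_counters[FAKE_MISS_CTR] += 3;
		fake_read_path_is_cold = 0;
	}
	return value;
}

/* Fake workload: 10 hits and no misses, as for pseudo-locked memory. */
static void fake_workload(void *arg)
{
	(void)arg;
	fake_counters[FAKE_HIT_CTR] += 10;
}
```

With these fake counters, measure_region(fake_read_counter, FAKE_HIT_CTR,
FAKE_MISS_CTR, fake_workload, NULL) attributes the cold misses to the
warm-up reads rather than to the measured region. Whether the real perf
read path behaves this way is exactly the open question.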