Re: [PATCH V4 04/23] perf/x86/intel: Support adaptive PEBSv4

From: Liang, Kan
Date: Wed Mar 27 2019 - 10:25:52 EST




On 3/26/2019 6:24 PM, Andi Kleen wrote:
>> +	for (at = base; at < top; at += cpuc->pebs_record_size) {
>> +		u64 pebs_status;
>> +
>> +		pebs_status = get_pebs_status(at) & cpuc->pebs_enabled;
>> +		pebs_status &= mask;
>> +
>> +		for_each_set_bit(bit, (unsigned long *)&pebs_status, size)
>> +			counts[bit]++;
>> +	}
>
> On Icelake pebs_status is always reliable, so I don't think we need
> the two pass walking.


We need to call perf_event_overflow() for the last record of each event.
With a single pass it is hard to tell which record is the last one for a
given event.

Also, I'm not sure how much we would save with one-pass walking. The
optimization only benefits large PEBS, and even for large PEBS the total
number of records should not be huge.
I will evaluate the performance impact of one-pass walking. If it shows a
measurable improvement, I will submit a separate patch later.

For now, I think we can keep the mature two-pass walking method.

Thanks,
Kan

> -Andi

>> +
>> +	for (bit = 0; bit < size; bit++) {
>> +		if (counts[bit] == 0)
>> +			continue;
>> +
>> +		event = cpuc->events[bit];
>> +		if (WARN_ON_ONCE(!event))
>> +			continue;
>> +
>> +		if (WARN_ON_ONCE(!event->attr.precise_ip))
>> +			continue;
>> +
>> +		__intel_pmu_pebs_event(event, iregs, base,
>> +				       top, bit, counts[bit],
>> +				       setup_pebs_adaptive_sample_data);
>> +	}
>> +}