On Tue, Feb 06, 2018 at 12:58:23PM -0500, Liang, Kan wrote:
With the exception of handling 'empty' buffers, I ended up with the
below. Please try again.
There are two small errors. After fixing them, the patch works well.
Well, it still doesn't do A, two read()s without PEBS record in between.
So that needs fixing. What 3/5 does, call x86_perf_event_update() after
drain_pebs() is actively wrong after this patch.
+
+ /*
+ * Careful, not all hw sign-extends above the physical width
+ * of the counter.
+ */
+ delta = (new_raw_count << shift) - (prev_raw_count << shift);
+ delta >>= shift;
new_raw_count could be smaller than prev_raw_count.
The sign bit will be set. The delta>> could be wrong.
I think we can add a period here to prevent it.
+ delta = (period << shift) + (new_raw_count << shift) -
+ (prev_raw_count << shift);
+ delta >>= shift;
......
+ local64_add(delta + period * (count - 1), &event->count);
Right it does, but that wrecks case A again, because then we get here
with !@count.
Maybe something like:
s64 new, old;
new = ((s64)(new_raw_count << shift) >> shift);
old = ((s64)(old_raw_count << shift) >> shift);
local64_add(new - old + count * period, &event->count);
And then make intel_pmu_drain_pebs_*(), call this function even when !n.