Re: [PATCH 2/3] perf/x86/mbm: Fix mbm counting for RMID reuse

From: Peter Zijlstra
Date: Tue May 10 2016 - 08:15:49 EST


On Fri, May 06, 2016 at 04:44:14PM -0700, Vikas Shivappa wrote:
> This patch tries to fix the issue where multiple perf instances try to
> monitor the same PID.

> MBM cannot count directly in the usual perf way of continuously adding
> the diff between the current h/w counter and the prev count to
> event->count because of some h/w dependencies:

And yet the patch appears to do exactly that; *confused*.

> (1) the mbm h/w counters overflow.

As do most other counters.. so what's your point? You also have the
software timer firing at an interval shorter than the overflow period..
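
Which is the standard mitigation for overflow; a minimal sketch of that
polling, assuming the 1s MBM_CTR_OVERFLOW_TIME from this series
(mbm_update_all_rmids() is a made-up name for the per-package RMID
sweep):

static enum hrtimer_restart mbm_hrtimer_handle(struct hrtimer *timer)
{
        /*
         * Sweep all active RMIDs before the 24-bit hardware counter
         * can wrap, folding each delta into the owning event.
         */
        mbm_update_all_rmids();         /* hypothetical helper */

        hrtimer_forward_now(timer, ms_to_ktime(MBM_CTR_OVERFLOW_TIME));
        return HRTIMER_RESTART;
}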

> (2) There are a limited number of h/w RMIDs, hence we recycle them,
> which means an event may count from different RMIDs over its lifetime.

This fails to explain why this is a problem.

> (3) Also we may not want to count at every sched_in and sched_out
> because the MSR reads involve quite a bit of overhead.

Every single other PMU driver just does this; why are you special?

You list three reasons why you think you cannot do the regular thing,
but completely fail to explain how any of them is actually a problem.
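
For reference, the regular thing is the read/cmpxchg/accumulate loop
that x86_perf_event_update() does; roughly the sketch below, assuming
the series' 24-bit MBM_CNTR_WIDTH and the driver's __rmid_read() MSR
helper (QM_CTR error-bit handling omitted):

static u64 mbm_event_update(struct perf_event *event)
{
        struct hw_perf_event *hwc = &event->hw;
        int shift = 64 - MBM_CNTR_WIDTH;
        u64 prev_raw_count, new_raw_count;
        s64 delta;

again:
        prev_raw_count = local64_read(&hwc->prev_count);
        new_raw_count = __rmid_read(hwc->cqm_rmid);

        /* Publish the new raw value; retry if someone beat us to it. */
        if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
                            new_raw_count) != prev_raw_count)
                goto again;

        /*
         * Shift up and back down to sign-extend, so a counter that
         * wrapped since the last read still yields the right delta.
         */
        delta = (new_raw_count << shift) - (prev_raw_count << shift);
        delta >>= shift;

        local64_add(delta, &event->count);

        return new_raw_count;
}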

> However we try to do something similar to the usual perf way in this
> patch and mainly handle (1) and (3).

> update_sample takes care of the overflow in the hardware counters and
> provides an abstraction by returning the total bytes counted as if there
> were no overflow. We use this abstraction to count as below:
>
> init:
> event->prev_count = update_sample(rmid) //returns current total_bytes
>
> count: // MBM right now uses count instead of read
> cur_count = update_sample(rmid)
> event->count += cur_count - event->prev_count
> event->prev_count = cur_count

So where does cqm_prev_count come from and why do you need it? What's
wrong with event->hw.prev_count?

In fact, I cannot seem to find any event->hw.prev_count usage in this or
the next patch, so can we simply use that and not add pointless new
members?
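
FWIW, the changelog's init/count scheme maps straight onto the existing
field; a sketch, with update_sample() as described above (returning the
running total bytes for an rmid):

static void mbm_init_count(struct perf_event *event)
{
        /* init: prev_count = update_sample(rmid), first delta is 0 */
        local64_set(&event->hw.prev_count,
                    update_sample(event->hw.cqm_rmid));
}

static void mbm_event_count(struct perf_event *event)
{
        u64 cur_count = update_sample(event->hw.cqm_rmid);
        /* Swap in the new total, accumulate the difference. */
        u64 prev = local64_xchg(&event->hw.prev_count, cur_count);

        local64_add(cur_count - prev, &event->count);
}

No new members needed.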