Re: [PATCH 2/3] perf/x86/mbm: Fix mbm counting for RMID reuse

From: Peter Zijlstra
Date: Wed May 11 2016 - 03:24:01 EST


On Tue, May 10, 2016 at 04:39:39PM +0000, Luck, Tony wrote:
> >> (3) Also we may not want to count at every sched_in and sched_out
> >> because the MSR reads involve quite a bit of overhead.
> >
> > Every single other PMU driver just does this; why are you special?
>
> They just have to read a register. We have to write the IA32_QM_EVTSEL MSR
> and then read from the IA32_QM_CTR MSR ... if we are tracking both local
> and total bandwidth, we have to repeat the wrmsr/rdmsr again to get the
> other counter. That seems like it will noticeably affect the system if we do it
> on every sched_in and sched_out.
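
To make the cost concrete, here is a rough userspace sketch of the access
pattern described above: one select write plus one counter read per event,
so four MSR accesses when both local and total bandwidth are tracked. The
sim_wrmsr()/sim_rdmsr() helpers, the access counter and the fake counter
value are assumptions made up for illustration; the MSR addresses and event
IDs are the architectural ones as far as I know, but this is not the
driver's code.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_IA32_QM_EVTSEL 0x0c8d
#define MSR_IA32_QM_CTR    0x0c8e
#define EVT_MBM_TOTAL      0x02
#define EVT_MBM_LOCAL      0x03

/* Simulated MSR state (hypothetical, stands in for the hardware). */
static uint64_t sim_evtsel;
static unsigned int sim_msr_accesses;

static void sim_wrmsr(uint32_t msr, uint64_t val)
{
	sim_msr_accesses++;
	if (msr == MSR_IA32_QM_EVTSEL)
		sim_evtsel = val;
}

static uint64_t sim_rdmsr(uint32_t msr)
{
	sim_msr_accesses++;
	if (msr != MSR_IA32_QM_CTR)
		return 0;
	/* Fake counter: rmid * 100 + event id, so reads are distinguishable. */
	return (sim_evtsel >> 32) * 100 + (sim_evtsel & 0xff);
}

/* One counter read costs a select write followed by a counter read. */
static uint64_t read_mbm(uint32_t rmid, uint32_t evt)
{
	assert(evt <= 0xff);
	sim_wrmsr(MSR_IA32_QM_EVTSEL, ((uint64_t)rmid << 32) | evt);
	return sim_rdmsr(MSR_IA32_QM_CTR);
}

/* Tracking both local and total doubles the cost: four MSR accesses. */
static void read_mbm_both(uint32_t rmid, uint64_t *total, uint64_t *local)
{
	*total = read_mbm(rmid, EVT_MBM_TOTAL);
	*local = read_mbm(rmid, EVT_MBM_LOCAL);
}
```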

Right; but Vikas didn't say that, did he ;-) He just mentioned MSR reads.

Also; I don't think you actually have to do it on every sched event,
only when the event<->rmid association changes. As long as the
event<->rmid association doesn't change, you can forgo updates.
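
A minimal sketch of that idea, assuming the hardware counter for an RMID
keeps accumulating on its own while the RMID stays assigned: the counter is
only read when the event gets a different RMID (or when someone asks for the
value), never on plain scheduling. The struct layout, hw_ctr[] array and
function names are hypothetical and simulated in userspace.

```c
#include <assert.h>
#include <stdint.h>

#define NR_RMIDS  4
#define RMID_NONE (-1)

/* Simulated per-RMID hardware counters and a count of "expensive" reads. */
static uint64_t hw_ctr[NR_RMIDS];
static unsigned int hw_reads;

static uint64_t read_ctr(int rmid)
{
	assert(rmid >= 0 && rmid < NR_RMIDS);
	hw_reads++;
	return hw_ctr[rmid];
}

struct mbm_event {
	int rmid;       /* current association, or RMID_NONE */
	uint64_t base;  /* counter value when this rmid was attached */
	uint64_t total; /* bytes accumulated under earlier rmids */
};

/* Only an association change touches the counters. */
static void event_set_rmid(struct mbm_event *ev, int rmid)
{
	if (ev->rmid == rmid)
		return; /* unchanged: forgo the update */

	if (ev->rmid != RMID_NONE)
		ev->total += read_ctr(ev->rmid) - ev->base;

	ev->rmid = rmid;
	if (rmid != RMID_NONE)
		ev->base = read_ctr(rmid);
}

/* Reading the event folds in whatever the current rmid has counted. */
static uint64_t event_read(struct mbm_event *ev)
{
	uint64_t val = ev->total;

	if (ev->rmid != RMID_NONE)
		val += read_ctr(ev->rmid) - ev->base;
	return val;
}
```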

> But the more we make this complicated, the more I think that we should not
> go through the pain of stealing/recycling RMIDs and just limit the number of
> things that can be simultaneously monitored. If someone tries to monitor one
> more thing when all the RMIDs are in use, we should just error out with
> -ERUNOUTOFRMIDSTRYAGAINLATER (maybe -EAGAIN???)
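
The "just error out" alternative could look roughly like the sketch below:
a fixed pool with no stealing or recycling, where allocation fails with
-EAGAIN once every RMID is taken. The pool size, the rmid_busy[] array, the
function names and the choice to keep RMID 0 reserved are all assumptions
for illustration.

```c
#include <assert.h>
#include <errno.h>

#define NR_RMIDS 4

/* Hypothetical fixed pool; slot 0 is treated as reserved in this sketch. */
static unsigned char rmid_busy[NR_RMIDS];

/* Hand out a free RMID, or fail with -EAGAIN when none is left. */
static int rmid_alloc(void)
{
	int i;

	for (i = 1; i < NR_RMIDS; i++) {
		if (!rmid_busy[i]) {
			rmid_busy[i] = 1;
			return i;
		}
	}
	return -EAGAIN; /* caller may retry once something is freed */
}

/* Return an RMID to the pool; nothing is stolen from live users. */
static void rmid_free(int rmid)
{
	assert(rmid > 0 && rmid < NR_RMIDS && rmid_busy[rmid]);
	rmid_busy[rmid] = 0;
}
```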

Possibly; but I would like to minimize churn at this point to let the
Google guys get their patches in shape. They seem to have definite ideas
about that as well :-)