Re: [PATCH 05/32] x86/intel_rdt: Implement scheduling support for Intel RDT

From: David Carrillo-Cisneros
Date: Mon Jul 25 2016 - 18:48:38 EST


On Mon, Jul 25, 2016 at 11:05 AM Luck, Tony <tony.luck@xxxxxxxxx> wrote:
>
> On Mon, Jul 25, 2016 at 11:31:24AM -0500, Nilay Vaish wrote:
> > I was thinking more about this software caching of CLOSids. How
> > likely do you think these CLOSids would be found cached? I think the
> > software cache would be very infrequently accessed, so it seems you
> > are likely to miss these in all levels of cache hierarchy and more
> > likely to have to fetch these from the main memory, which itself might
> > cost ~250 cycles.
>
> We need to avoid reading the PQR_ASSOC MSR (which would cost far
> more than 250 cycles). Life is complicated here because this
> MSR contains the CLOSID in the upper half, and the RMID (owned
> by the perf code to measure cache occupancy and memory bandwidth)
> in the lower half.


On my Haswell machine, writing PQR_ASSOC_MSR takes about 380 cycles.
As Tony said, CQM/CMT writes to the same register, and it does it
twice (once to delete the old event, once to add the new one).

So, if CQM/CMT or MBM is used together with CAT, there will be 3
writes to PQR_ASSOC_MSR per context switch, and it's quite likely that
the cache line holding the software cache will still be in the CPU
cache for the last 2 of those writes.
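
To make the pattern concrete, here is a rough userspace sketch of the
kind of software cache being discussed. It is not the code from the
patch: struct pqr_state, pqr_set_rmid(), pqr_set_closid() and
fake_wrmsr() are hypothetical names, and the MSR is simulated with a
plain variable and a write counter. The point is only that each half
can be updated without ever reading the MSR back, because the other
half comes from the cached copy.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU software cache of PQR_ASSOC. */
struct pqr_state {
	uint32_t rmid;   /* lower half of the MSR, owned by perf (CQM/CMT/MBM) */
	uint32_t closid; /* upper half of the MSR, owned by CAT */
};

static struct pqr_state pqr_cache; /* would be per-CPU in a real kernel */
static uint64_t fake_msr;          /* simulated MSR contents */
static unsigned int msr_writes;    /* number of simulated MSR writes */

/* Stub: a real kernel would issue the wrmsr instruction here. */
static void fake_wrmsr(uint32_t lo, uint32_t hi)
{
	fake_msr = ((uint64_t)hi << 32) | lo;
	msr_writes++;
}

/* Update the CLOSID half; the RMID half is taken from the cache. */
static void pqr_set_closid(uint32_t closid)
{
	if (pqr_cache.closid == closid)
		return; /* nothing changed, skip the expensive write */
	pqr_cache.closid = closid;
	fake_wrmsr(pqr_cache.rmid, closid);
}

/* Update the RMID half; the CLOSID half is taken from the cache. */
static void pqr_set_rmid(uint32_t rmid)
{
	if (pqr_cache.rmid == rmid)
		return; /* nothing changed, skip the expensive write */
	pqr_cache.rmid = rmid;
	fake_wrmsr(rmid, pqr_cache.closid);
}
```

In a context switch with both features active, the sequence would be
roughly pqr_set_rmid(0), pqr_set_rmid(new_rmid), then
pqr_set_closid(new_closid). All three calls touch the same small
struct, so only the first one is likely to pay for a cache miss.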