Re: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes

From: Thomas Gleixner
Date: Thu Jan 19 2017 - 12:42:37 EST

In reply to: David Carrillo-Cisneros: "Re: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes"
Next in thread: David Carrillo-Cisneros: "Re: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes"

On Wed, 18 Jan 2017, David Carrillo-Cisneros wrote:
> On Wed, Jan 18, 2017 at 12:53 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> There are use cases where the RMID to CLOSID mapping is not that simple.
> Some of them are:
>
> 1. Fine-tuning of cache allocation. We may want to have a CLOSID for a thread
> during phases that initialize relevant data, while changing it to another during
> phases that pollute cache. Yet, we want the RMID to remain the same.

That's fine. I did not say that you need fixed RMID <-> CLOSID mappings. The
point is that monitoring across different CLOSID domains is pointless.

I have no idea how you want to do that with the proposed implementation to
switch the RMID of the thread on the fly, but that's a different story.

> A different variation is to change CLOSID to increase/decrease the size of the
> allocated cache when high/low contention is detected.
>
> 2. Contention detection. I start with:
> - T1 has RMID 1.
> - T1 changes RMID to 2.
> will expect llc_occupancy(1) to decrease while llc_occupancy(2) increases.

Of course RMID1 decreases, because it's no longer in use. Oh well.

> The rate of change will be relative to the level of cache contention present
> at the time. This all happens without changing the CLOSID.

See above.

> >
> > So when I monitor CPU4, i.e. CLOSID 1 and T1 runs on CPU4, then I do not
> > care at all about the occupancy of T1 simply because that is running on a
> > separate reservation.
>
> It is not useless for scenarios where CLOSID and RMIDs change dynamically
> See above.

Above you are talking about the same CLOSID and different RMIDS and not
about changing both.

> > Trying to make that an aggregated value in the first
> > place is completely wrong. If you want an aggregate, which is pretty much
> > useless, then user space tools can generate it easily.
>
> Not useless, see above.

It is pretty useless, because CPU4 has CLOSID1 while T1 has CLOSID4 and
making an aggregate over those two has absolutely nothing to do with your
scenario above.

If you want the aggregate value, then create it in user space and oracle
(or should I say google) out of it whatever you want, but do not impose
that on the kernel.
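
A minimal sketch of what such user-space aggregation could look like, assuming
a tool has already read one occupancy value per RMID. The function name and
the array of readings are hypothetical and not part of any existing interface:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: sum per-RMID occupancy readings (in bytes) that a
 * user-space tool collected separately. The kernel only has to report the
 * individual values; the aggregate is computed here.
 */
static uint64_t aggregate_occupancy(const uint64_t *per_rmid_bytes, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += per_rmid_bytes[i];

	return total;
}
```

The individual readings stay available to the tool, so it can still tell the
domains apart after computing the sum.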

> Having user space tools to aggregate implies wasting some of the already
> scarce RMIDs.

Oh well. Can you please explain how you want to monitor the scenario I
explained above:

CPU4 CLOSID 1
T1 CLOSID 4

So if T1 runs on CPU4 then it uses CLOSID 4 which does not at all affect
the cache occupancy of CLOSID 1. So if you use the same RMID then you
pollute either the information of CPU4 (CLOSID1) or the information of T1
(CLOSID4).

To gather any useful information for both CPU4 and T1 you need TWO
RMIDs. Everything else is voodoo and crystal ball analysis and we are not
going to support that.
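
As an illustration of the scenario above, here is a minimal user-space sketch,
not kernel code. It assumes the documented IA32_PQR_ASSOC layout (RMID in
bits 9:0, CLOSID in bits 63:32). The helper name is hypothetical, and the
RMID numbers mentioned in the note after the block are arbitrary example
values:

```c
#include <stdint.h>

/* IA32_PQR_ASSOC: RMID occupies bits 9:0, CLOSID occupies bits 63:32. */
#define PQR_RMID_MASK 0x3ffULL

/*
 * Hypothetical helper: compose the value that would be written to
 * IA32_PQR_ASSOC for a given RMID/CLOSID pair.
 */
static uint64_t pqr_assoc(uint32_t rmid, uint32_t closid)
{
	return ((uint64_t)closid << 32) | ((uint64_t)rmid & PQR_RMID_MASK);
}
```

With example RMIDs 10 and 11, the CPU4 default would be pqr_assoc(10, 1) and
T1 running on CPU4 would be pqr_assoc(11, 4). Because both fields are switched
together, occupancy charged to RMID 11 belongs to the CLOSID 4 reservation
only, and RMID 10 keeps describing CLOSID 1.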

> > The whole approach you and David have taken is to whack some desired cgroup
> > functionality and whatever into CQM without rethinking the overall
> > design. And that's fundamentally broken because it does not take cache (and
> > memory bandwidth) allocation into account.
>
> Monitoring and allocation are closely related yet independent.

Independent to some degree. Sure you can claim they are completely
independent, but lots of the resulting combinations make absolutely no
sense at all. And we really don't want to support nonsensical measurements
just because we can. The outcome of this is complexity, inaccuracy and code
which is too horrible to look at.

> I see the advantages of allowing a per-cpu RMID as you describe in the example.
>
> Yet, RMIDs and CLOSIDs should remain independent to allow use cases beyond
> one simply monitoring occupancy per allocation.

I agree there are use cases where you want to monitor across allocations,
like monitoring a task which has no CLOSID assigned and runs on different
CPUs and therefore potentially on different CLOSIDs which are assigned to
the different CPUs.

That's fine and you want a separate RMID for this.

But once you have a fixed CLOSID association then reusing and aggregating
across CLOSID domains is more than useless.

> > I seriously doubt, that the existing CQM/MBM code can be refactored in any
> > useful way. As Peter Zijlstra said before: Remove the existing cruft
> > completely and start with a completely new design from scratch.
> >
> > And this new design should start from the allocation angle and then add the
> > whole other muck on top as far as it's possible. Allocation related monitoring
> > must be the primary focus, everything else is just tinkering.
>
> Assuming that my stated need for more than one RMID per CLOSID or more
> than one CLOSID per RMID is recognized, what would be the advantage of
> starting the design of monitoring from the allocation perspective?
>
> It's quite doable to create a new version of CQM/CMT without all the
> cgroup murk.
>
> We can also create an easy way to open events to monitor CLOSIDs. Yet, I
> don't see the advantage of dissociating monitoring from perf and directly
> building in on top of allocation without the assumption of 1 CLOSID : 1
> RMID.

I did not say that you need to remove it from perf. perf is still going to
be the interface to interact with monitoring, but it needs to be done in a
way which makes sense. The current cgroup focussed proposal which is
completely oblivious of the allocation mechanism does not make any sense to
me at all.

Starting the design from the allocation POV makes a lot of sense because
that's the point where you start to make the decisions about useful and
useless monitoring choices. And limiting the choices is the best way to
limit the RMID exhaustion in the first place.

Thanks,

tglx
