Re: [PATCH 13/32] Documentation, x86: Documentation for Intel resource allocation user interface

From: Luck, Tony
Date: Wed Jul 27 2016 - 12:57:46 EST


On Wed, Jul 27, 2016 at 11:20:31AM -0500, Nilay Vaish wrote:
> And over here you have switched to using CLOS ID and you do not
> mention Cache ID at all.
> As I said above, I think Cache ID and CLOS ID are the same thing. If
> that is the case, I think Cache ID should be completely replaced with
> CLOS ID.

Thanks for the input. We need to clarify things here.

cache id = unique number identifying a cache in the system. In the current
state of the patch we only support L3 (a.k.a. LLC) CAT, so the cache id is
pretty much the socket id.

CLOS ID = number we program into the PQR_ASSOC MSR to define the resources
available to the currently running process. This number indexes into the
arrays of bitmasks used to constrain the process. Currently just the L3_CBM
MSRs ... but when more resources are added, we use the same CLOS ID to index
all of them.

So if we have a schema file that says:

L3:0=00fff,1=ff000

it means a bunch of things:

1) This is a two socket system (since we have two L3 caches to control)

2) Processes assigned to this rdtgroup will be allowed to use different
amounts of cache when they run on cpus in each of the two sockets (the
"low" 60% on cache id 0 (socket 0) and the "high" 40% on cache id 1
(socket 1)).

We can't tell from this which CLOS IDs the kernel decided to allocate
to implement this policy. If we just had the default rdtgroup and this
group as the only groups available, then it is likely that the kernel
will pick CLOS ID 1 on both sockets, and then program the L3_CBM[1]
MSR on socket 0 with the value 0x00fff, and the L3_CBM[1] MSR on socket 1
with the value 0xff000.
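
To make the reading of that schema line concrete, here is a rough Python
sketch (not code from the patch) that decodes it. The helper names
parse_schema_line and cache_share are made up for illustration, and the
20-bit mask width is an assumption inferred from the five-hex-digit masks
in the example.

```python
def parse_schema_line(line):
    """Split 'L3:0=00fff,1=ff000' into a resource name and per-cache masks."""
    resource, _, body = line.strip().partition(":")
    masks = {}
    for entry in body.split(","):
        cache_id, _, mask_hex = entry.partition("=")
        masks[int(cache_id)] = int(mask_hex, 16)
    return resource, masks


def cache_share(mask, cbm_len=20):
    """Fraction of the cache enabled by a mask; cbm_len=20 is an assumption."""
    return bin(mask).count("1") / cbm_len


resource, masks = parse_schema_line("L3:0=00fff,1=ff000")
for cache_id, mask in sorted(masks.items()):
    print(f"{resource} cache id {cache_id}: share {cache_share(mask):.0%}")
```

With the example line this reports two cache ids, with 12 of 20 bits set
on cache id 0 and 8 of 20 bits set on cache id 1.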

If the default rdtgroup still has all cache available:

L3:0=fffff,1=fffff

and we add another rdtgroup and give it:

L3:0=fffff,1=ff000

the kernel will notice that we want this new rdtgroup to use the
same masks as existing groups ... so it can use CLOS ID 0 on socket
0 (same as default) and CLOS ID 1 on socket 1 (same as the first
example I gave above).

We do this sharing because there are a limited number of CLOS ID
values (limited by the size of the array of L3_CBM MSRs).
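
The sharing described above could be modelled roughly like this. It is
only a sketch of the idea, not the allocation code in the patch: the
ClosAllocator class, the assign helper and the limit of 4 CLOS IDs per
cache are all assumptions chosen for illustration.

```python
class ClosAllocator:
    """Hypothetical model of one cache's L3_CBM array, indexed by CLOS ID."""

    def __init__(self, num_closids, full_mask=0xfffff):
        self.num_closids = num_closids
        # CLOS ID 0 is the default group and starts with all cache available.
        self.cbm = [full_mask]

    def closid_for(self, mask):
        # Reuse a CLOS ID whose mask already matches, to save scarce IDs.
        if mask in self.cbm:
            return self.cbm.index(mask)
        if len(self.cbm) >= self.num_closids:
            raise RuntimeError("out of CLOS IDs on this cache")
        self.cbm.append(mask)
        return len(self.cbm) - 1


# One allocator per cache id (socket); 4 CLOS IDs each is an assumed limit.
sockets = {0: ClosAllocator(4), 1: ClosAllocator(4)}


def assign(schema):
    """Map each cache id in a parsed schema to the CLOS ID it would use."""
    return {cid: sockets[cid].closid_for(mask) for cid, mask in schema.items()}


first = assign({0: 0x00fff, 1: 0xff000})   # L3:0=00fff,1=ff000
second = assign({0: 0xfffff, 1: 0xff000})  # L3:0=fffff,1=ff000
print(first, second)
```

Here the first group gets CLOS ID 1 on both sockets, and the second group
reuses CLOS ID 0 on socket 0 and CLOS ID 1 on socket 1, so no new L3_CBM
entries are consumed for it.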

Hope that is clearer (and we will update the Docs in the next version
of the patch).

-Tony