Re: [PATCH 7/7] x86/intel_rdt: Add CAT documentation and usage guide

From: Vikas Shivappa
Date: Wed Apr 01 2015 - 14:22:11 EST




On Tue, 31 Mar 2015, Marcelo Tosatti wrote:

On Tue, Mar 31, 2015 at 10:27:32AM -0700, Vikas Shivappa wrote:


On Thu, 26 Mar 2015, Marcelo Tosatti wrote:


I can't find any discussion about exposing the CBM interface
directly to userspace in that thread?

cpu.shares is written in ratio form, which is much more natural.
Do you see any advantage in maintaining the

(ratio -> cbm bitmasks)

translation in userspace rather than in the kernel?

What about something like:


          root cgroup
           /      \
          /        \
         /          \
  cgroupA-80    cgroupB-30


So that whatever exceeds 100% is the ratio of cache
shared at that level (cgroup A and B share 10% of cache
at that level).

But this also means the 2 groups share all of the cache?

Specifying the bits to be shared lets you specify the
exact cache area you want to share, and it also covers the case where
your total occupancy does not cover all of the cache. For example, it gets
more complex when you want to share, say, only the left quarter of the
cache: cgroupA gets the left half and cgroupB gets the left quarter. The
bitmask aligns with how the h/w is designed to share the cache, which
gives you the flexibility to define any specific overlapping areas of
the cache.
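
The left-half / left-quarter case above can be sketched with plain
bit arithmetic. This is only an illustration: the 8-bit mask length,
the choice of the high bits as "left", and the helper name
contiguous_mask are all assumptions, not part of the patch.

```python
# Hypothetical 8-way cache; the highest bits are treated as the "left" side.
CBM_LEN = 8

def contiguous_mask(start_bit, nbits):
    """Return a mask with nbits contiguous bits set, starting at start_bit."""
    return ((1 << nbits) - 1) << start_bit

# cgroupA: left half of the cache (bits 4..7).
cgroup_a = contiguous_mask(CBM_LEN // 2, CBM_LEN // 2)
# cgroupB: left quarter of the cache (bits 6..7).
cgroup_b = contiguous_mask(CBM_LEN - CBM_LEN // 4, CBM_LEN // 4)

# Ways both groups may fill, and ways neither group may fill.
shared = cgroup_a & cgroup_b
unused = ~(cgroup_a | cgroup_b) & ((1 << CBM_LEN) - 1)

print(f"A={cgroup_a:#04x} B={cgroup_b:#04x} "
      f"shared={shared:#04x} unused={unused:#04x}")
```

A single percentage per group cannot say where the overlap sits or
that the right half stays unallocated; the masks carry both facts.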

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu_and_memory-use_case.html

cpu - the cpu.shares parameter determines the share of CPU resources
available to each process in all cgroups. Setting the parameter to 250,
250, and 500 in the finance, sales, and engineering cgroups respectively
means that processes started in these groups will split the resources
with a 1:1:2 ratio. Note that when a single process is running, it
consumes as much CPU as necessary no matter which cgroup it is placed
in. The CPU limitation only comes into effect when two or more processes
compete for CPU resources.



These are defined more in terms of how many cache lines (or how many
cache ways) a group can use, and would be difficult to define in
terms of percentage. In contrast, the cpu share is a time-shared
thing and is much more granular, whereas here it is not: it is
occupancy in terms of cache lines/ways (however, this is not really
defined as a restriction, but that's the way it is now).
Also note that the granularity of the bitmasks defines the
granularity of the percentages, and in some SKUs the granularity is
2b and not 1b. So technically you won't be able to allocate a
percentage of cache even at 10% granularity in most cases
(if there are 30MB and 25 ways, as in one of the hsw SKUs), and this will
vary across SKUs, which makes it more complicated for users.
However, the user library is free to define its own interface on top of
the underlying cgroup interface, for example if you never care about
overlapping and are using it for a specific SKU. The underlying
cgroup framework is meant to be generic for all SKUs and usable for
most of the use cases.
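
To make the granularity point concrete, here is a rough sketch of the
arithmetic using the 30MB / 25-way figures quoted above. The helper
names step_percent and is_representable are made up for this example,
and the model (each mask bit maps to one equal-sized way) is an
assumption.

```python
from fractions import Fraction

CACHE_MB = 30   # figures quoted in the message above
NUM_WAYS = 25

def step_percent(num_ways, min_bits=1):
    """Smallest allocatable share of the cache, as a percentage."""
    return Fraction(100 * min_bits, num_ways)

def is_representable(percent, num_ways, min_bits=1):
    """True if percent is an exact multiple of the allocation step."""
    return Fraction(percent) % step_percent(num_ways, min_bits) == 0

way_size_mb = Fraction(CACHE_MB, NUM_WAYS)

print("MB per way:", float(way_size_mb))
print("step with 1b granularity:", step_percent(NUM_WAYS), "%")
print("step with 2b granularity:", step_percent(NUM_WAYS, 2), "%")
print("10% exact with 1b?", is_representable(10, NUM_WAYS))
print("20% exact with 2b?", is_representable(20, NUM_WAYS, 2))
```

Under these assumptions the step is 4% with 1b granularity and 8% with
2b granularity, so a request for 10% has no exact mask in either case.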

Also, at this point I see a lot of enterprise and other users
already using the cgroup interface or showing interest in it.
However, I see your point about the ease with which a user
can specify a size/percentage, which he might be used to doing for
other resources, rather than bits, where he needs to get an idea of the
size by calculating it separately. But again, note that you may not be
able to define percentages in many scenarios like the one above. And
another question is that we would need to convince the users to
adapt to the modified percentage user model (ex: like the one you
describe above, where percentage - 100 is the part that's shared).
I can review this requirement and the others I have received and get
back to see the closest that can be done, if possible.

Thanks,
Vikas

Vikas,

I see. I don't have anything against performing the translation in
userspace (I agree userspace should be able to allow ratios and specific
minimum/maximum counts). Can you please export the relevant information
in files in /sys or cgroups itself rather than requiring userspace to
parse CPUID etc? That includes the EBX register from CPUID(EAX=10H, ECX=1),
which is necessary to implement "reserved LLC" properly.
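
For reference, a small sketch of the fields userspace would otherwise
have to dig out of CPUID(EAX=10H, ECX=1), following my reading of the
Intel SDM: EAX[4:0] is the mask length minus one, EBX is the map of
ways that may be shared with other agents, and EDX[15:0] is the highest
CLOSid. The function name, dictionary keys, and the sample register
values are hypothetical; the function only decodes values obtained
elsewhere and does not execute CPUID.

```python
def decode_cat_leaf(eax, ebx, edx):
    """Decode raw CPUID(EAX=10H, ECX=1) register values (L3 CAT)."""
    cbm_len = (eax & 0x1f) + 1          # EAX[4:0] holds length - 1
    max_cbm = (1 << cbm_len) - 1        # all ways set
    return {
        "cbm_len": cbm_len,
        "max_cbm": max_cbm,
        # Ways that other agents may also use; a "reserved LLC"
        # region would have to avoid these bits.
        "shared_mask": ebx & max_cbm,
        "num_closids": (edx & 0xffff) + 1,  # EDX[15:0] holds highest CLOSid
    }

# Made-up register values, for illustration only.
info = decode_cat_leaf(eax=0x13, ebx=0xc0000, edx=0x3)
exclusive_candidates = info["max_cbm"] & ~info["shared_mask"]

print(info)
print(f"ways usable for a reserved region: {exclusive_candidates:#x}")
```

Exporting these four values through /sys or the cgroup files would
save each userspace tool from repeating this decoding.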

The current interface is unable to handle the cross CPU case, though.
It would be necessary to expose per-socket masks.



Marcelo,

The current package supports per-socket updates to masks, although the CLOSids are allocated globally, just like in CMT, and not per package.

The maximum bitmask is the root node's bitmask, which is already exposed. The number of CLOSids is not exposed, because the kernel internally optimizes their usage and exposing it could give the user a wrong picture. For example, if the number of CLOSids available is, say, 4, the kernel could actually allocate them to more than 4 cgroups, and this logic may change based on other features that may be added to the cgroup or on features available in the SKUs. However, with CAT cgroups an error is returned once the kernel runs out of CLOSids. I am still reviewing this requirement with respect to the CLOSids and will send an update soon.
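
As a rough illustration of why the raw CLOSid count can mislead, here
is a toy model in which cgroups with identical bitmasks share one
CLOSid. The sharing rule, the ClosidPool class, and the ENOSPC error
code are assumptions made for this sketch; they are not taken from the
patch.

```python
import errno

class ClosidPool:
    """Toy allocator: cgroups with the same CBM share a CLOSid."""

    def __init__(self, num_closids):
        self.num_closids = num_closids
        self.by_mask = {}  # cbm -> [closid, reference count]

    def assign(self, cbm):
        """Return a CLOSid for cbm, reusing one if the mask is known."""
        if cbm in self.by_mask:
            self.by_mask[cbm][1] += 1
            return self.by_mask[cbm][0]
        used = {entry[0] for entry in self.by_mask.values()}
        for closid in range(self.num_closids):
            if closid not in used:
                self.by_mask[cbm] = [closid, 1]
                return closid
        # No free CLOSid left: report an error to the caller.
        raise OSError(errno.ENOSPC, "out of CLOSids")

pool = ClosidPool(num_closids=2)
first = pool.assign(0xf0)
second = pool.assign(0xf0)   # same mask, same CLOSid
third = pool.assign(0x0f)    # new mask, next free CLOSid
print(first, second, third)
```

In this model three cgroups fit into two CLOSids, and a fourth cgroup
with yet another mask would get the error.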

Thanks,
Vikas