Re: Cache Allocation Technology Design

From: Vikas Shivappa
Date: Wed Oct 29 2014 - 13:26:20 EST




On Fri, 24 Oct 2014, Peter Zijlstra wrote:

On Mon, Oct 20, 2014 at 05:18:55PM +0100, Matt Fleming wrote:
What is Cache Allocation Technology ( CAT )
-------------------------------------------

It's a horrible name is what it is, please consider using the old name,
that at least was clear in purpose.

Kernel implementation Overview
-------------------------------

Kernel implements a cgroup subsystem to support Cache Allocation.

Creating a CAT cgroup would create a new CLOS <-> CBM mapping. Each
cgroup would have one CBM and would just represent one cache 'subset'.

The user would be allowed to create as many directories as there are
CLOSs defined by the h/w. If the user tries to create more than the
available CLOSs, -ENOSPC is returned. Currently we support only one
level of directory, i.e. a directory can be created only under the root.
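
As a rough illustration of the CLOS limit described above, here is a
minimal userspace sketch in Python, not kernel code. The class and
method names are hypothetical, and it assumes the root group
permanently holds CLOS 0, which the proposal does not state.

```python
import errno


class ClosTable:
    """Toy model of a fixed-size CLOS <-> CBM table (hypothetical names)."""

    def __init__(self, num_clos, cbm_max):
        self.cbm = [None] * num_clos        # None marks a free CLOS slot
        self.cbm[0] = (1 << cbm_max) - 1    # assumption: CLOS 0 is the root

    def create_group(self, cbm):
        """Reserve a free CLOS for a new cgroup directory; return its id."""
        for closid, current in enumerate(self.cbm):
            if current is None:
                self.cbm[closid] = cbm
                return closid
        # Every CLOS is in use: mirror the -ENOSPC behaviour described above.
        raise OSError(errno.ENOSPC, "no free CLOS")

    def remove_group(self, closid):
        """Release the CLOS when the cgroup directory is removed."""
        if closid == 0:
            raise ValueError("the root CLOS cannot be released")
        self.cbm[closid] = None
```

With num_clos=4 under these assumptions, three create_group() calls
succeed and a fourth raises OSError with errno set to ENOSPC until one
of the groups is removed.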

NAK, cgroups must support full hierarchies, simply enforce that the
child cgroup's mask is a subset of the parent's.
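
The subset rule requested here reduces to a single bitmask test. A
minimal Python sketch follows; the function name is hypothetical and
this is only one way the check could be expressed.

```python
def is_subset(child_cbm, parent_cbm):
    """Return True when every bit set in child_cbm is also set in parent_cbm."""
    # Bits present in the child but absent from the parent violate the rule.
    return (child_cbm & ~parent_cbm) == 0
```

For example, is_subset(0x3, 0xf) holds, while is_subset(0x30, 0xf) does
not, because bits 4 and 5 lie outside the parent's mask.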

There are 2 modes supported

1. Affinitized mode : Each CAT cgroup is affinitized to a set of CPUs
specified by the 'cpus' file. The tasks in the CAT cgroup would be
constrained to run only on the CPUs in the 'cpus' file. The CPUs in
this file are used exclusively by this cgroup. Requests made by a task
using sched_setaffinity() would be filtered through the 'cpus' mask of
the task's cgroup.

NAK, we will not have yet another cgroup mucking about with task
affinities.

These tasks would get to fill the part of the LLC represented by the
cgroup's 'cbm' file. 'cpus' is a cpumask and works the same way as
the existing cpumask data structure.

2. Non Affinitized mode : Each CAT cgroup (and in turn its 'subset')
would be for a group of tasks. There is no 'cpus' file and the CPUs
that the tasks run on are not restricted by the CAT cgroup.

It appears to me this 'mode' thing is entirely superfluous and can be
constructed by voluntary operation of this and cpusets or manual
affinity calls.

Do you mean the user would just use cpusets for CPU affinity and the
CAT cgroup for cache allocation, as shown in the example below?

In other words, affinitize PID1 and PID2 to CPUs 1 and 2 and then set
the desired cache allocation as well, like below. Then we have the
desired CPU affinity and cache allocation for these PIDs.

cd /sys/fs/cgroup/cpuset

mkdir group1_specialuse
cd group1_specialuse
/bin/echo 1-2 > cpuset.cpus
/bin/echo PID1 > tasks
/bin/echo PID2 > tasks

Now come to CAT and do the cache allocation for the same tasks PID1 and PID2.

cd /sys/fs/cgroup/cat           # CAT cgroup

mkdir group1_specialuse         # keeping same name just for understanding
cd group1_specialuse
/bin/echo 0xf > cat.cbm         # set the cache bit mask
/bin/echo PID1 > tasks
/bin/echo PID2 > tasks




Assignment of CBM,CLOS and modes
---------------------------------

The root directory would have all bits set in its 'cbm' file by default.

The cbm_max file in the root defines the maximum number of bits
describing the available cache units. For example, if cbm_max is 16,
then 'cbm' cannot have more than 16 bits.

This seems redundant, if you've already stated that the root cbm is the
full set, there is no need to further provide this.

The 'cbm' file is restricted to having no more than its cbm_max least
significant bits set. Any contiguous subset of these bits may be set to
indicate the desired cache mapping. The 'cbm' masks of 2 directories
can overlap. The 'cbm' would represent the cache 'subset' of the CAT
cgroup.
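
The constraints on 'cbm' stated above (contiguous bits, all within the
cbm_max least significant bits) can be checked with a couple of bit
tricks. This is a minimal Python sketch with hypothetical function
names. Rejecting an empty mask is an assumption on my part; the
proposal does not say whether 0 is allowed.

```python
def full_mask(cbm_max):
    """Mask with the cbm_max least significant bits set (the root default)."""
    return (1 << cbm_max) - 1


def is_contiguous(cbm):
    """Return True when the set bits of cbm form one unbroken run."""
    if cbm == 0:
        return False                 # assumption: an empty mask is rejected
    lowest = cbm & -cbm              # isolate the lowest set bit
    # Adding the lowest bit carries through a single run and clears it;
    # a second, separate run survives the addition and is caught by the AND.
    return ((cbm + lowest) & cbm) == 0


def is_valid_cbm(cbm, cbm_max):
    """Contiguous, and confined to the cbm_max least significant bits."""
    return is_contiguous(cbm) and (cbm & ~full_mask(cbm_max)) == 0
```

With cbm_max = 16, the masks 0xf and 0x0ff0 pass, 0x5 fails the
contiguity test, and 0x1ffff fails because it needs 17 bits. Two valid
masks such as 0x0f and 0x3c may share bits, which matches the statement
that masks of different directories can overlap.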

This would follow from the hierarchy requirement/conditions.

Scheduling and Context Switch
------------------------------

In non-affinitized mode 'affinitized' is 0, and the 'tasks' file
indicates the tasks the cache subset is affinitized to. When the user
adds tasks to the tasks file, the tasks would get to fill the cache
subset represented by the CAT cgroup's 'cbm' file.

During a context switch the kernel implements this by writing the
corresponding CLOSid (internally maintained by the kernel) of the CAT
cgroup to the CPU's IA32_PQR_ASSOC MSR.
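
To make the context-switch step concrete, here is a minimal userspace
model in Python, not kernel code. The Cpu class and switch_to() are
hypothetical. The MSR address (0xC8F) and the placement of the CLOS id
in the upper 32 bits follow my reading of the Intel SDM. Skipping the
write when the value is unchanged is an assumed optimization that is
not part of the description above.

```python
IA32_PQR_ASSOC = 0xC8F     # MSR address, per the Intel SDM
CLOS_SHIFT = 32            # the CLOS id sits in bits 63:32


class Cpu:
    """Toy per-CPU MSR store that counts writes (hypothetical model)."""

    def __init__(self):
        self.msr = {IA32_PQR_ASSOC: 0}
        self.msr_writes = 0

    def wrmsr(self, addr, value):
        self.msr[addr] = value
        self.msr_writes += 1


def switch_to(cpu, next_closid):
    """Load the incoming task's CLOS id into the CPU's IA32_PQR_ASSOC."""
    current = cpu.msr[IA32_PQR_ASSOC]
    low = current & 0xFFFFFFFF           # preserve the lower half of the MSR
    new = (next_closid << CLOS_SHIFT) | low
    if new != current:                   # assumption: skip redundant writes
        cpu.wrmsr(IA32_PQR_ASSOC, new)
```

In this model, switching between two tasks of the same CAT cgroup
causes no MSR write, while switching to a task in a different cgroup
rewrites only the upper half of the register.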

Right.
