Re: [PATCH 04/32] x86/intel_rdt: Add L3 cache capacity bitmask management

From: Shivappa Vikas
Date: Tue Jul 26 2016 - 13:09:39 EST

On Sat, 23 Jul 2016, Marcelo Tosatti wrote:

On Fri, Jul 22, 2016 at 02:43:23PM -0700, Luck, Tony wrote:
On Fri, Jul 22, 2016 at 04:12:04AM -0300, Marcelo Tosatti wrote:
How does this patchset handle the following condition:

6) Create reservations in such a way that the sum is larger than
total amount of cache, and CPU pinning (example from Karen Noel):

VM-1 on socket-1 with 80% of reservation.
VM-2 on socket-2 with 80% of reservation.
VM-1 pinned to socket-1.
VM-2 pinned to socket-2.

That's legal, but perhaps we need a description of
overlapping cache reservations.

Hardware tells you how finely you can divide the cache (and this
information is shown in /sys/fs/resctrl/info/l3/max_cbm_len to save
you from digging in CPUID leaves). E.g. on Broadwell the value is
20, so you can control cache allocations in 5% slices.
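
As a back-of-the-envelope sketch (mine, not from the patchset), the
granularity can be derived directly from that file; this assumes the
filesystem is mounted at /sys/fs/resctrl:

	/* Minimal sketch: read the bitmask length exposed by the
	 * filesystem and derive the allocation granularity. */
	#include <stdio.h>

	int main(void)
	{
		unsigned int max_cbm_len;
		FILE *f = fopen("/sys/fs/resctrl/info/l3/max_cbm_len", "r");

		if (!f || fscanf(f, "%u", &max_cbm_len) != 1) {
			perror("max_cbm_len");
			return 1;
		}
		fclose(f);

		/* On Broadwell this prints: 20 bits -> 5.0% of L3 per bit */
		printf("%u bits -> %.1f%% of L3 per bit\n",
		       max_cbm_len, 100.0 / max_cbm_len);
		return 0;
	}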

A bitmask defines which slices you can use (and h/w has the restriction
that you must have contiguous '1' bits in any mask). So you can pick
your 80% using 0x0ffff, 0x1fffe, 0x3fffc, 0x7fff8 or 0xffff0.
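
A small sanity-check sketch (mine, not from the patchset) of the
contiguity rule and those five possible 80% placements:

	#include <stdio.h>

	/* The h/w requires the capacity bitmask to be a single
	 * contiguous run of '1' bits. */
	static int cbm_is_contiguous(unsigned long cbm)
	{
		if (!cbm)
			return 0;
		/* Strip trailing zeros; a contiguous run then
		 * looks like 2^n - 1. */
		cbm >>= __builtin_ctzl(cbm);
		return (cbm & (cbm + 1)) == 0;
	}

	int main(void)
	{
		/* The five ways to place 16 of 20 bits (80%). */
		unsigned long masks[] = { 0x0ffff, 0x1fffe, 0x3fffc,
					  0x7fff8, 0xffff0 };

		for (int i = 0; i < 5; i++)
			printf("0x%05lx contiguous=%d bits=%d\n",
			       masks[i], cbm_is_contiguous(masks[i]),
			       __builtin_popcountl(masks[i]));
		return 0;
	}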

There is no requirement that masks be exclusive of each other. So
you might pick the two extremes: 0x0ffff and 0xffff0 for your two
VM's in this example. Each would be allowed to allocate up to 80%,
but with a big overlap in the middle. Each has 20% exclusive, but
there is a 60% range in the middle that they would compete for.
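
Working out that overlap explicitly, assuming the 20-bit Broadwell mask
where each bit covers 5%:

	#include <stdio.h>

	int main(void)
	{
		unsigned long a = 0x0ffff, b = 0xffff0; /* the two 80% masks */
		unsigned long shared = a & b;           /* 0x0fff0 */

		/* shared: 12 bits (60%), exclusive: 4 bits (20%) each */
		printf("shared:      %2d bits (%d%%)\n",
		       __builtin_popcountl(shared),
		       __builtin_popcountl(shared) * 5);
		printf("exclusive A: %2d bits (%d%%)\n",
		       __builtin_popcountl(a & ~b),
		       __builtin_popcountl(a & ~b) * 5);
		printf("exclusive B: %2d bits (%d%%)\n",
		       __builtin_popcountl(b & ~a),
		       __builtin_popcountl(b & ~a) * 5);
		return 0;
	}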

These are different sockets, so there is no competing/sharing of the L3
cache here: the question is whether the interface lets the user specify
that 80%/80% reservation without complaining. Because the VMs are
pinned, they will never actually share the same L3 cache.

(haven't finished reading the patchset to be certain).

This series adds the per-socket support (see 23/32), which will be
folded into the earlier patches to make the series easier to read, as
Thomas suggested. The first 12 patches are the same as the old ones
with the cgroup interface.

With the cgroup interface we were stuck on the atomicity issue for
per-socket support, because there was no way to define an interface
that guarantees atomicity to the user when he wants an allocation of
masks across different sockets.

(Basically he could end up getting a mask on one socket but not on
another because he ran out of closids. The new interface does it all
at once, requiring the user to specify the masks for all sockets - see
the interface in the later patches, and the sketch below.) You could
still handle the case you describe with the cgroup interface by
co-mounting cpusets and using global masks, but you would end up
wasting closids.
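
For illustration only - the file name and the "L3:<socket>=<mask>;..."
syntax below are my assumptions based on the description, not taken
from the patches - an all-sockets allocation could then be requested in
a single write along these lines:

	#include <stdio.h>

	int main(void)
	{
		/* Hypothetical path/syntax: the group's masks for every
		 * socket go in one write, so either all are granted or
		 * the write fails (no partial closid allocation). */
		FILE *f = fopen("/sys/fs/resctrl/vm-1/schemata", "w");

		if (!f) {
			perror("schemata");
			return 1;
		}
		/* 80% on socket 0 and 80% on socket 1, atomically. */
		fprintf(f, "L3:0=0ffff;1=ffff0\n");
		return fclose(f) ? 1 : 0;
	}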

Thanks,
Vikas