Almost no difference

From: Hyeonggon Yoo
Date: Sat Oct 09 2021 - 07:46:02 EST


On Sat, Oct 09, 2021 at 01:33:43AM +0100, Matthew Wilcox wrote:
> On Sat, Oct 09, 2021 at 12:19:03AM +0000, Hyeonggon Yoo wrote:
> > - Is there a reason that SLUB does not implement cache coloring?
> > it will help utilizing hardware cache. Especially in block layer,
> > they are literally *squeezing* its performance now.
>
> Have you tried turning off cache colouring in SLAB and seeing if
> performance changes? My impression is that it's useful for caches
> with low associativity (direct mapped / 2-way / 4-way), but loses
> its effectiveness for caches with higher associativity. For example,
> my laptop:
>
> L1 Data Cache: 48KB, 12-way associative, 64 byte line size
> L1 Instruction Cache: 32KB, 8-way associative, 64 byte line size
> L2 Unified Cache: 1280KB, 20-way associative, 64 byte line size
> L3 Unified Cache: 12288KB, 12-way associative, 64 byte line size
>
> I very much doubt that cache colouring is still useful for this machine.

On my machine:
L1 Data Cache: 32KB, 8-way associative, 64 byte line size
L1 Instruction Cache: 32KB, 8-way associative, 64 byte line size
L2 Unified Cache: 1MB, 16-way associative, 64 byte line size
L3 Unified Cache: 33MB, 11-way associative, 64 byte line size
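
For reference, a rough sketch of what those figures imply about set count and way size. It assumes the simple size / (ways * line size) model; the names (caches, geometry, results) are made up for illustration, and the L3 row is only approximate, since large last-level caches are often sliced and hashed.

```python
# Set count and way size for the caches listed above.
# Assumption: plain size / (ways * line_size) indexing, no slicing/hashing.
KIB = 1024
MIB = 1024 * KIB

# name -> (total size in bytes, associativity, line size in bytes)
caches = {
    "L1d": (32 * KIB, 8, 64),
    "L1i": (32 * KIB, 8, 64),
    "L2": (1 * MIB, 16, 64),
    "L3": (33 * MIB, 11, 64),
}

def geometry(size, ways, line):
    """Return (number of sets, bytes covered by one way)."""
    sets = size // (ways * line)
    way_size = size // ways
    return sets, way_size

results = {name: geometry(*g) for name, g in caches.items()}
for name, (sets, way_size) in results.items():
    print(f"{name}: {sets} sets, {way_size} bytes per way")
```

Under this model the L1 data cache has a 4 KiB way, so objects at the same offset in different page-aligned slabs compete for the same set, but 8 of them can be resident at once. That is the conflict pressure coloring is meant to spread out, and higher associativity already absorbs much of it.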


I ran hackbench with per-node coloring, with per-cpu coloring, and without
coloring.

hackbench -g 100 -l 200000
without coloring: 2196.787
with per-node coloring: 2193.607
with per-cpu coloring: 2198.076
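
To put the three numbers in proportion, here is a small sketch that expresses each coloring variant as a percentage change against the run without coloring. It uses only the figures reported above; the variable names are illustrative, and since run-to-run variance is not reported, the result says nothing about noise.

```python
# Relative change of each coloring variant against the no-coloring run.
# Inputs are the hackbench times quoted above; nothing else is assumed.
baseline = 2196.787  # without coloring
runs = {
    "per-node coloring": 2193.607,
    "per-cpu coloring": 2198.076,
}

# Negative means faster than the baseline, positive means slower.
deltas = {name: (t - baseline) / baseline * 100 for name, t in runs.items()}
for name, pct in deltas.items():
    print(f"{name}: {pct:+.3f}% vs. no coloring")
```

Both variants land within about 0.15% of the baseline, in opposite directions.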

It seems there is almost no difference.
How much difference did you see on low-associativity processors?

Hmm... I'll look for a related paper.