Re: [PATCH v2 2/5] mm: Extends local cpu counter vm_diff_nodestat from s8 to s16

From: Michal Hocko
Date: Tue Dec 19 2017 - 11:20:47 EST

On Tue 19-12-17 10:05:48, Christopher Lameter wrote:
> On Tue, 19 Dec 2017, Kemi Wang wrote:
> > The type s8 used for vm_diff_nodestat[] as local cpu counters limits
> > how infrequently the global counters can be updated, especially for
> > monotonically increasing counters like the NUMA counters, as the number
> > of cpus/nodes grows. This patch extends the type of vm_diff_nodestat
> > from s8 to s16 without any functional change.
> Well the reason for s8 was to keep the data structures small so that they
> fit in the higher level cpu caches. The larger these structures become the
> more cachelines are used by the counters and the larger the performance
> influence on the code that should not be impacted by the overhead.

I am not sure I understand. We usually do not access more than one
counter in a single code path (well, PGALLOC and the NUMA counters are
more of an exception), so it is rarely an advantage to have the whole
array in the same cache line. Besides that, this is allocated by the
percpu allocator, which aligns to the type size rather than to cache
lines AFAICS.
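
To make the alignment point concrete, here is a rough userspace sketch
(not kernel code). The 64-byte line size, the counter count and the helper
names are all assumptions picked for illustration. It only counts how many
cache lines an array of counters touches when its start is aligned to the
element size rather than to a cache line.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only, not taken from the kernel. */
#define CACHE_LINE_BYTES 64u	/* common line size, e.g. on x86-64 */
#define NR_NODE_COUNTERS 32u	/* hypothetical number of per-node counters */

/* Cache lines touched by an object of `size` bytes starting at `offset`. */
static size_t lines_spanned(size_t offset, size_t size)
{
	if (size == 0)
		return 0;

	size_t first = offset / CACHE_LINE_BYTES;
	size_t last = (offset + size - 1) / CACHE_LINE_BYTES;

	return last - first + 1;
}

/* Cache lines touched by the whole diff array for a given element size. */
static size_t diff_array_lines(size_t elem_size, size_t offset)
{
	return lines_spanned(offset, NR_NODE_COUNTERS * elem_size);
}
```

With these assumed numbers, diff_array_lines(sizeof(int8_t), 40) is 2, so
even the s8 array straddles two lines once its start is only type-aligned,
while a single naturally aligned s16 counter always stays within one line.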

Maybe this was all different back when the code was added, but arguing
about cache lines seems to be a bit problematic here. Maybe you have
some specific workloads which can prove me wrong?

Michal Hocko