Re: [PATCH-cgroup v5 2/2] cgroup: Avoid false cacheline sharing of read mostly rstat_cpu

From: Tejun Heo
Date: Fri Dec 01 2023 - 12:38:26 EST


On Thu, Nov 30, 2023 at 03:43:27PM -0500, Waiman Long wrote:
> The rstat_cpu and rstat_css_list fields of the cgroup structure are
> read-mostly variables. However, they may share a cacheline with the
> subsequent rstat_flush_next and *bstat variables, which can be updated
> frequently. That slows down cgroup_rstat_cpu(), which is called pretty
> frequently in the rstat code. Add a CACHELINE_PADDING() line between
> them to avoid false cacheline sharing.
>
> A parallel kernel build on a 2-socket x86-64 server is used as the
> benchmarking tool for measuring the lock hold time. Below is the lock
> hold time frequency distribution before and after the patch:
>
>   Run time   Before patch   After patch
>   --------   ------------   -----------
>   0-01 us       9,928,562     9,820,428
>   01-05 us        110,151        50,935
>   05-10 us            270            93
>   10-15 us            273           146
>   15-20 us            135            76
>   20-25 us              0             2
>   25-30 us              1             0
>
> It can be seen that the patch further pushes the lock hold time towards
> the lower end.
>
> Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
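
For readers following along, the layout idea described above can be
sketched in userspace C. This is only an illustration, not the actual
patch: struct demo_cgroup, DEMO_CACHELINE and demo_check_layout() are
hypothetical names, the 64-byte line size is an assumption, and C11
alignas stands in for the kernel's CACHELINE_PADDING() macro. The field
names mirror the ones in the description, but the types are placeholders.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumed cacheline size; the real value is architecture-specific. */
#define DEMO_CACHELINE 64

/* Hypothetical stand-in for the fields named in the patch description. */
struct demo_cgroup {
	/* Read-mostly: set up once, then only read on hot paths. */
	void *rstat_cpu;
	void *rstat_css_list;

	/*
	 * Frequently written. alignas() pushes this member to the start of
	 * the next cacheline, so writes here do not invalidate the line
	 * holding the read-mostly pointers above.
	 */
	alignas(DEMO_CACHELINE) struct demo_cgroup *rstat_flush_next;
	unsigned long bstat[4];
};

/* Index of the cacheline that a given byte offset falls into. */
static inline size_t demo_line_of(size_t offset)
{
	return offset / DEMO_CACHELINE;
}

/* Compile-time check: the two groups must not share a cacheline. */
_Static_assert(offsetof(struct demo_cgroup, rstat_css_list) / DEMO_CACHELINE !=
	       offsetof(struct demo_cgroup, rstat_flush_next) / DEMO_CACHELINE,
	       "read-mostly and write-hot fields share a cacheline");

/* Runtime version of the same check, for use in a test harness. */
static inline void demo_check_layout(void)
{
	assert(demo_line_of(offsetof(struct demo_cgroup, rstat_cpu)) !=
	       demo_line_of(offsetof(struct demo_cgroup, rstat_flush_next)));
	assert(demo_line_of(offsetof(struct demo_cgroup, rstat_css_list)) !=
	       demo_line_of(offsetof(struct demo_cgroup, bstat)));
}
```

The cost of this kind of separation is some padding inside the structure.
Without the alignas(), the four members would sit back to back and likely
land on one line, which is the sharing the patch description refers to.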

Applied to cgroup/for-6.8.

Thanks.

--
tejun