Re: [PATCH] page_alloc.c: inline __rmqueue()

From: Anshuman Khandual
Date: Mon Oct 09 2017 - 03:37:58 EST


On 10/09/2017 11:14 AM, Aaron Lu wrote:
> __rmqueue() is called by rmqueue_bulk() and rmqueue() under zone->lock
> and that lock can be heavily contended with memory intensive applications.
>
> Since __rmqueue() is a small function, inlining it can save us some time.
> With the will-it-scale/page_fault1/process benchmark, when using nr_cpu
> processes to stress buddy:
>
> On a 2 sockets Intel-Skylake machine:
>       base      %change       head
>      77342        +6.3%      82203    will-it-scale.per_process_ops
>
> On a 4 sockets Intel-Skylake machine:
>       base      %change       head
>      75746        +4.6%      79248    will-it-scale.per_process_ops
>
> This patch adds inline to __rmqueue().
>
> Signed-off-by: Aaron Lu <aaron.lu@xxxxxxxxx>
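
For anyone following along, the pattern the changelog describes (one
small helper shared by a single-item caller and a bulk caller) can be
sketched in user space as below. This is only an illustration and not
the kernel code: toy_pool, toy_take_one(), toy_take_single() and
toy_take_bulk() are made-up names, the data structure is a plain array
rather than the buddy free lists, and locking is left out entirely.

```c
#include <stddef.h>

/* Hypothetical toy pool; it only stands in for a zone's free list. */
struct toy_pool {
	int items[8];
	size_t count;
};

/*
 * Small helper on the hot path (hypothetical stand-in for the role
 * __rmqueue() plays). "static inline" asks the compiler to expand the
 * body at each call site instead of emitting a call. In standard C
 * this is a hint that the compiler is free to ignore.
 *
 * Returns 1 and stores an item in *out on success, 0 if the pool is
 * empty.
 */
static inline int toy_take_one(struct toy_pool *pool, int *out)
{
	if (pool->count == 0)
		return 0;
	*out = pool->items[--pool->count];
	return 1;
}

/* Single-item caller (similar in role to rmqueue()). */
static int toy_take_single(struct toy_pool *pool, int *out)
{
	return toy_take_one(pool, out);
}

/*
 * Bulk caller (similar in role to rmqueue_bulk()). Takes up to "want"
 * items and returns how many were actually taken.
 */
static size_t toy_take_bulk(struct toy_pool *pool, int *out, size_t want)
{
	size_t got = 0;

	while (got < want && toy_take_one(pool, &out[got]))
		got++;
	return got;
}
```

In the bulk path the helper runs once per item, so that is where any
saved call overhead would add up. Whether a given compiler actually
inlines the helper, and how much that is worth, has to be measured, as
the numbers quoted above do for the real function.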

Ran it through kernel bench and ebizzy microbenchmarks. Results
were comparable with and without the patch. Maybe these are not
the appropriate tests for this inlining improvement. In any case,
the patch does not cause any performance degradation either.

Reviewed-by: Anshuman Khandual <khandual@xxxxxxxxxxxxxxxxxx>
Tested-by: Anshuman Khandual <khandual@xxxxxxxxxxxxxxxxxx>