RE: [PATCH v2] mm/vmstat: Defer the refresh_zone_stat_thresholds after all CPUs bringup

From: Saurabh Singh Sengar
Date: Fri Aug 23 2024 - 05:30:41 EST

> -----Original Message-----
> From: Saurabh Sengar <ssengar@xxxxxxxxxxxxxxxxxxx>
> Sent: 12 August 2024 11:44
> To: akpm@xxxxxxxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx
> Cc: Saurabh Singh Sengar <ssengar@xxxxxxxxxxxxx>; wei.liu@xxxxxxxxxx;
> srivatsa@xxxxxxxxxxxxx
> Subject: [PATCH v2] mm/vmstat: Defer the refresh_zone_stat_thresholds after
> all CPUs bringup
>
> The refresh_zone_stat_thresholds() function has two loops, which makes it
> expensive on systems with a large number of CPUs and NUMA nodes.
>
> Below is a rough estimate of the total iterations done by these loops, based
> on the number of NUMA nodes and CPUs.
>
> Total number of iterations: nCPU * 2 * Numa * mCPU
> Where:
> nCPU = total number of CPUs
> Numa = total number of NUMA nodes
> mCPU = mean number of CPUs online during bringup (e.g., 512 for 1024
>        total CPUs)
>
> For the system under test with 16 NUMA nodes and 1024 CPUs, this results in
> a substantial increase in the number of loop iterations during boot-up when
> NUMA is enabled:
>
> No NUMA = 1024*2*1*512  =  1,048,576: refresh_zone_stat_thresholds takes
> around 224 ms in total across all the CPUs in the system under test.
> 16 NUMA = 1024*2*16*512 = 16,777,216: refresh_zone_stat_thresholds takes
> around 4.5 seconds in total across all the CPUs in the system under test.
>
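
For anyone who wants to try the estimate above with other CPU and node counts, here is a minimal sketch of the arithmetic. It is a model of the formula in the description, not kernel code. The function names are made up for illustration, and the single-call figure is my own extrapolation of the same model (one pass over all CPUs for each of the two loops per node), not a number taken from the patch.

```python
# Rough iteration model from the patch description (illustrative only,
# not kernel code). Function names are hypothetical.


def per_cpu_refresh_iterations(n_cpu: int, numa_nodes: int) -> int:
    """Total iterations when the refresh runs once per CPU brought online.

    Implements nCPU * 2 * Numa * mCPU, taking mCPU = nCPU / 2 as the
    mean number of CPUs online over the whole bringup.
    """
    mean_cpu = n_cpu // 2
    return n_cpu * 2 * numa_nodes * mean_cpu


def single_refresh_iterations(n_cpu: int, numa_nodes: int) -> int:
    """Assumed cost of one refresh made after every CPU is online.

    This is an extrapolation of the same model (two loops per node,
    each walking all n_cpu CPUs once), not a figure from the patch.
    """
    return 2 * numa_nodes * n_cpu


if __name__ == "__main__":
    for nodes in (1, 16):
        per_cpu = per_cpu_refresh_iterations(1024, nodes)
        single = single_refresh_iterations(1024, nodes)
        print(f"{nodes:2d} node(s): per-CPU refresh = {per_cpu:,}, "
              f"single deferred refresh = {single:,}")
```

Under this model the per-CPU cost grows with the square of the CPU count, while a single deferred call grows linearly, which is the motivation for the deferral described below.
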
> Calling this for each CPU is expensive when there is a large number of CPUs
> along with multiple NUMA nodes. Fix this by deferring
> refresh_zone_stat_thresholds so that it is called once, after all the
> secondary CPUs are up. Also, register the DYN hooks to keep the existing
> hotplug functionality intact.
>
> Signed-off-by: Saurabh Sengar <ssengar@xxxxxxxxxxxxxxxxxxx>

CC: Mel Gorman and Christoph Lameter