Re: [PATCH] mm/sparse: Optimize section number calculations using bit shifts
From: Mike Rapoport
Date: Tue Jun 16 2026 - 02:32:22 EST
Hi,
On Tue, Jun 16, 2026 at 10:59:42AM +0800, Zhen Ni wrote:
> Add SECTIONS_PER_ROOT_SHIFT = ilog2(SECTIONS_PER_ROOT) with correctness
> guaranteed by BUILD_BUG_ON in sparse_init(). Convert SECTION_NR_TO_ROOT
> to use right shift instead of division for better performance. Add
> SECTION_NR_IN_ROOT() macro to improve code readability.
>
> This improves code efficiency in hot paths where __nr_to_section() is
> frequently called, such as sparse_init() and memory section management
> operations.
>
> Performance verification in sparse_init() on ARM (8GB RAM, 4 NUMA nodes):
>
> sparse_init()
> |
> +----> memblocks_present()
> |
> +----> section initialization (sparse_init_nid loop)
>
> Time measurement points:
>
> [T1] sparse_init start
> |
> v
> [T2] memblocks_present() complete
> |
> v
> [T3] sparse_init_nid() loop complete / sparse_init end
>
> Measurement values:
> memblocks_present_cycles = T2 - T1
> section_initialization_cycles = T3 - T2
> total_cycles = T3 - T1
>
> Before (division):
> [ 0.000000] sparse_init: total 7538 cycles
> [ 0.000000] memblocks_present: 4232 cycles
> [ 0.000000] section initialization: 3261 cycles
>
> After (bit shift):
> [ 0.000000] sparse_init: total 5641 cycles
> [ 0.000000] memblocks_present: 3562 cycles
> [ 0.000000] section initialization: 2057 cycles
>
> Performance improvement:
> Total: (7538-5641)/7538 = 25.2% faster
> memblocks_present: (4232-3562)/4232 = 15.8% faster
> section initialization: (3261-2057)/3261 = 36.9% faster
This is a nice improvement, but it's not the hot path. I believe you can
derive improvement to __nr_to_section() from these measurements.
> Signed-off-by: Zhen Ni <zhen.ni@xxxxxxxxxxxx>
> ---
> include/linux/mmzone.h | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 9adb2ad21da5..5daf471f6823 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -2035,11 +2035,14 @@ struct mem_section {
>
> #ifdef CONFIG_SPARSEMEM_EXTREME
> #define SECTIONS_PER_ROOT (PAGE_SIZE / sizeof (struct mem_section))
> +#define SECTIONS_PER_ROOT_SHIFT ilog2(SECTIONS_PER_ROOT)
> #else
> #define SECTIONS_PER_ROOT 1
> +#define SECTIONS_PER_ROOT_SHIFT 0
> #endif
>
> -#define SECTION_NR_TO_ROOT(sec) ((sec) / SECTIONS_PER_ROOT)
> +#define SECTION_NR_TO_ROOT(sec) ((sec) >> SECTIONS_PER_ROOT_SHIFT)
> +#define SECTION_NR_IN_ROOT(sec) ((sec) & SECTION_ROOT_MASK)
> #define NR_SECTION_ROOTS DIV_ROUND_UP(NR_MEM_SECTIONS, SECTIONS_PER_ROOT)
> #define SECTION_ROOT_MASK (SECTIONS_PER_ROOT - 1)
>
> @@ -2065,7 +2068,7 @@ static inline struct mem_section *__nr_to_section(unsigned long nr)
> if (!mem_section || !mem_section[root])
> return NULL;
> #endif
> - return &mem_section[root][nr & SECTION_ROOT_MASK];
> + return &mem_section[root][SECTION_NR_IN_ROOT(nr)];
The explicit masking is clearer IMO.
> }
> extern size_t mem_section_usage_size(void);
Hmm, I don't see BUILD_BUG_ON() you mention in the changelog.
> --
> 2.20.1
>
--
Sincerely yours,
Mike.