Re: [PATCH] mm/sparse: fix BUILD_BUG_ON check for section map alignment

From: Muchun Song

Date: Tue Mar 31 2026 - 22:58:55 EST

> On Apr 1, 2026, at 04:29, David Hildenbrand (Arm) <david@xxxxxxxxxx> wrote:
>
> On 3/31/26 13:30, Muchun Song wrote:
>> The comment in mmzone.h states that the alignment requirement
>> is the minimum of PAGE_SHIFT and PFN_SECTION_SHIFT. However, the
>> pointer arithmetic (mem_map - section_nr_to_pfn()) results in
>> a byte offset scaled by sizeof(struct page). Thus, the actual
>> alignment provided by the second term is PFN_SECTION_SHIFT +
>> __ffs(sizeof(struct page)).
>>
>> Update the compile-time check and the mmzone.h comment to
>> accurately reflect this mathematically guaranteed alignment by
>> taking the minimum of PAGE_SHIFT and PFN_SECTION_SHIFT +
>> __ffs(sizeof(struct page)). This avoids the issue of the check
>> being overly restrictive on architectures like powerpc where
>> PFN_SECTION_SHIFT alone is very small (e.g., 6).
>>
>> Also, remove the exhaustive per-architecture bit-width list from the
>> comment; such details risk falling out of date over time and may
>> inadvertently be left un-updated, while the existing BUILD_BUG_ON
>> provides sufficient compile-time verification of the constraint.
>>
>> No runtime impact so far: SECTION_MAP_LAST_BIT happens to fit within
>> the smaller limit on all existing architectures.
>>
>> Fixes: def9b71ee651 ("include/linux/mmzone.h: fix explanation of lower bits in the SPARSEMEM mem_map pointer")
>> Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
>> ---
>> include/linux/mmzone.h | 24 +++++++++---------------
>> mm/sparse.c | 3 ++-
>> 2 files changed, 11 insertions(+), 16 deletions(-)
>>
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index 7bd0134c241c..584fa598ad75 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -2073,21 +2073,15 @@ static inline struct mem_section *__nr_to_section(unsigned long nr)
>> extern size_t mem_section_usage_size(void);
>>
>> /*
>> - * We use the lower bits of the mem_map pointer to store
>> - * a little bit of information. The pointer is calculated
>> - * as mem_map - section_nr_to_pfn(pnum). The result is
>> - * aligned to the minimum alignment of the two values:
>> - * 1. All mem_map arrays are page-aligned.
>> - * 2. section_nr_to_pfn() always clears PFN_SECTION_SHIFT
>> - * lowest bits. PFN_SECTION_SHIFT is arch-specific
>> - * (equal SECTION_SIZE_BITS - PAGE_SHIFT), and the
>> - * worst combination is powerpc with 256k pages,
>> - * which results in PFN_SECTION_SHIFT equal 6.
>> - * To sum it up, at least 6 bits are available on all architectures.
>> - * However, we can exceed 6 bits on some other architectures except
>> - * powerpc (e.g. 15 bits are available on x86_64, 13 bits are available
>> - * with the worst case of 64K pages on arm64) if we make sure the
>> - * exceeded bit is not applicable to powerpc.
>> + * We use the lower bits of the mem_map pointer to store a little bit of
>> + * information. The pointer is calculated as mem_map - section_nr_to_pfn().
>> + * The result is aligned to the minimum alignment of the two values:
>> + *
>> + * 1. All mem_map arrays are page-aligned.
>> + * 2. section_nr_to_pfn() always clears PFN_SECTION_SHIFT lowest bits. Because
>> + * it is subtracted from a struct page pointer, the offset is scaled by
>> + * sizeof(struct page). This provides an alignment of PFN_SECTION_SHIFT +
>> + * __ffs(sizeof(struct page)).
>> */
>> enum {
>> SECTION_MARKED_PRESENT_BIT,
>> diff --git a/mm/sparse.c b/mm/sparse.c
>> index dfabe554adf8..c2eb36bfb86d 100644
>> --- a/mm/sparse.c
>> +++ b/mm/sparse.c
>> @@ -269,7 +269,8 @@ static unsigned long sparse_encode_mem_map(struct page *mem_map, unsigned long p
>> {
>> unsigned long coded_mem_map =
>> (unsigned long)(mem_map - (section_nr_to_pfn(pnum)));
>> - BUILD_BUG_ON(SECTION_MAP_LAST_BIT > PFN_SECTION_SHIFT);
>> + BUILD_BUG_ON(SECTION_MAP_LAST_BIT > min(PFN_SECTION_SHIFT + __ffs(sizeof(struct page)),
>> + PAGE_SHIFT));
>
> If that would trigger, wouldn't the memmap of a memory section be
> smaller than a single page?

I don't think a memory section can be smaller than a page, because
PFN_SECTION_SHIFT is defined as follows:

#define PFN_SECTION_SHIFT (SECTION_SIZE_BITS - PAGE_SHIFT)

Therefore, as long as SECTION_SIZE_BITS is greater than PAGE_SHIFT (i.e. a
section spans at least one page), PFN_SECTION_SHIFT is positive. On powerpc
with 256K pages, PAGE_SHIFT is 18 and PFN_SECTION_SHIFT is 6 (the worst
combination).

Sorry, but I don't quite follow your concern. Could you elaborate a bit
more?

>
> Is this really something we should be concerned about? :)
>

If we keep adding section flag bits, SECTION_MAP_LAST_BIT could eventually
exceed the number of guaranteed low bits, and I would like that to be caught
as early as possible, at compile time. That was the motivation behind my
change.

Thanks.

> --
> Cheers,
>
> David