Re: [PATCH v2 2/3] x86/mm: simplify calculation of max_pfn_mapped
From: Brendan Jackman
Date: Wed Jun 24 2026 - 08:37:00 EST
On Wed Jun 3, 2026 at 10:20 AM UTC, Brendan Jackman wrote:
> On Tue Jun 2, 2026 at 9:39 PM UTC, Dave Hansen wrote:
>> On 5/3/26 06:04, Brendan Jackman wrote:
>> ...
>>> Luckily, init_memory_mapping() avoids all these conditions. In that
>>> case, the return value is just paddr_end. And that value is already
>>> present, no need to depend on the confusing return value.
>>
>> It feels like we should say something about split_mem_range() here. All
>> of the guaranteed non-fiddly behavior originates in there, right?
>
> [pasting back the conditions from the commit message for context]
>>> but only in these conditions:
>>>
>>> 1. There is a mismatch between the alignment of the requested range and
>>> the page sizes allowed by page_size_mask
>>>
>>> 2. The range ends in a region that is not mapped according to
>>> e820.
>>>
>>> 3. The range ends in a region that was already mapped (note this case is
>>> particularly fiddly because the return value depends on what level
>>> the existing mapping is at. This is probably a bug, see [0] for
>>> discussion).
>
> split_mem_range() is responsible for excluding point 1, since it returns
> the correct page_size_mask. The other two are actually down to the
> callers, right?
>
> So how about for point 1 I mention that in the commit message, then for
> points 2 and 3 maybe they should actually be code comments, i.e.
> documented as preconditions for calling init_memory_mapping()?

For posterity: I realised that treating point 2 as a separate case is
bogus here. I was looking at the e820__mapped_any() blocks in
phys_*_init() and noting that they don't update the local paddr_last.
But actually, those blocks only run for paddr >= paddr_end, which can
already only happen in case 1, i.e. when the range is misaligned wrt
page_size_mask.

So I'm just gonna drop that bit.

This realisation is really reinforcing that removing this return value
is the right thing to do.

>>> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
>>> index ae3e9e0820153..1a6a6fc700bb5 100644
>>> --- a/arch/x86/mm/init.c
>>> +++ b/arch/x86/mm/init.c
>>> @@ -544,10 +544,11 @@ void __ref init_memory_mapping(unsigned long start,
>>> memset(mr, 0, sizeof(mr));
>>> nr_range = split_mem_range(mr, 0, start, end);
>>>
>>> - for (i = 0; i < nr_range; i++)
>>> - paddr_last = kernel_physical_mapping_init(mr[i].start, mr[i].end,
>>> - mr[i].page_size_mask,
>>> - prot);
>>> + for (i = 0; i < nr_range; i++) {
>>> + kernel_physical_mapping_init(mr[i].start, mr[i].end,
>>> + mr[i].page_size_mask, prot);
>>> + paddr_last = mr[i].end;
>>> + }
>>
>> I guess this is actually:
>>
>> for (i = 0; i < nr_range; i++)
>> kernel_physical_mapping_init(...);
>>
>> paddr_last = mr[nr_range-1].end;
>>
>> Right? But what you have is probably just as compact.
>
> Oh, weird. My code might be just as compact but it's confusing, it
> should be written your way for sure.
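
For reference, a minimal userspace sketch of the loop written the way Dave
suggests. This is not the kernel code: struct map_range is reduced to the
three fields used above, kernel_physical_mapping_init() is a stub that only
counts calls (the prot argument is dropped), and map_ranges() and the
nr_range > 0 guard are hypothetical additions for illustration.

```c
#include <stddef.h>

/* Reduced stand-in for the kernel's struct map_range (assumption). */
struct map_range {
	unsigned long start;
	unsigned long end;
	unsigned int page_size_mask;
};

/* Counts stub invocations so the sketch can be checked. */
static int nr_mapping_calls;

/*
 * Stub: the real function builds page tables. Here the return value is
 * gone entirely, which is the point of the patch under discussion.
 */
static void kernel_physical_mapping_init(unsigned long start,
					 unsigned long end,
					 unsigned int page_size_mask)
{
	(void)start;
	(void)end;
	(void)page_size_mask;
	nr_mapping_calls++;
}

/*
 * Hypothetical helper: map every range, then derive the last mapped
 * physical address from the final range instead of from a return value.
 */
static unsigned long map_ranges(const struct map_range *mr, int nr_range)
{
	int i;

	for (i = 0; i < nr_range; i++)
		kernel_physical_mapping_init(mr[i].start, mr[i].end,
					     mr[i].page_size_mask);

	/* Guard added for the sketch only: avoid indexing mr[-1]. */
	return nr_range > 0 ? mr[nr_range - 1].end : 0;
}
```

Taking the end of the last range once, after the loop, makes it explicit
that only the final range determines paddr_last.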