Re: [PATCH v9 2/6] mm: page_alloc: remain memblock_next_valid_pfn() on arm/arm64

From: Jia He
Date: Thu Jul 05 2018 - 21:38:53 EST



Hi Pavel, sorry for the late reply.

On 6/30/2018 1:07 AM, Pavel Tatashin Wrote:
> On Thu, Jun 28, 2018 at 10:30 PM Jia He <hejianet@xxxxxxxxx> wrote:
>>
>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>> where possible") optimized the loop in memmap_init_zone(). But it causes
>> possible panic bug. So Daniel Vacek reverted it later.
>>
>> But as suggested by Daniel Vacek, it is fine to using memblock to skip
>> gaps and finding next valid frame with CONFIG_HAVE_ARCH_PFN_VALID.
>>
>> On arm and arm64, memblock is used by default. But generic version of
>> pfn_valid() is based on mem sections and memblock_next_valid_pfn() does
>> not always return the next valid one but skips more resulting in some
>> valid frames to be skipped (as if they were invalid). And that's why
>> kernel was eventually crashing on some !arm machines.
>
> Hi Jia,
>
> Is this a bug? Should we make other arches that support memblock to
> use memblock_is_map_memory() ? it is more expensive, but if the
> default is broken, maybe it makes sense to change?
>
IIUC, the bug is in memblock_next_valid_pfn() rather than in pfn_valid().
On !arm arches (e.g. x86), memblock_next_valid_pfn() can return an
incorrect next valid pfn. Please refer to b92df1de5.
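
To make the mismatch concrete, here is a rough user-space toy model. It is
not kernel code: every toy_* name, the section size and the region layout
are made up for illustration. It assumes a section-based validity check
paired with a memblock-style skip, and counts the frames that the section
check considers valid but the skip jumps over.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names or values are kernel APIs. */
#define TOY_PFNS_PER_SECTION 8UL
#define TOY_END_PFN 24UL

struct toy_region {
	unsigned long start_pfn;	/* inclusive */
	unsigned long end_pfn;		/* exclusive */
};

/* Hypothetical layout: the second region starts in the middle of a section. */
static const struct toy_region toy_regions[] = {
	{ 0, 8 },
	{ 20, 24 },
};
#define TOY_NR_REGIONS (sizeof(toy_regions) / sizeof(toy_regions[0]))

/* memblock-style check: valid only if a region covers the pfn. */
static bool toy_pfn_in_region(unsigned long pfn)
{
	for (size_t i = 0; i < TOY_NR_REGIONS; i++)
		if (pfn >= toy_regions[i].start_pfn &&
		    pfn < toy_regions[i].end_pfn)
			return true;
	return false;
}

/* Section-style check: valid if any region overlaps the pfn's section. */
static bool toy_pfn_valid_section(unsigned long pfn)
{
	unsigned long sec_start =
		pfn / TOY_PFNS_PER_SECTION * TOY_PFNS_PER_SECTION;
	unsigned long sec_end = sec_start + TOY_PFNS_PER_SECTION;

	for (size_t i = 0; i < TOY_NR_REGIONS; i++)
		if (toy_regions[i].start_pfn < sec_end &&
		    toy_regions[i].end_pfn > sec_start)
			return true;
	return false;
}

/* First pfn after 'pfn' that lies inside a region, or end_pfn if none. */
static unsigned long toy_next_valid_pfn(unsigned long pfn,
					unsigned long end_pfn)
{
	for (pfn++; pfn < end_pfn; pfn++)
		if (toy_pfn_in_region(pfn))
			return pfn;
	return end_pfn;
}

/* Walk the range with the skip and count section-valid pfns never visited. */
static unsigned long toy_count_missed(void)
{
	bool initialized[TOY_END_PFN] = { false };
	unsigned long missed = 0;

	for (unsigned long pfn = 0; pfn < TOY_END_PFN; pfn++) {
		if (!toy_pfn_valid_section(pfn)) {
			/* Jump to the next region; the loop's pfn++ lands on it. */
			pfn = toy_next_valid_pfn(pfn, TOY_END_PFN) - 1;
			continue;
		}
		initialized[pfn] = true;
	}

	for (unsigned long pfn = 0; pfn < TOY_END_PFN; pfn++)
		if (toy_pfn_valid_section(pfn) && !initialized[pfn])
			missed++;
	return missed;
}
```

With this made-up layout, toy_count_missed() returns 4: pfns 16..19 sit in
a section the section-style check accepts, but the skip jumps from pfn 8
straight to pfn 20. If the validity check were region-based as well, the
two views would agree and nothing would be missed.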

Currently only arm/arm64 use MEMBLOCK_NOMAP, and implementing it on all
other arches is really beyond my power ;-)


--
Cheers,
Jia