Re: [PATCH] arm64: mm: Fix memmap to be initialized for the entire section

From: Ard Biesheuvel
Date: Wed Nov 23 2016 - 16:25:13 EST


On 23 November 2016 at 21:15, Robert Richter <robert.richter@xxxxxxxxxx> wrote:
> On 20.11.16 17:07:44, Ard Biesheuvel wrote:
>> On 17 November 2016 at 15:18, Robert Richter <robert.richter@xxxxxxxxxx> wrote:
>
>> > The risk of breaking something with my patch is small and limited only
>> > to the mapping of efi reserved regions (which is the state of 4.4). If
>> > something breaks anyway it can easily be fixed by adding more checks
>> > to pfn_valid() as suggested above.
>> >
>>
>> As I noted before, it looks to me like setting CONFIG_HOLES_IN_ZONE is
>> the correct way to address this. However, doing that does uncover a
>> bug in move_freepages() where the VM_BUG_ON_PAGE() dereferences struct
>> page fields before the pfn_valid_within() check, so it seems those
>> need to be switched around.
>>
>> Robert, you mentioned that CONFIG_HOLES_IN_ZONE seems inappropriate
>> for sparsemem. Care to elaborate why?
>
> HOLES_IN_ZONE is of rare use in the kernel. I think it was introduced
> to save memory for the memmap and only some single systems enable it.
> There is no architecture that enables it entirely. For good reasons...
>
> It introduces additional checks. pfn_valid() is usually checked only
> once for the whole memmap. There are a number of checks enabled, just
> grep for pfn_valid_within. This will increase the number of
> pfn_valid() calls by a factor of MAX_ORDER_NR_PAGES, in my config this
> is 8k. So, this is not the direction to go.
>

That does sound like a potential issue. But does it cause any slowdown
in practice?

The reality is that CONFIG_HOLES_IN_ZONE perfectly describes the
situation, and so it is still my preferred option if the performance
hit is tolerable.

> My patch fixes a regression in the kernel that was introduced by the
> nomap implementation. Some systems can not boot anymore, beside of
> that the BUG_ON() may occur any time depending only on physical page
> access, we need to fix 4.9. Here is a reproducer:
>
> https://patchwork.kernel.org/patch/9407677/
>
> My patch also does not break memremap(). With my patch applied
> try_ram_remap() would return a linear addr for nomap regions. But this
> is only called if WB is explicitly requested, so it should not happen.

Why? MEMREMAP_WB is used often, among other things for mapping
firmware tables, which are marked as NOMAP, so in these cases, the
linear address is not mapped.

> If you think pfn_valid() is wrong here, I am happy to send a patch
> that fixes this by using page_is_ram(). In any case, the worst case
> that may happen is to behave the same as v4.4, we might fix then the
> wrong use of pfn_valid() where it is not correctly used to check for
> ram.
>

page_is_ram() uses string comparisons to look for regions called
'System RAM'. Is that something we can tolerate for each pfn_valid()
calll?

Perhaps the solution is to reimplement page_is_ram() for arm64 using
memblock_is_memory() instead, But that still means we need to modify
the generic memremap() code first to switch to it before changing the
arm64 implementation of pfn_valid