Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages

From: David Hildenbrand
Date: Wed Apr 14 2021 - 11:54:31 EST


On 14.04.21 17:27, Ard Biesheuvel wrote:
On Wed, 14 Apr 2021 at 17:14, David Hildenbrand <david@xxxxxxxxxx> wrote:

On 07.04.21 19:26, Mike Rapoport wrote:
From: Mike Rapoport <rppt@xxxxxxxxxxxxx>

The struct pages representing a reserved memory region are initialized
using the reserve_bootmem_region() function. This function is called for
each reserved region just before the memory is freed from memblock to the
buddy page allocator.

The struct pages for MEMBLOCK_NOMAP regions are kept with the default
values set by the memory map initialization, which makes it necessary to
have special treatment for such pages in pfn_valid() and
pfn_valid_within().

I assume these pages are never given to the buddy, because we don't have
a direct mapping. So to the kernel, it's essentially just like a memory
hole with benefits.

I can spot that we want to export such memory like any special memory
thingy/hole in /proc/iomem -- "reserved", which makes sense.

I would assume that MEMBLOCK_NOMAP is a special type of *reserved*
memory. IOW, that for_each_reserved_mem_range() should already succeed
on these as well -- we should mark anything that is MEMBLOCK_NOMAP
implicitly as reserved. Or are there valid reasons not to do so? What
can anyone do with that memory?

I assume they are pretty much useless for the kernel, right? Like other
reserved memory ranges.
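
To make the "implicitly reserved" idea concrete, here is a minimal userspace sketch, not kernel code. Every toy_* name and the PFN layout are made up for illustration. It only shows the shape of the walk: explicitly reserved ranges first, then any memory range flagged NOMAP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PAGES 32u

/* Toy stand-in for a memblock region, expressed in PFNs for brevity. */
struct toy_range {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
	bool nomap;
};

/* Hypothetical layout: ordinary memory, one NOMAP range, one reservation. */
static const struct toy_range toy_mem_ranges[] = {
	{ 0, 24, false },
	{ 24, 32, true },
};
static const struct toy_range toy_rsv_ranges[] = {
	{ 4, 6, false },
};

/* One "reserved" bit per page, standing in for the struct page state. */
static bool toy_page_reserved[TOY_NR_PAGES];

static void toy_reserve_range(unsigned long start_pfn, unsigned long end_pfn)
{
	assert(start_pfn <= end_pfn && end_pfn <= TOY_NR_PAGES);
	for (unsigned long pfn = start_pfn; pfn < end_pfn; pfn++)
		toy_page_reserved[pfn] = true;
}

/* Mark reserved ranges, then treat NOMAP memory as reserved as well. */
static void toy_init_reserved_pages(void)
{
	size_t i;

	for (i = 0; i < sizeof(toy_rsv_ranges) / sizeof(toy_rsv_ranges[0]); i++)
		toy_reserve_range(toy_rsv_ranges[i].start_pfn,
				  toy_rsv_ranges[i].end_pfn);

	for (i = 0; i < sizeof(toy_mem_ranges) / sizeof(toy_mem_ranges[0]); i++)
		if (toy_mem_ranges[i].nomap)
			toy_reserve_range(toy_mem_ranges[i].start_pfn,
					  toy_mem_ranges[i].end_pfn);
}
```

After calling toy_init_reserved_pages(), both the explicit reservation and the NOMAP range end up marked, without the NOMAP range ever appearing in the reserved list.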


On ARM, we need to know whether any physical regions that do not
contain system memory contain something with device semantics or not.
One of the examples is ACPI tables: these are in reserved memory, and
so they are not covered by the linear region. However, when the ACPI
core ioremap()s an arbitrary memory region, we don't know whether it
is mapping a memory region or a device region unless we keep track of
this in some way. (Device mappings require device attributes, but
firmware tables require memory attributes, as they might be accessed
using misaligned reads)
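
The decision described above can be sketched as a small userspace model. This is not the arm64 implementation. The toy_* names, the addresses and the two-attribute enum are assumptions made for illustration. The point is only that a NOMAP range is still known to be memory, so it gets memory attributes even though it is absent from the linear map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; none of these are the kernel's real definitions. */
#define TOY_FLAG_NOMAP 0x1u

enum toy_attr { TOY_ATTR_DEVICE, TOY_ATTR_NORMAL };

struct toy_region {
	uint64_t base;
	uint64_t size;
	unsigned int flags;
};

/* Hypothetical firmware-described RAM: one ordinary bank, one NOMAP range. */
static const struct toy_region toy_ram[] = {
	{ 0x80000000ULL, 0x10000000ULL, 0 },
	{ 0x90000000ULL, 0x00100000ULL, TOY_FLAG_NOMAP },
};

/* Return the RAM region containing phys, or NULL if it is not RAM at all. */
static const struct toy_region *toy_find_ram(uint64_t phys)
{
	for (size_t i = 0; i < sizeof(toy_ram) / sizeof(toy_ram[0]); i++) {
		const struct toy_region *r = &toy_ram[i];

		assert(r->size > 0);
		if (phys >= r->base && phys - r->base < r->size)
			return r;
	}
	return NULL;
}

/* Only RAM without the NOMAP flag is covered by the linear region. */
static bool toy_in_linear_map(uint64_t phys)
{
	const struct toy_region *r = toy_find_ram(phys);

	return r && !(r->flags & TOY_FLAG_NOMAP);
}

/* Any known RAM, NOMAP or not, wants memory attributes; the rest is device. */
static enum toy_attr toy_ioremap_attr(uint64_t phys)
{
	return toy_find_ram(phys) ? TOY_ATTR_NORMAL : TOY_ATTR_DEVICE;
}
```

In this model an address inside the NOMAP range is outside the linear map but still maps with TOY_ATTR_NORMAL, while an address that no region covers falls back to TOY_ATTR_DEVICE.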

Using the generic-sounding NOMAP ("don't create direct mapping") to identify device regions feels like a hack. I know it was introduced just for that purpose.

Looking at memblock_mark_nomap(), we consider the following "device regions":

1) ACPI tables

2) VIDEO_TYPE_EFI memory

3) some device-tree regions in of/fdt.c


IIUC, right now we end up creating a memmap for this NOMAP memory, but hide it away in pfn_valid(). This patch set at least fixes that.

Assuming these pages are never mapped to user space via the struct page (which had better be the case), we could further use a new page type to mark these pages in a special way, such that we can identify them directly via pfn_to_page().

Then, we could mostly avoid having to query memblock at runtime to figure out that this is special memory. This would obviously be an extension to this series. Just a thought.
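
A rough userspace sketch of what I have in mind, purely illustrative: TOY_PG_NOMAP, struct toy_page and the toy_* helpers are hypothetical, and the kernel's real page type encoding differs. The mark is set once while the ranges are known at boot, and later checks only look at the page itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_PAGES 16u
#define TOY_PG_NOMAP 0x1u	/* hypothetical page type bit */

/* Toy stand-in for struct page; only the type field matters here. */
struct toy_page {
	uint32_t page_type;
};

static struct toy_page toy_memmap[TOY_NR_PAGES];

static struct toy_page *toy_pfn_to_page(unsigned long pfn)
{
	assert(pfn < TOY_NR_PAGES);
	return &toy_memmap[pfn];
}

/* Done once at boot, while the NOMAP ranges are still being walked. */
static void toy_mark_nomap_range(unsigned long start_pfn, unsigned long end_pfn)
{
	for (unsigned long pfn = start_pfn; pfn < end_pfn; pfn++)
		toy_pfn_to_page(pfn)->page_type |= TOY_PG_NOMAP;
}

/* Runtime check: no region lookup, just a test on the page itself. */
static bool toy_page_is_nomap(const struct toy_page *page)
{
	return (page->page_type & TOY_PG_NOMAP) != 0;
}
```

With something like toy_mark_nomap_range(4, 8) done at init time, toy_page_is_nomap(toy_pfn_to_page(pfn)) answers the question for any pfn with a memmap entry, and the region lists are not consulted again.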

--
Thanks,

David / dhildenb