Re: [PATCH v4 1/2] x86/setup: always add the beginning of RAM as memblock.memory

From: David Hildenbrand
Date: Mon Feb 01 2021 - 04:34:28 EST


On 30.01.21 23:10, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@xxxxxxxxxxxxx>
>
> The physical memory on an x86 system starts at address 0, but this is
> not always reflected in the e820 map. For example, the BIOS can have
> e820 entries like
>
> [    0.000000] BIOS-provided physical RAM map:
> [    0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000009ffff] usable
>
> or
>
> [    0.000000] BIOS-provided physical RAM map:
> [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000000001000-0x0000000000057fff] usable
>
> In either case, e820__memblock_setup() won't add the range 0x0000 - 0x1000
> to memblock.memory, and later, during memory map initialization, this
> range is left outside any zone.
>
> With SPARSEMEM=y there is always a struct page for pfn 0, and this struct
> page will have its zone link wrong no matter what value is set there.
>
> To avoid this inconsistency, add the beginning of RAM to memblock.memory.
> Limit the added chunk size to match the reserved memory, to avoid
> registering memory that may be used by the firmware but was never
> reserved at e820__memblock_setup() time.
>
> Fixes: bde9cfa3afe4 ("x86/setup: don't remove E820_TYPE_RAM for pfn 0")
> Signed-off-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> ---
>  arch/x86/kernel/setup.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 3412c4595efd..67c77ed6eef8 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -727,6 +727,14 @@ static void __init trim_low_memory_range(void)
>  	 * Kconfig help text for X86_RESERVE_LOW.
>  	 */
>  	memblock_reserve(0, ALIGN(reserve_low, PAGE_SIZE));
> +
> +	/*
> +	 * Even if the firmware does not report the memory at address 0 as
> +	 * usable, inform the generic memory management about its existence
> +	 * to ensure it is a part of ZONE_DMA and the memory map for it is
> +	 * properly initialized.
> +	 */
> +	memblock_add(0, ALIGN(reserve_low, PAGE_SIZE));
>  }
>
>  /*


I think that, to make this code more robust and to not rely on each arch doing the right thing, we should do something like:

1) Make sure in free_area_init() that each PFN with a memmap (i.e., each PFN that falls into a partially present section) is spanned by a zone; that would include PFN 0 in this case.

2) In init_zone_unavailable_mem(), similar to the round_up(max_pfn, PAGES_PER_SECTION) handling, also consider the range
[round_down(min_pfn, PAGES_PER_SECTION), min_pfn - 1],
which in the x86-64 case above would cover [0..0] and would, therefore, initialize PFN 0; see the sketch below.
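
A minimal sketch of 2), assuming the init_unavailable_range(spfn, epfn) helper in mm/page_alloc.c and using memblock_start_of_DRAM() to find min_pfn (the function name here is made up):

static void __init init_unavailable_head(void)
{
	unsigned long min_pfn = PHYS_PFN(memblock_start_of_DRAM());
	unsigned long spfn = round_down(min_pfn, PAGES_PER_SECTION);

	/*
	 * PFNs in [spfn, min_pfn) share a section with present memory,
	 * so they have a memmap but no memblock.memory entry; initialize
	 * their struct pages so the zone link is well defined.
	 */
	if (spfn < min_pfn)
		init_unavailable_range(spfn, min_pfn);
}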

Also, I think the special case of PFN 0 is analogous to the round_up(max_pfn, PAGES_PER_SECTION) handling in init_zone_unavailable_mem(): who guarantees that the PFNs above the highest present PFN are actually spanned by a zone?
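
That guarantee could be spelled out as a debug check along these lines (only a sketch; zone_spans_pfn() and page_zone() are the existing helpers from include/linux/mmzone.h):

static void __init check_memmap_tail(unsigned long max_pfn)
{
	unsigned long pfn, epfn = round_up(max_pfn, PAGES_PER_SECTION);

	/*
	 * With SPARSEMEM, the memmap covers the whole last section even
	 * beyond the highest present PFN; each of these struct pages has
	 * a zone link, but nothing forces that zone to span the PFN.
	 */
	for (pfn = max_pfn; pfn < epfn; pfn++)
		if (!zone_spans_pfn(page_zone(pfn_to_page(pfn)), pfn))
			pr_warn("pfn %lu not spanned by its zone\n", pfn);
}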

I'd suggest going through all zone ranges in free_area_init() first, dealing with zones whose start/end is not section aligned, and clamping them down/up as required such that no holes within a section are left uncovered by a zone.
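
Roughly like the following (a sketch only; the helper name is made up, and free_area_init() would apply it to the start of the lowest and the end of the highest non-empty zone):

/*
 * Widen [*start_pfn, *end_pfn) to the enclosing section boundaries so
 * that no memmap within a partially present section is left outside
 * every zone.
 */
static void __init clamp_zone_span_to_sections(unsigned long *start_pfn,
					       unsigned long *end_pfn)
{
	*start_pfn = round_down(*start_pfn, PAGES_PER_SECTION);
	*end_pfn = round_up(*end_pfn, PAGES_PER_SECTION);
}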

--
Thanks,

David / dhildenb