Re: [PATCH v4 1/2] x86/setup: always add the beginning of RAM as memblock.memory

From: Mike Rapoport
Date: Sun Jan 31 2021 - 03:13:08 EST


On Sat, Jan 30, 2021 at 04:37:54PM -0800, Linus Torvalds wrote:
> On Sat, Jan 30, 2021 at 2:10 PM Mike Rapoport <rppt@xxxxxxxxxx> wrote:
> >
> > In either case, e820__memblock_setup() won't add the range 0x0000 - 0x1000
> > to memblock.memory and later during memory map initialization this range is
> > left outside any zone.
>
> Honestly, this just sounds like memblock being stupid in the first place.
>
> Why aren't these zones padded to sane alignments?

Any implicit alignment of zones would be a guess. What alignment would be
sane here? 1M? MAX_ORDER? pageblock_order?

If an architecture reports its memory at X and we use, say,
round_down(X, 1M) for node[0]->node_start_pfn and
zone[0]->zone_start_pfn, I'm not sure it wouldn't cause a boot failure on
some system out there in the wild.
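
To make the concern concrete, here is a minimal userspace sketch (not
kernel code) of what rounding the first reported address down would do.
The helper names and the 4K page size are assumptions made for the
illustration only; the point is that the rounded-down span would include
page frames the architecture never reported:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages, as on x86 */

/* Round addr down to a power-of-two alignment (hypothetical helper). */
static inline uint64_t round_down_pow2(uint64_t addr, uint64_t align)
{
	/* The mask trick below is only valid for power-of-two alignments. */
	assert(align && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}

/*
 * Number of page frames that an implicit alignment would place below the
 * first address reported by the architecture. Nothing is known about
 * these frames: they may be a hole, firmware memory or something else.
 */
static inline uint64_t extra_pfns_below(uint64_t first_addr, uint64_t align)
{
	return (first_addr - round_down_pow2(first_addr, align)) >> PAGE_SHIFT;
}
```

With memory reported at 0x1000 and a 1M alignment this adds exactly one
frame (pfn 0), which happens to be the case at hand. With memory
reported at some other offset the same rule could pull in hundreds of
frames whose contents generic code would have to guess about.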

> This patch smells like working around the memblock code being fragile
> rather than a real fix.
>
> That's *particularly* true when the very line above it did a
> "memblock_reserve()" of the exact same range that the memblock_add()
> "adds".

The most correct thing to do would have been to

memblock_add(0, end_of_first_memory_bank);

somewhere in e820__memblock_setup().

But that would mean we must also change the way e820__memblock_setup()
reserves memory, and that seemed to me like really asking for trouble, so
I've limited the registration of memory to the range that is certain to be
reserved.

Part of the problem is that x86 adds only usable memory to
memblock.memory, omitting holes and reserved areas, while free_area_init()
presumes that memblock.memory covers populated physical memory.
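
The mismatch can be shown with a toy model. This is a sketch only: the
struct, the helper and the layout below are made up for illustration and
are not the memblock API. The first page is listed as reserved but is
absent from the "memory" list, so a consumer that walks only the memory
list never sees it:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical half-open physical range [start, end). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Return 1 if addr falls inside any of the n ranges, 0 otherwise. */
static int covered(const struct range *r, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= r[i].start && addr < r[i].end)
			return 1;
	return 0;
}

/* Toy layout (assumed): only usable RAM is registered as memory ... */
static const struct range toy_memory[] = {
	{ 0x1000,   0x9f000   },
	{ 0x100000, 0x8000000 },
};

/* ... while the first page is known only as a reserved range. */
static const struct range toy_reserved[] = {
	{ 0x0, 0x1000 },
};
```

Here covered(toy_reserved, 1, 0x0) is true while covered(toy_memory, 2,
0x0) is false, which is the situation in which the page at address 0
ends up outside any zone.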

I tried implicitly adding ranges from memblock.reserved to
memblock.memory when they were not already there, and it broke some arm
machines:

https://lore.kernel.org/lkml/127999c4-7d56-0c36-7f88-8e1a5c934cae@xxxxxxxxxxxxx

I do feel that free_area_init() is fragile, and no doubt there is room for
improvement there. But I think the safer way forward is to reduce
inconsistencies between arch and generic code, so that we won't need to
guess what the memory layout is at free_area_init() time.

> Linus

--
Sincerely yours,
Mike.