Re: [PATCH v6 0/2] mm/memblock: Add "reserve_mem" to reserved named memory at boot up
From: Steven Rostedt
Date: Mon Jun 17 2024 - 17:19:10 EST
On Mon, 17 Jun 2024 23:01:12 +0200
Alexander Graf <graf@xxxxxxxxxx> wrote:
> > This could be an added feature, but it is very architecture specific,
> > and would likely need architecture specific updates.
>
>
> It definitely would be an added feature, yes. But one that allows you to
> ensure persistence a lot more safely :).
Sure.
>
> Thinking about it again: What if you run the allocation super early (see
> arch/x86/boot/compressed/kaslr.c:handle_mem_options())? If you stick to
> allocating only from top, you're effectively kernel version independent
> for your allocations because none of the kernel code ran yet and
> definitely KASLR independent because you're running deterministically
> before KASLR even gets allocated.
>
> > As this code relies on memblock_phys_alloc() being consistent, if
> > something gets allocated before it differently depending on where the
> > kernel is, it can also move the location. A plugin to UEFI would mean
> > that it would need to reserve the memory, and the code here will need
> > to know where it is. We could always make the function reserve_mem()
> > global and weak so that architectures can override it.
>
>
> Yes, the in-kernel UEFI loader (efi-stub) could simply populate a new
> type of memblock with the respective reservations and you later call
> memblock_find_in_range_node() instead of memblock_phys_alloc() to pass
> in flags that you want to allocate only from the new
> MEMBLOCK_RESERVE_MEM type. The same model would work for BIOS boots
> through the handle_mem_options() path above. In fact, if the BIOS way
> works fine, we don't even need UEFI variables: The same way allocations
> will be identical during BIOS execution, they should stay identical
> across UEFI launches.
>
> As cherry on top, kexec also works seamlessly with the special memblock
> approach because kexec (at least on x86) hands memblocks as is to the
> next kernel. So the new kernel will also automatically use the same
> ranges for its allocations.
I'm all for expanding this. But I would just want to get this in for
now as is. It theoretically works on all architectures. If someone
wants to make it more robust and accurate on a specific architecture,
I'm all for it. Like I said, we could make the reserve_mem() function
global and weak, and then if an architecture has a better way to handle
this, it could use that.
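
To make the "global and weak" idea concrete, here is a minimal userspace
sketch, assuming GCC or Clang. The name reserve_mem_alloc(), its signature
and TOY_MEM_TOP are hypothetical and are not taken from the patch; in the
kernel the attribute would be spelled __weak. The weak definition is the
generic fallback that allocates top-down from a pretend end of memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: pretend RAM ends at 4 GiB. Not a real kernel constant. */
#define TOY_MEM_TOP 0x100000000ULL

/*
 * Generic fallback (hypothetical name and signature). Because it is weak,
 * a non-weak definition of the same symbol in another object file, such
 * as an architecture-specific one, replaces it at link time.
 *
 * Returns 0 and stores the start address on success, -1 on bad input.
 */
__attribute__((weak))
int reserve_mem_alloc(uint64_t size, uint64_t align, uint64_t *start)
{
	assert(start != NULL);

	/* Reject empty or oversized requests and non-power-of-two alignment. */
	if (size == 0 || size > TOY_MEM_TOP)
		return -1;
	if (align == 0 || (align & (align - 1)))
		return -1;

	/* Top-down: take the highest aligned address that fits. */
	*start = (TOY_MEM_TOP - size) & ~(align - 1);
	return 0;
}
```

An architecture that has a better source of truth (e820, a UEFI hand-off)
would provide its own non-weak reserve_mem_alloc() in a separate file, and
callers would not need to change.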
Hmm, x86 could do this with the e820 code like I did in my first
versions. Like I said, it didn't fail at all with that.
And we can have a UEFI version as well.
-- Steve