Re: [PATCH 1/2] mm/memblock: Add "reserve_mem" to reserved named memory at boot up

From: Steven Rostedt
Date: Tue Jun 04 2024 - 07:08:38 EST


On Tue, 4 Jun 2024 08:03:54 +0200
Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:

> On Tue, 4 Jun 2024 at 01:35, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> > From: "Steven Rostedt (Google)" <rostedt@xxxxxxxxxxx>
> >
> > In order to allow for requesting a memory region that can be used for
> > things like pstore on multiple machines where the memory layout is not the
> > same, add a new option to the kernel command line called "reserve_mem".
> >
> > The format is: reserve_mem=nn:align:name
> >
> > Where it will find nn amount of memory at the given alignment of align.
> > The name field is to allow another subsystem to retrieve where the memory
> > was found. For example:
> >
> > reserve_mem=12M:4096:oops ramoops.mem_name=oops
> >
> > Where ramoops.mem_name will tell ramoops that memory was reserved for it
> > via the reserve_mem option and it can find it by calling:
> >
> > if (reserve_mem_find_by_name("oops", &start, &size)) {
> > // start holds the start address and size holds the size given
> >
> > Link: https://lore.kernel.org/all/ZjJVnZUX3NZiGW6q@xxxxxxxxxx/
> >
> > Suggested-by: Mike Rapoport <rppt@xxxxxxxxxx>
> > Signed-off-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
>
> You failed to point out in the commit message that the assumption here
> is that this memory will retain its contents across a soft reboot. Or
> am I misunderstanding this?

Yes, that is the intention. I should update the commit message.

>
> In any case, as I pointed out before, playing these games unilaterally
> from the kernel side, i.e., without any awareness whatsoever from the
> firmware and bootloader (which will not attempt to preserve RAM
> contents), is likely to have a rather disappointing success ratio in
> the general case. I understand this may be different for vertically
> integrated software stacks like ChromeOS so perhaps it should live
> there as a feature.

I have been using this on two different test machines, as well as a
Chromebook, and it appears to work on all of them, as well as on VMs. I
plan on adding this to my workstation and server too (they use EFI).

>
> Then, as Kees points out, there is also the risk that the kernel
> itself may be stepping on this memory before having realized that it
> is reserved. At least ARM and x86 have decompressors with a
> substantial amount of non-trivial placement logic that would need to
> be made aware of this reservation. Note that EFI vs. non-EFI boot also
> makes a difference here.

Agreed. Note, it should definitely state that this is not 100% reliable,
and depending on the setup it may not be reliable at all. Whatever uses it
should add something to confirm that the memory is the same.
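
A minimal sketch of what such a confirmation could look like, written as
plain user-space C rather than kernel code. Everything in it is an
assumption made for illustration and is not part of the patch: the header
layout, the PERSIST_MAGIC value, the FNV-1a checksum, and the helper names
persist_store() and persist_valid() are all hypothetical. The idea is only
that the writer tags the region with a magic value, a length and a checksum,
and the reader refuses the contents after reboot unless all three agree.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tag; any value unlikely to appear by accident would do. */
#define PERSIST_MAGIC 0x52534d45u

/* Hypothetical header placed at the start of the reserved region. */
struct persist_hdr {
	uint32_t magic;    /* PERSIST_MAGIC when the region was written by us */
	uint32_t len;      /* number of payload bytes following the header */
	uint32_t checksum; /* FNV-1a hash of the payload bytes */
};

/* 32-bit FNV-1a, used here only as a simple integrity check. */
static uint32_t persist_sum(const uint8_t *p, size_t n)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < n; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/* Copy data into the region and stamp the header. Returns 0 or -1. */
static int persist_store(void *region, size_t region_size,
			 const void *data, size_t len)
{
	struct persist_hdr hdr;
	uint8_t *base = region;

	if (region_size < sizeof(hdr) || len > region_size - sizeof(hdr))
		return -1;
	if (len > UINT32_MAX)
		return -1;

	memcpy(base + sizeof(hdr), data, len);
	hdr.magic = PERSIST_MAGIC;
	hdr.len = (uint32_t)len;
	hdr.checksum = persist_sum(base + sizeof(hdr), len);
	memcpy(base, &hdr, sizeof(hdr));
	return 0;
}

/* Returns 1 if the region still holds what persist_store() wrote. */
static int persist_valid(const void *region, size_t region_size,
			 size_t *len_out)
{
	struct persist_hdr hdr;
	const uint8_t *base = region;

	if (region_size < sizeof(hdr))
		return 0;
	memcpy(&hdr, base, sizeof(hdr));

	if (hdr.magic != PERSIST_MAGIC)
		return 0;
	if (hdr.len > region_size - sizeof(hdr))
		return 0;
	if (persist_sum(base + sizeof(hdr), hdr.len) != hdr.checksum)
		return 0;

	if (len_out)
		*len_out = hdr.len;
	return 1;
}
```

A consumer would call persist_valid() on the region first thing after boot
and treat a failure as "nothing was preserved" rather than as an error,
which fits the best-effort nature of the reservation.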

If corner cases become an issue, this could be extended to work with them.
We could update KASLR to be aware of this allocation. The documentation
update to kernel-parameters.txt on this usage should definitely stress that
this can be unreliable, and use should be tested to see if it works. And
also stress that if it does work, it may not work all the time. The best
usage for this is statistical debugging. For instance, in our use case,
we have 1000s of crashes and no idea what causes them. If this worked only
10% of the time, the data retrieved from 100 of those crashes would be very
valuable.

-- Steve