Re: [PATCH 5/5] kexec: X86: Pass memory ranges via e820 table instead of memmap= boot parameter

From: Vivek Goyal
Date: Fri Apr 12 2013 - 10:32:01 EST

On Thu, Apr 11, 2013 at 08:06:50AM -0700, H. Peter Anvin wrote:
> On 04/11/2013 07:55 AM, Yinghai Lu wrote:
> > On Thu, Apr 11, 2013 at 5:26 AM, Thomas Renninger <trenn@xxxxxxx> wrote:
> >> Currently ranges are passed via kernel boot parameters:
> >> memmap=exactmap memmap=X#Y memmap=
> >>
> >> Pass them via e820 table directly instead.
> >
> > how to address "saved_max_pfn" referring in kernel?
> >
> > kernel need to use saved_max_pfn from old e820 in
> > drivers/char/mem.c::read_oldmem()
> >
> > mips and powerpc they are passing that from command line "savemaxmem="
> >
> > x86 should use that too?
> >
> Oh bloody hell, yet another f-ing "max_pfn" variable.
> The *only* one that makes any kind of sense is max_low_pfn (marking the
> cutoff to highmem)... pretty much the rest of them are just plain wrong.
> And I don't mean "mildly annoying", I mean "catastrophically wrong
> semantics". In this case, it introduces a completely arbitrary
> distinction between a nonmemory range below a high water mark and a
> nonmemory range above that high water mark. In fact, from reading the
> code it seems pretty clear that the device will blindly assume that
> anything below saved_max_pfn is memory and will try to map it
> cachable... which will #MC on quite a few machines.
> This kind of crap HAS TO STOP. Memory is discontiguous, deal with it
> and deal with it properly.

Agreed. saved_max_pfn is a bad idea. Passing all the mappable memory of
the old kernel as "RESERVED" (or KDUMP_RESERVED or KDUMP_MEM or whatever)
to the next kernel in the e820 map sounds better. The next kernel can
then allow access to the RESERVED range using the /dev/oldmem interface.
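
As a rough illustration of the idea (a user-space sketch, not kernel or
kexec-tools code), building such a map could look like the following.
The type names, struct kd_entry and build_kdump_map() are all
hypothetical, and the sketch assumes the old kernel's RAM list has
already had the crash kernel region carved out of it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types; KD_KDUMP_RESERVED stands in for the "RESERVED"
 * (or KDUMP_RESERVED) type proposed in this thread. */
enum kd_type { KD_RAM, KD_KDUMP_RESERVED };

struct kd_entry {
	uint64_t addr;
	uint64_t size;
	enum kd_type type;
};

/*
 * Build the map handed to the kdump kernel: the crash kernel region is
 * the only usable RAM, and every old-kernel RAM range is passed along
 * retyped as KD_KDUMP_RESERVED.
 *
 * Assumption: old_ram[] does not overlap the crash kernel region.
 * Returns the number of entries written, or 0 if out[] is too small.
 */
static size_t build_kdump_map(const struct kd_entry *old_ram, size_t n_old,
			      uint64_t crash_addr, uint64_t crash_size,
			      struct kd_entry *out, size_t out_cap)
{
	size_t i;

	if (out_cap < n_old + 1)
		return 0;

	out[0].addr = crash_addr;
	out[0].size = crash_size;
	out[0].type = KD_RAM;

	for (i = 0; i < n_old; i++) {
		out[i + 1] = old_ram[i];
		out[i + 1].type = KD_KDUMP_RESERVED;
	}
	return n_old + 1;
}
```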

For backward compatibility with old kexec-tools we can probably retain
saved_max_pfn for some time. We can set saved_max_pfn to the end of the
memory range including "RESERVED" regions, and this will be overwritten
if old kexec-tools have passed this parameter on the command line. Also,
whenever the user passes saved_max_pfn on the command line, we can do a
WARN_ONCE() asking them to upgrade kexec-tools and letting them know
that saved_max_pfn will be
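
The fallback described above might be sketched like this. It is a
user-space illustration only: struct phys_range and
pick_saved_max_pfn() are made-up names, a 4K page size is assumed, and
a command-line value of 0 is taken to mean "not given".

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4K pages */

/* Hypothetical memory map entry, RAM or "RESERVED"; end is exclusive. */
struct phys_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Pick saved_max_pfn: a value passed on the command line by old
 * kexec-tools wins; otherwise use the end of the highest range in the
 * map, "RESERVED" regions included.
 */
static uint64_t pick_saved_max_pfn(const struct phys_range *map, size_t n,
				   uint64_t cmdline_pfn)
{
	uint64_t max_end = 0;
	size_t i;

	if (cmdline_pfn)
		return cmdline_pfn;	/* this is where WARN_ONCE() would go */

	for (i = 0; i < n; i++)
		if (map[i].end > max_end)
			max_end = map[i].end;

	return max_end >> PAGE_SHIFT;
}
```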

For the issue of doing ioremap() on everything as cacheable, we should
be able to modify copy_oldmem_page() so that it goes through the memory
map and checks whether the given pfn is mappable and what flags should
be used to map it.
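
Roughly, the lookup could be sketched as below. This is a user-space
illustration, not the actual copy_oldmem_page() code: the range types,
the flag names and oldmem_map_flags() are hypothetical, a 4K page size
is assumed, and the policy (map old kernel memory cached, refuse
everything else) is only one possible choice.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4K pages */

/* Hypothetical range types; KDUMP_RESERVED is only a name floated in
 * this thread, not an existing e820 type. */
enum range_type { RANGE_RAM, RANGE_KDUMP_RESERVED, RANGE_OTHER };

/* Hypothetical result: how the page may be mapped, if at all. */
enum oldmem_flags { OLDMEM_MAP_NONE, OLDMEM_MAP_CACHED };

struct mem_range {
	uint64_t start;		/* first byte of the range */
	uint64_t end;		/* one past the last byte */
	enum range_type type;
};

/*
 * Walk the memory map and decide how pfn may be mapped. Only pages
 * that lie entirely inside a KDUMP_RESERVED range are treated as old
 * kernel memory; holes and other range types are refused rather than
 * being mapped cacheable blindly.
 */
static enum oldmem_flags oldmem_map_flags(const struct mem_range *map,
					  size_t n, uint64_t pfn)
{
	uint64_t addr = pfn << PAGE_SHIFT;
	uint64_t page_end = addr + (1ULL << PAGE_SHIFT);
	size_t i;

	for (i = 0; i < n; i++) {
		if (addr >= map[i].start && page_end <= map[i].end)
			return map[i].type == RANGE_KDUMP_RESERVED ?
				OLDMEM_MAP_CACHED : OLDMEM_MAP_NONE;
	}
	return OLDMEM_MAP_NONE;	/* not described by the map at all */
}
```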

I think this will again be a problem with old kexec-tools. Maybe we
check for the presence of at least one "KDUMP_RESERVED" range in the
memory map. If none is present, we know old kexec-tools were used, and
in that case we can ioremap() all pfns blindly. We can do a WARN_ONCE()
and ask the user to upgrade kexec-tools, and after some time do away
with this hack in copy_oldmem_page() as well as remove saved_max_pfn.

> I also have to admit that I don't see the difference between /dev/mem
> and /dev/oldmem, as the former allows access to memory ranges outside
> the ones used by the current kernel, which is what the oldmem device
> seems to be intended to do.

I think one difference is that /dev/mem assumes that validly accessed
memory is already mapped in the kernel, while /dev/oldmem assumes it is
not mapped and creates temporary mappings explicitly.

