Re: kexec, x86: Need a new e820 type support for kexec

From: Baoquan He
Date: Tue Aug 18 2015 - 04:35:15 EST


Hi Toshi,

Sorry for the late reply.

On 08/06/15 at 07:13pm, Toshi Kani wrote:
> On Thu, 2015-08-06 at 16:12 +0800, Baoquan He wrote:
> > Hi Toshi,
> >
> > Does this patch work for you?
>
> Hi Baoquan,
>
> I have tested the patch with both E820_PMEM and E820_PRAM setups, and
> confirmed it works fine for both cases. :-) I did multiple kexec reboots
> followed by a kdump in my testing. So, please feel free to add:
>
> Tested-by: Toshi Kani <toshi.kani@xxxxxx>

Thanks for testing. I will repost with your Tested-by tag.

>
> > There are things I am not sure. When jump to kexec/kdump kernel is this
> > PMEM still needed by system?
>
> Yes, after a kexec reboot, the kernel needs to be able to use NVDIMM as
> before. While the kernel actually uses NFIT table, not e820, the range
> should be marked as PMEM for consistency. The same goes to kdump kernel
> since NVDIMM may be used as a dump device in future.
>
> > And what's the difference between PRAM and
> > PMEM? I saw in kernel commit ec776ef6 it introduced E820_PRAM for the
> > non-standard protected e820 type, then in kernel commit ad5fb870 it
> > introduced E820_PMEM for ACPI 6.0 persistent memory types. While it
> > doesn't add complete support for E820_PMEM like E820_PRAM if I
> > understand it correctly.
>
> ACPI 6.0 spec defines E820_PMEM, which is used for NVDIMM devices from now
> on. ACPI 6.0 also defines NFIT table for NVDIMM along with this type.
>
> Before these are defined in ACPI, E820_PRAM type was "unofficially" used by
> some NVDIMM devices. So, E820_PRAM was added for such legacy NVDIMMs.
> Since the E820_PRAM case is very simple (it does not have any other FW
> tables), it can be easily emulated with the "memmap=nn!ss" option. So,
> people may use the memmap option to emulate this legacy NVDIMM.

I was wrong. In fact, kexec-tools can pass memory info to the kdump
kernel in 2 ways. One is the memmap= command line, which is used only
when --pass-memmap-cmdline is specified. The other, which is the default,
stores the memory regions in the e820_map of the real mode data
structure. The 1st way is rarely used, so there is no need to worry about
the "memmap=nn!ss" option.
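
To make the memmap way concrete, here is a rough sketch of how one
range could be turned into a memmap= option. The SK_* names and the two
sketch_* helpers are made up for illustration and are not the
kexec-tools code. The separators are the ones the kernel accepts for
memmap=: '@' for RAM, '$' for reserved, '#' for ACPI and '!' for
protected memory (E820_PRAM). There is no separator for E820_PMEM, so
the sketch refuses it.

```c
#include <stdio.h>

/* Hypothetical range types, for illustration only. */
enum sketch_range {
	SK_RAM, SK_RESERVED, SK_ACPI, SK_ACPI_NVS, SK_PMEM, SK_PRAM
};

/* Return the memmap= separator for a range type, or '\0' if none exists. */
static char sketch_memmap_sep(enum sketch_range type)
{
	switch (type) {
	case SK_RAM:
		return '@';
	case SK_RESERVED:
		return '$';
	case SK_ACPI:
	case SK_ACPI_NVS:
		return '#';
	case SK_PRAM:
		return '!';
	default:
		/* SK_PMEM: cannot be expressed with memmap= */
		return '\0';
	}
}

/*
 * Format " memmap=<size>K<sep><start>K" into buf. startk and endk are
 * in KiB. Returns the number of characters written, or -1 if the type
 * has no separator, the range is empty, or buf is too small.
 */
static int sketch_format_memmap(char *buf, size_t len, enum sketch_range type,
				unsigned long startk, unsigned long endk)
{
	char sep = sketch_memmap_sep(type);
	int n;

	if (sep == '\0' || endk <= startk)
		return -1;
	n = snprintf(buf, len, " memmap=%luK%c%luK", endk - startk, sep, startk);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```

With this sketch, a 256M PRAM range starting at 1G would become
" memmap=262144K!1048576K", while a PMEM range would be rejected.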

Since the kernel's parse_memmap_one() doesn't support E820_PMEM well, I
would like to skip adding PMEM in the memmap way. So this patch is enough.
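
For the default e820_map way, the type translation could look roughly
like the sketch below. The DEMO_* names and demo_range_to_e820() are
made up for illustration and are not the kexec-tools code. The E820_*
numbers are the kernel's values, with E820_PMEM being the ACPI 6.0 type
and E820_PRAM the non-standard legacy one. Falling back to
E820_RESERVED for anything else is an assumption of this sketch.

```c
/* E820 type numbers as used by the kernel. */
#define E820_RAM	1
#define E820_RESERVED	2
#define E820_ACPI	3
#define E820_NVS	4
#define E820_PMEM	7
#define E820_PRAM	12

/* Hypothetical range types, for illustration only. */
enum demo_range {
	DEMO_RAM, DEMO_RESERVED, DEMO_ACPI, DEMO_ACPI_NVS,
	DEMO_PMEM, DEMO_PRAM
};

/* Map a range type to the E820 type stored in the e820_map entry. */
static unsigned int demo_range_to_e820(enum demo_range type)
{
	switch (type) {
	case DEMO_RAM:
		return E820_RAM;
	case DEMO_ACPI:
		return E820_ACPI;
	case DEMO_ACPI_NVS:
		return E820_NVS;
	case DEMO_PMEM:
		return E820_PMEM;
	case DEMO_PRAM:
		return E820_PRAM;
	case DEMO_RESERVED:
	default:
		/* Anything unknown is treated as reserved in this sketch. */
		return E820_RESERVED;
	}
}
```

In this way PMEM and PRAM stay distinct in the second kernel's e820
map, and no memmap= option is involved.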

>
> > In this patch I simply pass E820_PMEM to kdump
> > kernel as E820_PRAM when it emerges since kernel can parse E820_PRAM
> > only in parse_memmap_one(), otherwise E820_PMEM has to be discarded or
> > need be passed as E820_RESERVED. What do you think about this, need
> > E820_PMEM be differentiated with E820_PRAM strictly? If yes, I think a
> > kernel patch need be posted to fix this. If not, this patch is enough
> > for supporting both of them in kexec.
>
> E820_PMEM cannot be emulated by the "memmap=" option. Do you have to use
> the "memmap=" options to pass the ranges for kdump kernel? If so, I'd
> rather ignore E820_PMEM and let it be passed as E820_RESERVED. The kdump
> kernel can still obtain the info from NFIT if necessary.
>
> As for the code change...
>
> > @@ -640,6 +644,8 @@ static void cmdline_add_memmap_internal(char *cmdline, unsigned long startk,
> >  		strcat (str_mmap, "K$");
> >  	else if (type == RANGE_ACPI || type == RANGE_ACPI_NVS)
> >  		strcat (str_mmap, "K#");
> > +	else if (type == RANGE_PMEM || type == RANGE_PRAM)
> > +		strcat (str_mmap, "K!");
>
> It should only check with RANGE_PRAM, but I do not think this change matters
> much unless you also modify the caller cmdline_add_memmap(), which has the
> following check to skip other types. I do not think we will use legacy
> NVDIMM device as a dump device, so you may ignore RANGE_PRAM and let it be
> passed as RESERVED as well (which is likely the case I tested with).
>
> 	/* Only adding memory regions of RAM and ACPI */
> 	if (type != RANGE_RAM &&
> 	    type != RANGE_ACPI &&
> 	    type != RANGE_ACPI_NVS)
> 		continue;

Then, if PMEM is not added in the memmap way, we need not care about
cmdline_add_memmap() any more.

Thanks
Baoquan