RE: [PATCH] x86/e820: update code comment about e820_table_kexec
From: Chen, Yu C
Date: Mon Jan 13 2025 - 22:26:40 EST
Hi Dave,
Thanks for bringing this up,
> -----Original Message-----
> From: Dave Young <dyoung@xxxxxxxxxx>
> Sent: Friday, January 10, 2025 2:30 PM
> To: Chen, Yu C <yu.c.chen@xxxxxxxxx>
> Cc: x86@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; kexec@xxxxxxxxxxxxxxxxxxx;
> Ingo Molnar <mingo@xxxxxxxxxx>; Borislav Petkov <bp@xxxxxxxxx>
> Subject: Re: [PATCH] x86/e820: update code comment about
> e820_table_kexec
>
> > BTW, the conversation below drived me to read the e820 code:
> > https://lore.kernel.org/all/CAMj1kXG1hbiafKRyC5qM1Vj5X7x-
> dmLndqqo2AYnH
> > MRxDz-80w@xxxxxxxxxxxxxx/T/#u
> >
> > It could be better to clean up the e820 tables, anyway the comment fix
> > in this patch itself is good for now.
> >
> > Basically e820_table_firmware is used by kexec-tools kexec_load
> > implementation, e820_table_kexec is used by kexec_file_load code to
> > pass to the 2nd kernel in boot params.
> >
> > The e820_table_firmware is said to not be modified by the kernel in
> > the code comment, but this is not true, at least the sev code updates
> > the table. The hibernate code generates crc32 checksum and verifies
> > it, but since AFAIK e820 table update only happens in boot phase, it
> > will be stable on runtime. So we can just use e820_table_firmware for
> > kexec use and drop the e820_table_kexec. With the change the
> > kexec_file_load and kexec_load see the same memory ranges.
> >
> > Otherwise I thought we can use just one e820 table, dropping
> > e820_table_firmware and e820_table_kexec, but then there will be
> > fragments and memory waste due to the setup_data ranges are reserved
> > and updated in e820_table, so the e820_table_firmware being still be
> > used for kexec makes sense.
> >
> > Anyway I need to think more about it, please let me know if you have
> > any concerns.
>
> Reread the old commit 12df216c61c89e31e27e74146115a9728880ca6f
> x86/boot/e820: Introduce the bootloader provided e820_table_firmware[]
> table
>
> It seems the e820_table_firmware was changed to be exported in sysfs
> instead of e820_table_kexec, but I suspect this is wrong.
> kexec-tools (kexec_load) use the exported sysfs memmap info, but
> e820_table_firmware was created by above commit to be used by hibernation
> and the table should not be changed, the fact is there are changes happen
> from time to time.
>
This is not expected from the original introducing of e820_table_firmware, it
should not be changed by OS. I suppose the changes to e820_table_firmware
are because of the kexec requirements for /sys/firmware/memmap?
Another question is that, why does kexec_load() get the memory layout
from /sys/firmware/memmap, but kexec_file_load() relies on the in-kernel
e820_table_kexec?
> Question is does hibernation use the sysfs entries from its userspace tools?
Hibernation does not use this sysfs entries in userspace(or uswsusp )as far
as I know.
> If yes, then we should have both kexec table and firmware table exported
> because they are for different purposes and one may change, another not.
>
> If hibernation only uses the table within the kernel then it makes no sense to
> export it to sysfs, we should only export the kexec table for kexec-tools use.
> In this way both kexec_load (userspace) and kexec_file_load (kernel load) can
> use same e820 table, it will reduce the bugs and be easier to maintain.
I'm OK with not exposing e820_table_firmware to /sys/firmware/memmap, if
kexec is the only user of /sys/firmware/memmap.
Thanks,
Chenyu