Re: [PATCH] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active

From: Baoquan He
Date: Fri Sep 27 2019 - 19:56:00 EST


On 09/27/19 at 03:49pm, Eric W. Biederman wrote:
> >> In order to avoid such problem, lets occupy the first 640k region when
> >> SME is active, which will ensure that the allocated memory does not fall
> >> into the first 640k area. So, no need to worry about whether kernel can
> >> correctly copy the contents of the first 640K area to a backup region in
> >> purgatory().
>
> We must occupy part of the first 640k so that we can start up secondary
> cpus unless someone has added another way to do that in recent years on
> SME capable cpus.
>
> Further there is Fimware/BIOS interaction that happens within those
> first 640K.
>
> Furthermore the kdump kernel needs to be able to read all of the memory
> that the previous kernel could read. Otherwise we can't get a crash
> dump.
>
> So I do not think ignoring the first 640K is the correct resolution
> here.

We discussed and tried many ways to copy the first 640K of 1st kernel
out since kernel data may be allocated there, then crash need it to
parse. But SME makes the copy very difficult to do, because the first
640K is encrypted in 1st kernel, but the copy is done in purgatory with
1:1 ident-mapping and unencrypted.

Finally we decided this way as patch does. Reserving it in memblock is
not ignoring the first 640K, but lock it down to avoid any later kernel
data allocated in this area. The first 640K will be taken as system RAM
of kdump kernel as is. Like this, no available kernel information
could be located in this area, then we don't care if the copy is correct
or not.

One word, what the patch is doing is locking down the first 640K after
reserve_real_mode() invocation. Putting it after reserve_real_mode() is
because reserve_real_mode() may put real mode trampoline inside first
640K. Surely the real mode trampoline will be discarded too in kdump
kernel, since it's not important and unnecessary for crash parsing.

I think Lianbo need rewrite this patch log to make it clearer.


diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 77ea96b794bd..5bfb2c83bb6c 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1148,6 +1148,9 @@ void __init setup_arch(char **cmdline_p)

reserve_real_mode();

+ if (sme_active())
+ memblock_reserve(0, 640*1024);
+
trim_platform_memory_ranges();
trim_low_memory_range();


>
> > The log is too simple, I know you did some other tries to fix this, but
> > the patch log does not show why you can not correctly copy the 640k in
> > current kdump code, in purgatory here.
> >
> > Also this patch seems works in your test, but still to see if other
> > people can comment and see if it is safe or not, if any other risks
> > other than waste the small chunk of memory. If it is safe then kdump
> > can just drop the backup logic and use this in common code instead of
> > only do it for SME.
>
> Exactly.
>
> I think at best this avoids the symptoms, but does not give a reliable
> crash dump.
>
> Eric