Re: [PATCH 2/2] x86/numa: instance all parsed numa node
From: Thomas Gleixner
Date: Tue Jul 09 2019 - 02:13:45 EST

On Tue, 9 Jul 2019, Pingfan Liu wrote:
> On Mon, Jul 8, 2019 at 5:35 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> > It can and it does.
> >
> > That's the whole point why we bring up all CPUs in the 'nosmt' case and
> > shut the siblings down again after setting CR4.MCE. Actually that's in fact
> > a 'let's hope no MCE hits before that happened' approach, but that's all we
> > can do.
> >
> > If we don't do that then the MCE broadcast can hit a CPU which has some
> > firmware initialized state. The result can be a full system lockup, triple
> > fault etc.
> >
> > So when the MCE hits a CPU which is still in the crashed kernel lala state,
> > then all hell breaks loose.
> Thank you for the comprehensive explanation. With your guidance, I now have
> a full understanding of the issue.
>
> But when I tried to add something to enable CR4.MCE in
> crash_nmi_callback(), I realized that it is not doable in some cases (after
> a crash, we will not ask an offline SMT CPU to come online), and it is also
> unnecessary. "kexec -l/-p" takes advantage of the CPU state in the
> first kernel, where all logical CPUs have CR4.MCE=1.
>
> So kexec is exempt from this bug if the first kernel already does it.

No. If the MCE broadcast is handled by a CPU which is stuck in the old
kernel stop loop, then it will execute on the old kernel and eventually run
into the memory corruption which crashed the old one.

Thanks,
tglx
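
For illustration only, the two failure modes discussed in this thread can be
captured in a toy userspace model. This is a sketch, not kernel code: the
toy_* names, the context enum and the outcome enum are hypothetical and exist
only for this example. The only architectural fact it relies on is that
CR4.MCE is bit 6 of CR4 and that a machine check arriving with that bit clear
puts the processor into shutdown.

```c
#include <assert.h>

#define X86_CR4_MCE (1UL << 6)  /* CR4 bit 6: machine-check enable */

/* Hypothetical model: which kernel image a CPU is currently executing. */
enum cpu_ctx { CTX_NEW_KERNEL, CTX_OLD_KERNEL };

/* Outcomes ordered from best to worst so they can be compared. */
enum mce_outcome { MCE_HANDLED, MCE_RUNS_OLD_KERNEL, MCE_SHUTDOWN };

struct toy_cpu {
	unsigned long cr4;   /* 0 models firmware-initialized state */
	enum cpu_ctx ctx;
};

/* What happens when a broadcast MCE reaches one CPU in this model. */
static enum mce_outcome toy_deliver_mce(const struct toy_cpu *cpu)
{
	/* CR4.MCE clear: the CPU cannot take the #MC and shuts down. */
	if (!(cpu->cr4 & X86_CR4_MCE))
		return MCE_SHUTDOWN;

	/*
	 * CR4.MCE set, but the CPU sits in the old kernel's stop loop:
	 * the handler runs on the crashed kernel's (possibly corrupted) state.
	 */
	if (cpu->ctx == CTX_OLD_KERNEL)
		return MCE_RUNS_OLD_KERNEL;

	return MCE_HANDLED;
}

/* A broadcast hits every CPU; the worst per-CPU outcome decides the result. */
static enum mce_outcome toy_broadcast_mce(const struct toy_cpu *cpus, int n)
{
	enum mce_outcome worst = MCE_HANDLED;

	assert(n > 0);
	for (int i = 0; i < n; i++) {
		enum mce_outcome o = toy_deliver_mce(&cpus[i]);

		if (o > worst)
			worst = o;
	}
	return worst;
}
```

In this model, a kdump boot where one CPU still has CR4.MCE=1 but is parked
in the old kernel yields MCE_RUNS_OLD_KERNEL rather than MCE_HANDLED, which
is the case described above: having CR4.MCE set in the first kernel does not
make the broadcast safe.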