Re: kexec reboot fails with extra wbinvd introduced for AMD SME

From: Dave Young
Date: Wed Jan 17 2018 - 21:48:02 EST


On 01/17/18 at 06:14pm, Linus Torvalds wrote:
> On Wed, Jan 17, 2018 at 5:47 PM, Dave Young <dyoung@xxxxxxxxxx> wrote:
> >
> > It does not work with just a single wbinvd(); it only works for me
> > with the wbinvd() removed. Tom's new post works for me as well
> > since my cpu is an Intel i5-4200U.
>
> Intriguing.
>
> It's not like the wbinvd really should be that much of a deal.
>
> I think Tom's patch is fine and should be applied, but it does worry
> me a bit that even a single wbinvd makes that much of a difference for
> you. There is very little logical reason I can think of that a wbinvd
> should make any difference what-so-ever on an i5-4200U.
>
> I wonder if you have some system issues, and wbinvd just happens to
> trigger them. But I think we do wbinvd before a suspend-to-RAM too
> (it's "ACPI_FLUSH_CPU_CACHE()" in the ACPI code). And the drm code
> does "wbinvd_on_all_cpus()" which does a cross-call etc.
>
> Would you mind experimenting a bit with that wbinvd?
>
> In particular, what happens if you enable it (so it's not hidden by
> the SME check), but you move it up to before interrupts are disabled?

I did several quick tests. More testing is probably needed, but so far
the results are:

void stop_this_cpu(void *dummy)
{
=====> add wbinvd here: kexec works
	local_irq_disable();
=====> add wbinvd here: kexec works
	/*
	 * Remove this CPU:
	 */
	set_cpu_online(smp_processor_id(), false);
=====> add wbinvd here: kexec does not work

	disable_local_APIC();
	mcheck_cpu_clear(this_cpu_ptr(&cpu_info));

[snip]

So it seems that it does not work once the CPU has been marked offline.
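
To make the three placements easier to compare, here is a rough user-space
mock of the sequence. It does not execute a real wbinvd (that is a privileged
instruction) and it does not reproduce the failure; it only records which
state the CPU is in at each placement. All names (mock_cpu, mock_wbinvd,
mock_stop_this_cpu, enum wbinvd_site) are made up for illustration and are
not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three placements marked in the test above. */
enum wbinvd_site {
	WB_BEFORE_IRQ_DISABLE,	/* first marker: kexec works */
	WB_AFTER_IRQ_DISABLE,	/* second marker: kexec works */
	WB_AFTER_OFFLINE,	/* third marker: kexec does not work */
};

/* Mock per-CPU state; this is not a kernel structure. */
struct mock_cpu {
	bool irqs_enabled;
	bool online;
	bool wb_ran;		/* the mock wbinvd has executed */
	bool wb_irqs_enabled;	/* value of irqs_enabled at that moment */
	bool wb_online;		/* value of online at that moment */
};

static struct mock_cpu mock_cpu_init(void)
{
	struct mock_cpu cpu = { .irqs_enabled = true, .online = true };

	return cpu;
}

/* Stand-in for wbinvd(): only snapshots the state it was called in. */
static void mock_wbinvd(struct mock_cpu *cpu)
{
	assert(!cpu->wb_ran);	/* exactly one placement per run */
	cpu->wb_ran = true;
	cpu->wb_irqs_enabled = cpu->irqs_enabled;
	cpu->wb_online = cpu->online;
}

/* Mirrors the order of the first steps of stop_this_cpu() quoted above. */
static void mock_stop_this_cpu(struct mock_cpu *cpu, enum wbinvd_site site)
{
	if (site == WB_BEFORE_IRQ_DISABLE)
		mock_wbinvd(cpu);

	cpu->irqs_enabled = false;	/* stands in for local_irq_disable() */

	if (site == WB_AFTER_IRQ_DISABLE)
		mock_wbinvd(cpu);

	cpu->online = false;		/* stands in for set_cpu_online(..., false) */

	if (site == WB_AFTER_OFFLINE)
		mock_wbinvd(cpu);
}
```

Calling mock_stop_this_cpu() once per site shows that the second placement
(works) and the third (fails) both run with interrupts already disabled, so
the only modelled difference between them is the online bit. The mock says
nothing about why that matters on real hardware.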

>
> I'm wondering if there is some issue with MCE generation and wbinvd
> and whatever, and doing it when the CPU is down and interrupts are
> disabled causes some system issue..
>
> Does anybody have any other ideas?
>
> Linus