Re: [PATCH v3 0/3] SRF: Fix offline CPU preventing pc6 entry

From: Thomas Gleixner
Date: Tue Nov 12 2024 - 20:19:19 EST


On Tue, Nov 12 2024 at 16:43, Patryk Wlazlyn wrote:
>> There's a comment there that explains why this is done. If you don't
>> understand this, then please don't touch this code.
>
> /*
>  * Kexec is about to happen. Don't go back into mwait() as
>  * the kexec kernel might overwrite text and data including
>  * page tables and stack. So mwait() would resume when the
>  * monitor cache line is written to and then the CPU goes
>  * south due to overwritten text, page tables and stack.
>  *
>  * Note: This does _NOT_ protect against a stray MCE, NMI,
>  * SMI. They will resume execution at the instruction
>  * following the HLT instruction and run into the problem
>  * which this is trying to prevent.
>  */
>
> If you are referring to this comment above, I do understand the need to
> enter hlt loop before the kexec happens. I thought that I could bring
> all of the offlined CPUs back online, effectively getting them out of
> the mwait loop.

That does not work:

1) Regular kexec offlines them again.

2) Kexec in the panic path can't do any of that.
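
To make the two cases concrete, here is a toy userspace model. It is not
kernel code, and every name in it is hypothetical. It assumes that a CPU
offlined by the regular kexec path parks in mwait again, and that the
panic path cannot run CPU hotplug at all:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these states and helpers are hypothetical, not kernel API. */
enum cpu_state { CPU_ONLINE, CPU_OFFLINE_MWAIT, CPU_OFFLINE_HLT };

#define NR_MODEL_CPUS 4

/* The proposed workaround: bring every offlined CPU back online before kexec. */
static bool online_all(enum cpu_state *cpus, size_t n, bool in_panic)
{
	assert(cpus != NULL);

	/* Assumption: the panic path cannot do hotplug, so nothing changes. */
	if (in_panic)
		return false;

	for (size_t i = 0; i < n; i++)
		cpus[i] = CPU_ONLINE;
	return true;
}

/* Case 1: regular kexec takes the non-boot CPUs (index > 0) offline again. */
static void regular_kexec_offline(enum cpu_state *cpus, size_t n)
{
	assert(cpus != NULL);

	for (size_t i = 1; i < n; i++)
		cpus[i] = CPU_OFFLINE_MWAIT;
}

/* Count the CPUs that would still sit in mwait when the new kernel runs. */
static size_t cpus_in_mwait(const enum cpu_state *cpus, size_t n)
{
	size_t count = 0;

	assert(cpus != NULL);

	for (size_t i = 0; i < n; i++)
		if (cpus[i] == CPU_OFFLINE_MWAIT)
			count++;
	return count;
}
```

In this model, onlining everything first gains nothing in either case:
the regular path parks the non-boot CPUs in mwait again, and the panic
path never gets them out of mwait in the first place.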

Thanks,

tglx