Re: [PATCH v9 3/6] x86/sev: Disable CPU hotplug while SNP is active

From: Borislav Petkov

Date: Fri Jun 26 2026 - 12:41:49 EST


On Thu, Jun 25, 2026 at 02:42:23PM -0500, Kalra, Ashish wrote:
> Hello Boris,

Hello Ashish,

lemme try to make sense of your AI reply...

> cpu_hotplug_disable()/cpu_hotplug_enable() are refcounted (cpu_hotplug_disabled++/--,
> with a WARN on underflow), so they have to be balanced. This flag collapses them to
> exactly one outstanding disable per SNP-active window, because the disable and enable
> sites are not reached a symmetric number of times:

Well, why aren't they?

Why isn't there a simple design where hotplug is disabled on SNP init -
*exactly* one call to cpu_hotplug_disable() - and reenabled on SNP shutdown -
also exactly one call?
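
To spell out the balance I mean, here is a rough userspace model. It only
mimics the refcount behaviour described in the quoted text (increment on
disable, decrement on enable, warn on underflow). All model_* names are made
up for illustration and none of this is the actual kernel or ccp code:

```c
#include <stdio.h>

/* Userspace model only, not kernel code. */
static int hotplug_disabled;	/* models the cpu_hotplug_disabled refcount */
static int warn_count;		/* how often the underflow warning would fire */

static void model_hotplug_disable(void)
{
	hotplug_disabled++;
}

static void model_hotplug_enable(void)
{
	if (hotplug_disabled == 0) {
		/* Unbalanced enable: warn and leave the count alone. */
		warn_count++;
		fprintf(stderr, "WARN: unbalanced hotplug enable\n");
		return;
	}
	hotplug_disabled--;
}

/* Hypothetical global SNP init: exactly one disable. */
static void model_snp_global_init(void)
{
	model_hotplug_disable();
}

/* Hypothetical global SNP shutdown: exactly one enable. */
static void model_snp_global_shutdown(void)
{
	model_hotplug_enable();
}
```

With one init paired with one shutdown the count returns to zero and nothing
warns. Only a shutdown without a matching init trips the warning, and no
extra flag is needed to keep the two in step.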

I know why...

> - On firmware without SNP_X86_SHUTDOWN_SUPPORTED, __sev_snp_shutdown_locked() does not

This function is one convoluted mess which does a gazillion things. If I were
maintaining that code, I would impose a mandatory cleanup phase before new
features are added. But I have probably said that before...

And because a lot of code from your set goes into areas I maintain, I would
suggest you take the time and do that cleanup. Before that code goes
completely off the rails. And I'm willing to offer you review bandwidth and
whatever other help I can give with doing this right.

> call snp_shutdown() (it's gated on data.x86_snp_shutdown), so SNP stays enabled in
> hardware — SNP_EN stays set and hotplug stays disabled — while sev->snp_initialized is
> cleared. Re-init after that is routine, the SNP ioctls self-bracket init and shutdown
> (e.g. SNP_COMMIT, SNP_SET_CONFIG, SNP_VLEK_LOAD):

That init and teardown flow should be simplified:

You have multiple things which you need to do at different times:

- per-CPU init
- global init

- per-CPU teardown
- global teardown

CPU hotplug toggling belongs to the global category. Instead of piling more
stuff onto that __sev_snp_shutdown_locked() function, you should take some
time to clean it up, analyze what goes where and then simplify that flow.
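
Roughly the shape I have in mind, as a userspace sketch. Every name below is
hypothetical, the ordering is only one plausible choice, and how firmware
without shutdown support fits in is deliberately not modeled:

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Userspace model only, not kernel code. */
static int hotplug_disable_depth;		/* stand-in for the hotplug refcount */
static bool cpu_snp_ready[MODEL_NR_CPUS];	/* per-CPU SNP state */

/* Per-CPU phases touch only per-CPU state. */
static void model_percpu_init(int cpu)
{
	cpu_snp_ready[cpu] = true;
}

static void model_percpu_teardown(int cpu)
{
	cpu_snp_ready[cpu] = false;
}

/* Global phases own the hotplug toggle: one disable, one enable. */
static void model_global_init(void)
{
	hotplug_disable_depth++;
}

static void model_global_teardown(void)
{
	hotplug_disable_depth--;
}

/* Init: global first so the CPU set is stable, then each CPU. */
static void model_snp_init(void)
{
	int cpu;

	model_global_init();
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_percpu_init(cpu);
}

/* Shutdown mirrors init in reverse: each CPU, then global. */
static void model_snp_shutdown(void)
{
	int cpu;

	for (cpu = MODEL_NR_CPUS - 1; cpu >= 0; cpu--)
		model_percpu_teardown(cpu);
	model_global_teardown();
}
```

The point is only that each of the four phases gets its own function and the
hotplug toggle lives in exactly one place on each side.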

So let's clean stuff up first, please, analyze the flow and determine what
goes where and then do it. Not bolt more stuff onto what is already wobbly.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette