Re: [PATCH v9 3/6] x86/sev: Disable CPU hotplug while SNP is active
From: K Prateek Nayak
Date: Thu Jun 25 2026 - 18:16:58 EST
Hello Ashish,
On 6/26/2026 1:12 AM, Kalra, Ashish wrote:
> Hello Boris,
>
> On 6/25/2026 10:02 AM, Borislav Petkov wrote:
>> On Wed, Jun 24, 2026 at 09:56:49PM +0000, Ashish Kalra wrote:
>>> +/* Set while SNP has CPU hotplug disabled (kernel-lifetime; survives ccp reload). */
>>> +static bool snp_cpu_hotplug_disabled;
>>
>> Do you really need this?
>>
>
> Yes.
>
> cpu_hotplug_disable()/cpu_hotplug_enable() are refcounted (cpu_hotplug_disabled++/--,
> with a WARN on underflow), so they have to be balanced. This flag collapses them to
> exactly one outstanding disable per SNP-active window, because the disable and enable
> sites are not reached a symmetric number of times:
> - On firmware without SNP_X86_SHUTDOWN_SUPPORTED, __sev_snp_shutdown_locked() does not
> call snp_shutdown() (it's gated on data.x86_snp_shutdown), so SNP stays enabled in
> hardware — SNP_EN stays set and hotplug stays disabled — while sev->snp_initialized is
> cleared. Re-init after that is routine, the SNP ioctls self-bracket init and shutdown
> (e.g. SNP_COMMIT, SNP_SET_CONFIG, SNP_VLEK_LOAD):
>
>     if (!sev->snp_initialized)
>         snp_move_to_init_state(...);  /* -> __sev_snp_init_locked -> snp_prepare() */
>     ... SNP_CMD ...
>     if (shutdown_required)
>         __sev_snp_shutdown_locked(...);
>
> So whenever SNP isn't already initialized (psp_init_on_probe off, or after a prior
> legacy shutdown), every such ioctl does init -> command -> legacy shutdown. Each init
> reaches snp_prepare() with SNP_EN already set, and the disable now sits at the top of
> snp_prepare(), so it fires on every cycle. Without this flag that keeps bumping
> cpu_hotplug_disabled while the legacy shutdown never re-enables — hotplug ends up stuck
> disabled. This flag makes all but the first disable a no-op.
>
> - Also, importantly, kvm-amd module reload on legacy firmware is the same pattern:
> unload leaves SNP_EN set, reload re-inits.

Looking at snp_prepare(), we have an early bailout:

	rdmsrq(MSR_AMD64_SYSCFG, val);
	if (val & MSR_AMD64_SYSCFG_SNP_EN)
		return;

Does executing the SHUTDOWN command lead to the firmware clearing SNP_EN
in SYSCFG on all CPUs?

If SNP_EN remains set (and Linux can't clear it since it is a
"Write-1-only" bit), then a subsequent snp_prepare() will skip setting
SYSCFG if it sees SNP_EN on the local CPU.

It can so happen that we enable hotplug at shutdown, CPUs come online
without setting SNP_EN in SYSCFG, a subsequent snp_prepare() runs on a
CPU where SNP_EN is still set and skips configuring it for the CPUs that
don't have it set, and we'll be in a pickle still.

The comment above that bailout saying "this can happen in case of kexec
boot" makes me believe that SNP_EN remains set until a full system
reset.

The only safe way to do this is to ensure all possible CPUs are online
during snp_prepare() and do snp_enable() regardless of whether the local
CPU has SNP_EN or not.

Am I missing something?
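
To make the scenario concrete, here is a small userspace toy model. It is
not kernel code: every toy_* name is made up for illustration, and it
assumes the very thing I'm asking about, namely that SNP_EN stays set on
a CPU after SHUTDOWN. With that assumption, the early bailout leaves a
late-onlined CPU without SNP_EN, while unconditionally setting the bit
on every online CPU does not:

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Toy state: one sticky SNP_EN bit and one online bit per CPU. */
static bool toy_snp_en[TOY_NR_CPUS];
static bool toy_online[TOY_NR_CPUS];

/* Mimics the early bailout: do nothing if the local CPU already has SNP_EN. */
static void toy_prepare_bailout(int local_cpu)
{
	assert(toy_online[local_cpu]);
	if (toy_snp_en[local_cpu])
		return;
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_online[cpu])
			toy_snp_en[cpu] = true;
}

/* Alternative: set the write-1-only bit on every online CPU, always. */
static void toy_prepare_all(int local_cpu)
{
	assert(toy_online[local_cpu]);
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_online[cpu])
			toy_snp_en[cpu] = true;
}

/* Count online CPUs that are still missing SNP_EN. */
static int toy_unconfigured_cpus(void)
{
	int n = 0;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_online[cpu] && !toy_snp_en[cpu])
			n++;
	return n;
}

/*
 * CPU 3 is offline during the first init and is onlined once hotplug is
 * re-enabled at shutdown. Assumption: shutdown does not clear SNP_EN.
 */
static int toy_scenario(bool use_bailout)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		toy_online[cpu] = (cpu != 3);
		toy_snp_en[cpu] = false;
	}

	/* First init on CPU 0: CPUs 0-2 get SNP_EN. */
	if (use_bailout)
		toy_prepare_bailout(0);
	else
		toy_prepare_all(0);

	/* Shutdown re-enables hotplug; CPU 3 comes online without SNP_EN. */
	toy_online[3] = true;

	/* Re-init on CPU 0, where SNP_EN is still set. */
	if (use_bailout)
		toy_prepare_bailout(0);
	else
		toy_prepare_all(0);

	return toy_unconfigured_cpus();
}
```

In this model toy_scenario(true) leaves one online CPU without SNP_EN and
toy_scenario(false) leaves none. If the firmware does clear SNP_EN
everywhere on SHUTDOWN, the model's assumption is wrong and the concern
goes away.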
>
> - On the enable side it avoids an unbalanced cpu_hotplug_enable() when the teardown/failure
> paths run without an outstanding disable (e.g. shutdown of a never-fully-initialized SNP).
>
> So it's not redundant with cpu_hotplug_disabled — it tracks whether the outstanding disable
> belongs to this SNP-active window in this kernel, which keeps the single disable/enable
> balanced across the asymmetric legacy-vs-full SNP teardown paths and re-init.
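
For what it's worth, the counting argument itself is easy to check in a
userspace toy model. This is only a sketch: the toy_* names are made up,
toy_hotplug_disabled stands in for cpu_hotplug_disabled, toy_snp_flag
stands in for snp_cpu_hotplug_disabled, and the "legacy" cycle is assumed
to disable on every init and never re-enable, as described above:

```c
#include <assert.h>
#include <stdbool.h>

static int toy_hotplug_disabled;	/* stand-in for cpu_hotplug_disabled */
static bool toy_snp_flag;		/* stand-in for snp_cpu_hotplug_disabled */

static void toy_cpu_hotplug_disable(void)
{
	toy_hotplug_disabled++;
}

static void toy_cpu_hotplug_enable(void)
{
	/* The kernel WARNs on underflow; the toy model asserts instead. */
	assert(toy_hotplug_disabled > 0);
	toy_hotplug_disabled--;
}

/* Disable site: with the flag, only the first call takes a reference. */
static void toy_snp_disable_hotplug(bool use_flag)
{
	if (use_flag && toy_snp_flag)
		return;
	toy_cpu_hotplug_disable();
	toy_snp_flag = true;
}

/* Enable site: a no-op unless this SNP window holds the disable. */
static void toy_snp_enable_hotplug(void)
{
	if (!toy_snp_flag)
		return;
	toy_cpu_hotplug_enable();
	toy_snp_flag = false;
}

/* n init -> command -> legacy shutdown cycles; shutdown never re-enables. */
static int toy_legacy_cycles(int n, bool use_flag)
{
	toy_hotplug_disabled = 0;
	toy_snp_flag = false;
	for (int i = 0; i < n; i++)
		toy_snp_disable_hotplug(use_flag);
	return toy_hotplug_disabled;
}
```

Three legacy cycles leave the count at 3 without the flag and at 1 with
it, and a later toy_snp_enable_hotplug() brings it back to 0 exactly
once; calling it again is a no-op rather than an underflow.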
--
Thanks and Regards,
Prateek