Re: [PATCH v8 4/7] x86/sev: Add support to perform RMP optimizations asynchronously

From: Kalra, Ashish

Date: Tue Jun 16 2026 - 15:56:38 EST


Hello Prateek,

On 6/16/2026 2:27 AM, K Prateek Nayak wrote:
> Hello Ashish,
>
> On 6/16/2026 1:19 AM, Ashish Kalra wrote:
>> + /*
>> + * RMPOPT scans the RMP table, stores the result of the scan in the
>> + * reserved processor memory. The RMP scan is the most expensive
>> + * part. If a second RMPOPT occurs, it can skip the expensive scan
>> + * if they can see a cached result in the reserved processor memory.
>> + *
>> + * Do RMPOPT on one CPU alone. Then, follow that up with RMPOPT
>> + * on every other primary thread. Followers are "designed to"
>> + * skip the scan if they see the "cached" scan results.
>> + */
>> + cpumask_copy(follower_mask, &rmpopt_cpumask);
>
> rmpopt_cpumask is constructed after hotplug is disabled but ...
>
>> +
>> + /*
>> + * Pin the worker to the current CPU for the leader loop so that
>> + * this_cpu remains valid and the RMPOPT instruction executes on
>> + * the correct CPU.
>> + *
>> + * Use migrate_disable() rather than get_cpu() to prevent
>> + * migration while still allowing preemption.
>> + */
>> + migrate_disable();
>> + this_cpu = smp_processor_id();
>> +
>> + if (cpumask_test_cpu(this_cpu, follower_mask)) {
>> + /*
>> + * Current CPU is a primary thread in rmpopt_cpumask.
>> + * Run leader locally and remove from follower mask.
>> + */
>> + cpumask_clear_cpu(this_cpu, follower_mask);
>> +
>> + for (pa = rmpopt_pa_start; pa < rmpopt_pa_end; pa += SZ_1G) {
>> + rmpopt(pa);
>> + cond_resched();
>> + }
>> + } else if (cpumask_intersects(topology_sibling_cpumask(this_cpu),
>> + follower_mask)) {
>> + /*
>> + * Current CPU is a sibling thread whose primary is in
>> + * rmpopt_cpumask. RMPOPT_BASE MSR is per-core, so it
>> + * is safe to run the leader locally. Remove the sibling's
>> + * primary from the follower mask as this core is already
>> + * covered by the leader.
>> + */
>> + cpumask_andnot(follower_mask, follower_mask,
>> + topology_sibling_cpumask(this_cpu));
>> +
>> + for (pa = rmpopt_pa_start; pa < rmpopt_pa_end; pa += SZ_1G) {
>> + rmpopt(pa);
>> + cond_resched();
>> + }
>> + } else {
>> + /*
>> + * Current CPU does not have RMPOPT_BASE MSR programmed.
>> + * Pick an explicit leader from the cpumask to avoid #UD.
>> + * Use work_on_cpu() to run in process context on the leader,
>> + * avoiding IPI latency.
>> + */
>
> ... this_cpu is neither in the "rmpopt_cpumask", nor is any of its
> siblings on "rmpopt_cpumask".
>
> How does that happen?

Actually, this was the implementation before the CPU hotplug disable enforcement code was implemented and added in v8,
and i should have fixed this rmpopt_work_handler() accordingly for v8.

With the enforced cpu hotplug disable support, case #3 here (above) is now dead code, and removing it lets
cases #1 and #2 collapse too.

snp_prepare() requires cpu_online_mask == cpu_present_mask before SNP init — so when snp_setup_rmpopt() programs the MSRs, every
core's primary is online -> every core is in rmpopt_cpumask.

So now the work handler always runs on a CPU whose core is programmed. topology_sibling_cpumask(this_cpu) therefore always intersects
rmpopt_cpumask -> case #1 or #2 always matches.

So i should actually drop case #3 here - which is: "this_cpu is neither in the "rmpopt_cpumask", nor is any of its
siblings on rmpopt_cpumask"


>
>> + int leader_cpu = cpumask_first(follower_mask);
>> +
>> + if (WARN_ON_ONCE(leader_cpu >= nr_cpu_ids)) {
>> + migrate_enable();
>> + goto out;
>> + }
>> +
>> + cpumask_clear_cpu(leader_cpu, follower_mask);
>> +
>> + /* Release migration pin before work_on_cpu(). */
>> + migrate_enable();
>> +
>> + work_on_cpu(leader_cpu, rmpopt_leader_fn, NULL);
>
> This creates a delayed work and also waits for it to finish execution
> which will add more latency than a simple IPI if the comment about IPI
> latency above is accurate.
>
> I think there is some corner case in construction of the
> "rmpopt_cpumask" that requires this not-so-pretty else block. Can you
> elaborate why this is required?
>
> Perhaps the "rmpopt_cpumask" construction needs:
>
> for_each_online_cpu(cpu) {
> /* Nominate the first CPU on the sibling mask for RMPOPT */
> if (cpu != cpumask_first(topology_sibling_cpumask(cpu)))
> continue;
> cpumask_set_cpu(cpu, &rmpopt_cpumask);
> }
>
>
> and all you need here is:
>
> /* Do RMPOPt for local core */
> for (pa = rmpopt_pa_start; pa < rmpopt_pa_end; pa += SZ_1G)
> rmpopt(pa);
>
> /* Skip this core from concurrent RMPOPT */
> cpumask_and_not(follower_mask, &rmpopt_cpumask, topology_sibling_cpumask(cpu));
>
> No?
>

Yes, a simpler implementation will be like this:
...

if (!alloc_cpumask_var(&follower_mask, GFP_KERNEL))
return;

cpumask_copy(follower_mask, &rmpopt_cpumask);

/*
* The current CPU's core always has RMPOPT_BASE programmed
* (snp_prepare() required all CPUs online at setup and CPU hotplug
* is disabled while SNP is active), so it can always be the leader.
* RMPOPT_BASE is per-core; exclude this core from the followers.
*/
migrate_disable();
cpumask_andnot(follower_mask, follower_mask,
topology_sibling_cpumask(smp_processor_id()));

for (pa = rmpopt_pa_start; pa < rmpopt_pa_end; pa += SZ_1G) {
rmpopt(pa);
cond_resched();
}
migrate_enable();

cpus_read_lock();
for (pa = rmpopt_pa_start; pa < rmpopt_pa_end; pa += SZ_1G) {
on_each_cpu_mask(follower_mask, rmpopt_smp, (void *)pa, true);
cond_resched();
}
cpus_read_unlock();

free_cpumask_var(follower_mask);


Here, the leader exclusion must use the sibling mask, not clear_cpu(this_cpu). That's why my collapsed version uses:

cpumask_andnot(follower_mask, follower_mask,
topology_sibling_cpumask(smp_processor_id()));

- If this_cpu is a primary: its sibling mask contains itself (the primary) -> andnot removes this core's primary from the followers.

- If this_cpu is a secondary: it isn't in follower_mask at all, but its sibling mask contains its primary, which is in
follower_mask -> andnot still removes this core's primary.

So either way the current core is dropped from the followers. (The old code needed two branches because case #1 used
clear_cpu(this_cpu) — only correct when this_cpu is the primary — while case #2 used the sibling andnot. The single andnot works for
both cases).

Thanks,
Ashish

>> + goto followers;
>> + }
>> +
>> + migrate_enable();
>> +