Re: [PATCH v3 3/6] x86/sev: Add support to perform RMP optimizations asynchronously
From: Kalra, Ashish
Date: Wed Apr 01 2026 - 12:09:11 EST
Hello Dave,
On 3/30/2026 7:46 PM, Kalra, Ashish wrote:
>
> On 3/30/2026 6:22 PM, Dave Hansen wrote:
>>
>>> static __init void configure_and_enable_rmpopt(void)
>>> {
>>> phys_addr_t pa_start = ALIGN_DOWN(PFN_PHYS(min_low_pfn), SZ_1G);
>>> @@ -499,6 +582,37 @@ static __init void configure_and_enable_rmpopt(void)
>>> */
>>> for_each_online_cpu(cpu)
>>> wrmsrq_on_cpu(cpu, MSR_AMD64_RMPOPT_BASE, rmpopt_base);
>>
>> What is the scope of MSR_AMD64_RMPOPT_BASE? Can you have it enabled on
>> one thread and not the other? Could they be different values both for
>> enabling and the rmpopt_base value?
>>
>> If it's not per-thread, then why is it being initialized for each thread?
>>
>
> Only one logical thread per core needs to set RMPOPT_BASE MSR as it is per-core,
> so i will use the "primary_threads_cpumask" here to use it for programming this
> MSR.
>
> Just another reason, to set the "primary_threads_cpumask" here in this function
> and then re-use it for the RMPOPT worker.
>
Coming back to this ...
To use the "primary_thread_cpumask", I will need something like
on_each_cpu_mask(), similar to what I was doing in v2.
In v2, I was programming the RMPOPT_BASE MSR using on_each_cpu_mask(),
which required a callback function to do the WRMSR:
+static void __configure_rmpopt(void *val)
+{
+ u64 rmpopt_base = ((u64)val & PUD_MASK) | MSR_AMD64_RMPOPT_ENABLE;
+
+ wrmsrq(MSR_AMD64_RMPOPT_BASE, rmpopt_base);
+}
+
+ on_each_cpu_mask(cpu_online_mask, __configure_rmpopt, (void *)pa_start, true);
But that required the (void *) cast, which you objected to, and you
suggested using for_each_online_cpu() and wrmsrq_on_cpu() instead. I had replied
that I need to do it (only) once, on one thread per core, which is why I may need
to use on_each_cpu_mask(), and you then suggested that if I *need* the performance
I can implement/add something like wrmsrq_on_cpumask().
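For reference, one way to keep the callback style without casting the value
into a (void *) is to compute the MSR value once and pass a pointer to it.
The following is only a userspace model of that idea, not kernel code: every
sketch_* name is hypothetical, the mask walker merely stands in for
on_each_cpu_mask(), the array stands in for the per-CPU MSR, and the enable
bit position is an assumption made purely for illustration.

```c
#include <stdint.h>

/* 1G alignment as implied by PUD_MASK on x86-64 (PUD_SHIFT == 30). */
#define SKETCH_PUD_SHIFT      30
#define SKETCH_PUD_MASK       (~((UINT64_C(1) << SKETCH_PUD_SHIFT) - 1))
/* Assumption: the real MSR_AMD64_RMPOPT_ENABLE bit may differ. */
#define SKETCH_RMPOPT_ENABLE  UINT64_C(1)
#define SKETCH_NR_CPUS        8

/* Stand-in for the RMPOPT_BASE MSR of each CPU in this model. */
static uint64_t sketch_msr[SKETCH_NR_CPUS];
static int sketch_current_cpu;

/* Stand-in for wrmsrq(): writes the "MSR" of the CPU the callback runs on. */
static void sketch_wrmsrq(uint64_t val)
{
	sketch_msr[sketch_current_cpu] = val;
}

/* Stand-in for on_each_cpu_mask(): runs func(info) for every set mask bit. */
static void sketch_on_each_cpu_mask(uint32_t mask, void (*func)(void *),
				    void *info)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (mask & (UINT32_C(1) << cpu)) {
			sketch_current_cpu = cpu;
			func(info);
		}
	}
}

/* Callback: info points at the precomputed value, so no integer cast. */
static void sketch_configure_rmpopt(void *info)
{
	const uint64_t *rmpopt_base = info;

	sketch_wrmsrq(*rmpopt_base);
}

/* Compute the value once, then hand out a pointer to it. */
static void sketch_program_rmpopt_base(uint32_t mask, uint64_t pa_start)
{
	uint64_t rmpopt_base = (pa_start & SKETCH_PUD_MASK) |
			       SKETCH_RMPOPT_ENABLE;

	sketch_on_each_cpu_mask(mask, sketch_configure_rmpopt, &rmpopt_base);
}
```

Pointing at a stack variable is only reasonable here because the walker is
synchronous, which is what wait == true gives in the v2 call above. With a
mask of 0x55 the model writes only CPUs 0, 2, 4 and 6.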
For programming the RMPOPT_BASE MSR, performance is not really as important as
it is for issuing the RMPOPT instruction on only one thread per core. And since we
are programming the RMPOPT_BASE MSRs on all CPUs/threads with the same (starting)
physical address, to support RMP optimizations for all RAM up to 2TB, I don't
think it is critical to implement wrmsrq_on_cpumask(); instead we can continue
to program the RMPOPT_BASE MSR on all CPUs (threads).
Thanks,
Ashish