[PATCH RFC 2/7] x86/microcode: Introduce staging option to reduce late-loading latency
From: Chang S. Bae
Date: Tue Oct 01 2024 - 12:19:46 EST
As microcode patch sizes continue to grow, the latency during
late-loading can spike, leading to timeouts and interruptions in running
workloads. This trend of increasing patch sizes is expected to continue,
so a foundational solution is needed to address the issue.
To mitigate the problem, a new staging feature is introduced. This option
processes most of the microcode update (excluding activation) on a
non-critical path, allowing CPUs to remain operational during the
majority of the update. By moving most of the work off the critical path,
the latency spike can be significantly reduced.
Integrate the staging process as an additional step in the late-loading
flow. Introduce a new callback for staging, which is invoked after the
microcode patch image is prepared but before entering the CPU rendezvous
for triggering the update.
Staging follows an opportunistic model: it is attempted when available.
If successful, it reduces CPU rendezvous time; if not, the process falls
back to the legacy loading, potentially exposing the system to higher
latency.
Extend struct microcode_ops to incorporate staging properties, which will
be updated in the vendor code from subsequent patches.
Signed-off-by: Chang S. Bae <chang.seok.bae@xxxxxxxxx>
---
Whether staging should be mandatory is a policy decision that is beyond
the scope of this patch at the moment. For now, the focus is on
establishing a basic flow, with the intention of attracting focused
reviews, while deferring the discussion on staging policy later.
In terms of the flow, an alternative approach could be to integrate
staging as part of microcode preparation on the vendor code side.
However, this was deemed too implicit, as staging involves loading and
validating the microcode image, which differs from typical microcode file
handling.
---
arch/x86/kernel/cpu/microcode/core.c | 12 ++++++++++--
arch/x86/kernel/cpu/microcode/internal.h | 4 +++-
2 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index b3658d11e7b6..4ddb5ba42f3f 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -676,19 +676,27 @@ static bool setup_cpus(void)
static int load_late_locked(void)
{
+ bool is_safe = false;
+
if (!setup_cpus())
return -EBUSY;
switch (microcode_ops->request_microcode_fw(0, µcode_pdev->dev)) {
case UCODE_NEW:
- return load_late_stop_cpus(false);
+ break;
case UCODE_NEW_SAFE:
- return load_late_stop_cpus(true);
+ is_safe = true;
+ break;
case UCODE_NFOUND:
return -ENOENT;
default:
return -EBADFD;
}
+
+ if (microcode_ops->use_staging)
+ microcode_ops->staging_microcode();
+
+ return load_late_stop_cpus(is_safe);
}
static ssize_t reload_store(struct device *dev,
diff --git a/arch/x86/kernel/cpu/microcode/internal.h b/arch/x86/kernel/cpu/microcode/internal.h
index 21776c529fa9..cb58e83e4934 100644
--- a/arch/x86/kernel/cpu/microcode/internal.h
+++ b/arch/x86/kernel/cpu/microcode/internal.h
@@ -31,10 +31,12 @@ struct microcode_ops {
* See also the "Synchronization" section in microcode_core.c.
*/
enum ucode_state (*apply_microcode)(int cpu);
+ void (*staging_microcode)(void);
int (*collect_cpu_info)(int cpu, struct cpu_signature *csig);
void (*finalize_late_load)(int result);
unsigned int nmi_safe : 1,
- use_nmi : 1;
+ use_nmi : 1,
+ use_staging : 1;
};
struct early_load_data {
--
2.43.0