Re: [PATCH 3/8] riscv: Add support for srmcfg CSR from Ssqosid extension
From: sashiko-bot
Date: Fri Jun 19 2026 - 14:40:49 EST
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Medium] The CPU PM exit notifier invalidates the software CSR cache but fails to eagerly restore the hardware CSR_SRMCFG, leaving the currently executing task running with incorrect QoS IDs during system resume.
- [Medium] The CPU hotplug teardown callback's cache invalidation is prematurely overwritten by the idle task context switch, leaving the hardware CSR out of sync when the CPU comes back online.
--
commit 083587978f6f45cace44a919e47c9276b0701827
Author: Drew Fustini <fustini@xxxxxxxxxx>
riscv: Add support for srmcfg CSR from Ssqosid extension
This commit adds support for the srmcfg CSR defined in the Ssqosid ISA
extension. It allows setting the RCID and MCID for each task on context
switch to associate resource usage with the running task, and caches the
CSR state per-cpu to avoid redundant writes.
Link: https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0
> diff --git a/arch/riscv/kernel/qos.c b/arch/riscv/kernel/qos.c
> index 0000000000000..42f1ff9b219d9
> --- /dev/null
> +++ b/arch/riscv/kernel/qos.c
[ ... ]
> +static int riscv_srmcfg_pm_notify(struct notifier_block *nb,
> + unsigned long action, void *unused)
> +{
> + switch (action) {
> + case CPU_PM_EXIT:
> + case CPU_PM_ENTER_FAILED:
> + __this_cpu_write(cpu_srmcfg, U32_MAX);
> + break;
[Severity: Medium]
Will the currently executing task run with an incorrect hardware QoS ID during
resume?
When the system resumes from deep idle or suspend, the hardware CSR_SRMCFG
is reset to an implementation-defined state. While this invalidates the
software cache, it doesn't immediately reprogram the hardware CSR.
Because there is no immediate context switch, does this leave the currently
running task executing with the reset hardware state until it is eventually
scheduled out?
Should this explicitly call __switch_to_srmcfg(current) to eagerly restore
the hardware state?
> + }
> + return NOTIFY_OK;
> +}
[ ... ]
> +static int __init riscv_srmcfg_init(void)
> +{
> + int err;
> +
> + if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SSQOSID))
> + return 0;
> +
> + /*
> + * cpuhp_setup_state() invokes the startup callback locally on every
> + * already-online CPU, so no separate seed loop is needed here.
> + */
> + err = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "riscv/srmcfg:online",
> + riscv_srmcfg_reset_cache, riscv_srmcfg_reset_cache);
[Severity: Medium]
Is it possible the teardown callback's cache invalidation gets prematurely
overwritten by the idle task context switch?
The teardown callback runs at CPUHP_AP_ONLINE_DYN during CPU offline, which
sets the cache to U32_MAX. However, after this callback finishes, the dying
CPU must still context switch to the idle task to halt.
Will this context switch trigger __switch_to_srmcfg(idle_task), overwriting
the U32_MAX sentinel before the CPU powers off?
If so, when the CPU comes back online with a reset hardware CSR, and
schedules its first task, could it match the stale cache and skip the CSR
write, leaving the hardware CSR out of sync?
> + if (err < 0) {
> + pr_warn("srmcfg cpuhp registration failed (%d), cpus brought online after boot will not invalidate the CSR_SRMCFG cache\n",
> + err);
> + return err;
> + }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260619-dfustini-atl-sc-cbqri-dt-v1-0-e79a7723fab0@xxxxxxxxxx?part=3