Re: [PATCH v2 3/8] riscv: Add support for srmcfg CSR from Ssqosid extension
From: Drew Fustini
Date: Thu Jun 25 2026 - 20:56:09 EST
On Thu, Jun 25, 2026 at 01:50:06AM +0000, sashiko-bot@xxxxxxxxxx wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> - [High] CPU hotplug cache invalidation is registered at the wrong state (CPUHP_AP_ONLINE_DYN), allowing the srmcfg cache sentinel to be overwritten before the CPU goes offline.
> --
>
> commit d51eb8c4a6c5a57f238cc53688229c08ea6418e1
> Author: Drew Fustini <fustini@xxxxxxxxxx>
>
> riscv: Add support for srmcfg CSR from Ssqosid extension
>
> This commit adds support for the srmcfg CSR defined in the Ssqosid ISA
> extension. It introduces a per-cpu cache to avoid redundant CSR writes
> during context switches and uses CPU hotplug and PM notifiers to manage
> cache invalidation.
>
> Link: https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0
>
> > diff --git a/arch/riscv/kernel/qos.c b/arch/riscv/kernel/qos.c
> > new file mode 100644
> > index 0000000000000..ea33201a43f61
> > --- /dev/null
> > +++ b/arch/riscv/kernel/qos.c
>
> [ ... ]
>
> > +static int riscv_srmcfg_reset_cache(unsigned int cpu)
> > +{
> > + per_cpu(cpu_srmcfg, cpu) = U32_MAX;
> > + return 0;
> > +}
>
> [ ... ]
>
> > +static int __init riscv_srmcfg_init(void)
> > +{
> > + int err;
> > +
> > + if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SSQOSID))
> > + return 0;
> > +
> > + /*
> > + * cpuhp_setup_state() invokes the startup callback locally on every
> > + * already-online CPU, so no separate seed loop is needed here.
> > + */
> > + err = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "riscv/srmcfg:online",
> > + riscv_srmcfg_reset_cache, riscv_srmcfg_reset_cache);
>
> [Severity: High]
> Does registering the hotplug callbacks at CPUHP_AP_ONLINE_DYN run too early
> during teardown and too late during startup?
>
> Since CPUHP_AP_ONLINE_DYN teardown runs while the CPU is still schedulable,
> could the U32_MAX sentinel set by riscv_srmcfg_reset_cache() be overwritten
> by subsequent context switches (such as switching to the idle thread) before
> the CPU actually goes offline?

It is correct that the U32_MAX sentinel can be overwritten by the idle
task or per-cpu kthreads that run while the CPU is dying. This is not a
problem in practice because they have thread.srmcfg == 0, which resolves
to cpu_srmcfg_default.

I will update the riscv_srmcfg_reset_cache() comment to clarify that the
teardown callback is not relied on.

> If a CPU is offlined and then onlined, the stale cache would persist while
> the hardware CSR has been reset. Because the CPU becomes schedulable before
> the CPUHP_AP_ONLINE_DYN startup callback runs, a task matching the stale
> cache might skip the required CSR write and run with incorrect hardware tags.

A normal task cannot run on the CPU before the startup callback re-arms
the sentinel: riscv_srmcfg_reset_cache() writes cpu_srmcfg = U32_MAX
before set_cpu_active(). Only the idle task and per-cpu kthreads run in
the window before that, and they carry thread.srmcfg == 0, so they take
the CPU default rcid and mcid, not stale ones.
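
To make the ordering argument concrete, here is a rough userspace sketch.
It is not the patch code: the model_* names, the struct layout, the
example values and the switch-path logic (0 selects the default, and the
CSR write is skipped when the cache already matches) are assumptions for
illustration only, as is the hardware resetting the CSR to 0 across an
offline/online cycle.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sentinel that matches no real srmcfg value (U32_MAX in the patch). */
#define SRMCFG_INVALID UINT32_MAX

/* Hypothetical model of one CPU; not the kernel's data layout. */
struct cpu_model {
	uint32_t csr;        /* simulated srmcfg hardware register */
	uint32_t cache;      /* simulated per-cpu cpu_srmcfg cache */
	uint32_t def;        /* simulated cpu_srmcfg_default */
	unsigned int writes; /* number of simulated CSR writes */
};

/* Models riscv_srmcfg_reset_cache(): re-arm the sentinel. */
static void model_reset_cache(struct cpu_model *c)
{
	c->cache = SRMCFG_INVALID;
}

/* Assumed hardware behaviour: the CSR is reset while the CPU is offline. */
static void model_hw_reset(struct cpu_model *c)
{
	c->csr = 0;
}

/*
 * Assumed context-switch path: thread_srmcfg == 0 selects the CPU
 * default, and the CSR write is skipped when the cache already holds
 * the wanted value. Returns true if the CSR was written.
 */
static bool model_switch_to(struct cpu_model *c, uint32_t thread_srmcfg)
{
	uint32_t want = thread_srmcfg ? thread_srmcfg : c->def;

	if (want == c->cache)
		return false;

	c->csr = want;
	c->cache = want;
	c->writes++;
	return true;
}
```

In this model, calling model_switch_to() with 0 after model_reset_cache()
overwrites the sentinel with the default, which is the teardown case
above. Calling model_reset_cache() again on bringup guarantees that the
next model_switch_to() writes the CSR, whatever value the cache held
before the CPU went offline.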
Drew