Re: [PATCH RFC v5 03/18] riscv: add support for srmcfg CSR from Ssqosid extension
From: Drew Fustini
Date: Sun May 24 2026 - 23:55:30 EST
On Mon, May 25, 2026 at 12:30:07AM +0000, sashiko-bot@xxxxxxxxxx wrote:
> > +static int riscv_srmcfg_pm_notify(struct notifier_block *nb,
> > + unsigned long action, void *unused)
> > +{
> > + switch (action) {
> > + case CPU_PM_EXIT:
> > + case CPU_PM_ENTER_FAILED:
> > + __this_cpu_write(cpu_srmcfg, U32_MAX);
> > + break;
> > + }
> > + return NOTIFY_OK;
> > +}
>
> [Severity: High]
> Does this leave the currently executing task with an incorrect hardware CSR
> state?
>
> While the software cache is invalidated by setting cpu_srmcfg to U32_MAX,
> the hardware CSR_SRMCFG register is not explicitly reprogrammed during
> CPU_PM_EXIT.
>
> Since the hardware state may be lost or reset to an implementation-defined
> value upon resume, wouldn't the task running at resume time continue executing
> with the wrong QoS tag until it is eventually preempted?
This is not a problem. cache invalidation is sufficient. CSR write here
would be setting a value that __switch_to_srmcfg() will re-write moments
later.
> > +static int __init riscv_srmcfg_init(void)
> > +{
> > + int err;
> > +
> > + if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SSQOSID))
> > + return 0;
> > +
> > + /*
> > + * cpuhp_setup_state() invokes the startup callback locally on every
> > + * already-online CPU, so no separate seed loop is needed here.
> > + */
> > + err = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "riscv/srmcfg:online",
> > + riscv_srmcfg_online, NULL);
>
> [Severity: High]
> Should a teardown callback be provided to cpuhp_setup_state() to invalidate
> the cpu_srmcfg cache when a CPU goes offline?
The startup callback riscv_srmcfg_online() is the one that invalidates
the cache. A teardown callback that did the same thing on offline would be
redundant with the online seed.
> Without a teardown callback, a CPU going offline retains its previous
> cpu_srmcfg value. When brought back online, its hardware CSR is reset.
>
> Tasks scheduling before the CPU reaches the CPUHP_AP_ONLINE_DYN state might
> skip the CSR write if their thread srmcfg tag matches the stale cache. Could
> this allow those early context switches to bypass resource constraints by
> running with the hardware-reset CSR?
I will move the seed earlier to CPUHP_AP_STARTING so the cache is
invalidated before the CPU is scheduleable.
-Drew