Re: [PATCH RFC v6 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring

From: Drew Fustini

Date: Tue Jun 02 2026 - 00:50:14 EST

On Mon, Jun 01, 2026 at 08:58:37PM +0000, sashiko-bot@xxxxxxxxxx wrote:
> > + /*
> > + * cbqri_domain_list_lock serialises the list walk against
> > + * cbqri_detach_cpu_from_ctrl_domains().
> > + */
> > + mutex_lock(&cbqri_domain_list_lock);
>
> [Severity: High]
> During CPU hotplug, cbqri_resctrl_online_cpu() acquires
> cbqri_domain_list_lock and eventually calls resctrl_online_mon_domain()
> which acquires the core framework rdtgroup_mutex.
>
> However, when reading monitoring data, the core framework acquires
> rdtgroup_mutex and then invokes resctrl_arch_rmid_read() which acquires
> cbqri_domain_list_lock here.
>
> Does this establish an AB-BA lock inversion between cbqri_domain_list_lock
> and rdtgroup_mutex?
>
> [Severity: High]
> Additionally, when a CPU goes offline, cbqri_resctrl_offline_cpu() acquires
> cbqri_domain_list_lock before initiating domain teardown.
>
> If the domain is torn down, the resctrl core flushes busy RMIDs by calling
> __check_limbo(), which invokes this resctrl_arch_rmid_read() hook.
>
> Since this hook unconditionally attempts to acquire cbqri_domain_list_lock,
> does this cause a self-deadlock when offlining a CPU with active monitoring
> traffic?

For the next rev, I will drop the list lock around the teardown so the
limbo flush does not re-enter it.

> > @@ -500,6 +582,13 @@ static int cbqri_resctrl_control_init(struct cbqri_resctrl_res *cbqri_res)
> [ ... ]
> > + if (ctrl->mon_capable && res->rid == RDT_RESOURCE_L3) {
> > + res->mon_scope = RESCTRL_L3_CACHE;
> > + resctrl_enable_mon_event(QOS_L3_OCCUP_EVENT_ID,
> > + false, 0, NULL);
> > + res->mon_capable = true;
> > + }
>
> [Severity: Medium]
> The L3 occupancy event is registered with any_cpu set to false. This
> requires the core to read the event from a specific CPU using
> smp_call_function_any().
>
> If that CPU is configured as nohz_full, the read executes in an IPI context
> with interrupts disabled. Since resctrl_arch_rmid_read() correctly returns
> -EIO when irqs_disabled(), will userspace reads fail on nohz_full CPUs?
>
> Could any_cpu be true here to avoid cross-CPU IPIs, since the MMIO reads
> can be safely executed from any CPU?

I will switch QOS_L3_OCCUP and QOS_L3_MBM_TOTAL to any_cpu = true in the
next rev.

> > @@ -510,6 +599,7 @@ static int cbqri_resctrl_control_init(struct cbqri_resctrl_res *cbqri_res)
> [ ... ]
> > + l3_ctrl = cbqri_resctrl_resources[RDT_RESOURCE_L3].ctrl;
> > + if (l3_ctrl && l3_ctrl->mon_capable)
> > + max_rmid = min(max_rmid, l3_ctrl->mcid_count);
> > +
> > + if (!exposed_mon_capable) {
> > + max_rmid = 1;
> > + return;
> > }
>
> [Severity: High]
> If the hardware has an L2 monitoring controller but no L3 monitoring,
> exposed_mon_capable becomes true but max_rmid is never narrowed from its
> initial U32_MAX value.
>
> When the core framework mounts the filesystem, it will query
> resctrl_arch_system_num_rmid_idx() which returns max_rmid. Will this
> cause an immediate OOM panic when it attempts to allocate
> U32_MAX * sizeof(struct rmid_entry)?

For the next rev, I will only set exposed_mon_capable when an event is
actually enabled and clamp max_rmid.

> > @@ -560,13 +673,89 @@ static struct rdt_ctrl_domain *cbqri_create_ctrl_domain(struct cbqri_controller
> [ ... ]
> > +err_offline:
> > + /*
> > + * cancel_delayed_work avoids deadlocking against the cqm_limbo
> > + * worker which takes cpus_read_lock while this hotplug callback
> > + * already holds cpus_write_lock. mbm_over is only
> > + * INIT_DELAYED_WORK'd when MBM_TOTAL was enabled, so gate the
> > + * cancel on the same condition to avoid touching a zeroed work
> > + * struct.
> > + */
> > + cancel_delayed_work(&mon_dom->cqm_limbo);
> > + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
> > + cancel_delayed_work(&mon_dom->mbm_over);
> > + resctrl_offline_mon_domain(res, &mon_dom->hdr);
> > +err_listdel:
> > + list_del(&mon_dom->hdr.list);
> > +err_free:
> > + kfree(mon_dom);
> > + return err;
> > +}
>
> [Severity: High]
> If a worker thread is currently executing but blocked waiting for
> cpus_read_lock(), cancel_delayed_work() will return immediately because
> the work is no longer pending in the queue.
>
> After the hotplug thread drops cpus_write_lock(), the worker could unblock
> and continue executing, accessing the mon_dom structure after it was freed.
> Does this create a use-after-free window?

This is not a problem. The non-sync cancel is the resctrl core's own pattern

Drew