Re: [PATCH 8/8] riscv_cbqri: Add CBQRI cache capacity-allocation platform driver

From: sashiko-bot

Date: Fri Jun 19 2026 - 14:41:12 EST


Thank you for your contribution! Sashiko AI review found 2 potential issues to consider:

New issues:
- [High] `get_cpu_cacheinfo_level()` fails to discover capacity controllers on split caches.

Pre-existing issues:
- [High] Global list `cbqri_controllers` is modified concurrently without a lock.
--

commit a409dc1e72a5139a209d981205820cfe3a0eb512
Author: Drew Fustini <fustini@xxxxxxxxxx>

riscv_cbqri: Add CBQRI cache capacity-allocation platform driver

The CBQRI core exposes riscv_cbqri_register_cc_dt() so a device-tree
platform driver can register a capacity controller whose cache topology
it already knows.

> diff --git a/drivers/resctrl/cbqri_capacity.c b/drivers/resctrl/cbqri_capacity.c
> new file mode 100644
> index 0000000000000..2172432eb3287
> --- /dev/null
> +++ b/drivers/resctrl/cbqri_capacity.c

[ ... ]

> + /*
> + * Associate the controller with its cache instance via
> + * cacheinfo. The matching cache provides the cache id and the
> + * set of harts that share the cache.
> + */
> + cache_id = -1;
> + cpus_read_lock();
> + for_each_online_cpu(cpu) {
> + struct cacheinfo *ci = get_cpu_cacheinfo_level(cpu, cache_level);
> +
> + if (ci && ci->fw_token == cache_np) {
> + cache_id = ci->id;
> + cpumask_copy(cpu_mask, &ci->shared_cpu_map);
> + break;
> + }
> + }
> + cpus_read_unlock();

[Severity: High]
Will this fail to discover capacity controllers on split caches?

get_cpu_cacheinfo_level() only returns the first cache leaf that matches
the requested level and stops searching. If a CPU has multiple caches at
the same level (e.g., split L1 Instruction and L1 Data caches), and the
capacity controller governs the second one, get_cpu_cacheinfo_level() will
return the first cache. Its fw_token will not match cache_np.

Consequently, the loop will never find the matching cache, cache_id will
remain -1, and the probe will incorrectly fail with -ENODEV.
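
One possible direction, sketched here as a user-space model rather than kernel
code: walk every cache leaf of the CPU and compare fw_token on each leaf at the
requested level, instead of stopping at the first leaf of that level. All
model_* names below are hypothetical stand-ins. In the kernel the equivalent
walk would presumably go over get_cpu_cacheinfo(cpu)->info_list for num_leaves
entries, which is an assumption about how the fix might look, not something
the patch does.

```c
#include <stddef.h>

/* Simplified user-space stand-ins for the kernel's cacheinfo structures. */
struct model_cacheinfo {
	unsigned int level;
	const void *fw_token;	/* firmware node describing this cache */
	int id;
};

struct model_cpu_cacheinfo {
	struct model_cacheinfo *info_list;
	unsigned int num_leaves;
};

/* Models the behaviour described above: first leaf at @level wins. */
static struct model_cacheinfo *
model_first_leaf_at_level(struct model_cpu_cacheinfo *cci, unsigned int level)
{
	for (unsigned int i = 0; i < cci->num_leaves; i++)
		if (cci->info_list[i].level == level)
			return &cci->info_list[i];
	return NULL;
}

/* Alternative: test every leaf at @level against @token. */
static struct model_cacheinfo *
model_find_leaf_by_token(struct model_cpu_cacheinfo *cci, unsigned int level,
			 const void *token)
{
	for (unsigned int i = 0; i < cci->num_leaves; i++) {
		struct model_cacheinfo *ci = &cci->info_list[i];

		if (ci->level == level && ci->fw_token == token)
			return ci;
	}
	return NULL;
}
```

With a split L1 where the controller's node is the second leaf, the first
helper returns a leaf whose token does not match, while the second one finds
the intended cache.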

[ ... ]

> + info.type = CBQRI_CONTROLLER_TYPE_CAPACITY;
> + info.addr = res->start;
> + info.size = resource_size(res);
> + info.rcid_count = rcid_count;
> + info.cache_id = cache_id;
> +
> + ret = riscv_cbqri_register_cc_dt(&info, cache_level, cpu_mask);
> + if (ret) {
> + dev_err(dev, "failed to register capacity controller: %d\n", ret);
> + goto out_free;
> + }

[Severity: High]
This is a pre-existing issue, but can this trigger list corruption if
multiple instances of the platform driver probe concurrently?

The probe function cbqri_capacity_probe() calls
riscv_cbqri_register_cc_dt(). Inside riscv_cbqri_register_cc_dt(), the newly
allocated controller is appended to the global cbqri_controllers list:

    riscv_cbqri_register_cc_dt() {
        ...
        list_add_tail(&ctrl->list, &cbqri_controllers);
        return 0;
    }

This modification happens without any locks. If the driver core probes
multiple CBQRI devices concurrently (e.g., due to asynchronous probing),
list_add_tail() will race on the shared list. Additionally, an asynchronous
probe could race with CPU hotplug events iterating over cbqri_controllers.
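
As a rough illustration of the usual remedy, the user-space model below
serializes both the append and the iteration with one lock. It uses a pthread
mutex and a hand-rolled list purely so it can be compiled stand-alone; every
model_* name is hypothetical. In the driver this would presumably be a kernel
mutex taken around list_add_tail() and around any walk of cbqri_controllers,
but which lock type fits the hotplug callbacks is an open question this sketch
does not answer.

```c
#include <pthread.h>
#include <stddef.h>

/* Stand-in for a registered controller; not the driver's real struct. */
struct model_controller {
	int cache_id;
	struct model_controller *next;
};

static struct model_controller *model_controllers;
static struct model_controller **model_tail = &model_controllers;
static pthread_mutex_t model_controllers_lock = PTHREAD_MUTEX_INITIALIZER;

/* Append @ctrl to the shared list with the lock held. */
static void model_register_controller(struct model_controller *ctrl)
{
	ctrl->next = NULL;

	pthread_mutex_lock(&model_controllers_lock);
	*model_tail = ctrl;
	model_tail = &ctrl->next;
	pthread_mutex_unlock(&model_controllers_lock);
}

/* Readers take the same lock so they never observe a half-linked entry. */
static size_t model_count_controllers(void)
{
	size_t n = 0;

	pthread_mutex_lock(&model_controllers_lock);
	for (struct model_controller *c = model_controllers; c; c = c->next)
		n++;
	pthread_mutex_unlock(&model_controllers_lock);

	return n;
}
```

The point of the model is only that writers and readers share one lock; the
list implementation itself is incidental.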

--
Sashiko AI review · https://sashiko.dev/#/patchset/20260619-dfustini-atl-sc-cbqri-dt-v1-0-e79a7723fab0@xxxxxxxxxx?part=8