Re: [PATCH RFC v6 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight

From: Drew Fustini

Date: Tue Jun 02 2026 - 13:37:14 EST


On Mon, Jun 01, 2026 at 08:57:34PM +0000, sashiko-bot@xxxxxxxxxx wrote:
> > + case RDT_RESOURCE_MB_WGHT: {
> > + u64 mweight;
> > +
> > + err = cbqri_read_mweight(ctrl, closid, &mweight);
>
> [Severity: High]
> This isn't a bug introduced by this patch, but the newly added
> cbqri_read_mweight() called here duplicates a pre-existing flaw.
>
> Can this corrupt data by writing to staging registers before ensuring the
> hardware is not busy?
>
> In cbqri_read_mweight(), the driver interacts with the hardware by writing to a
> staging register, writing a control register, and waiting for the BUSY flag to
> clear.
>
> If a previous operation times out (such as the 1ms limit in
> cbqri_wait_busy_flag()), the driver returns -EIO and releases ctrl->lock, but
> the hardware might still be processing.
>
> The next caller acquires the lock and immediately calls cbqri_set_bc_bw_alloc()
> to read-modify-write the staging register before cbqri_bc_alloc_op() verifies
> the BUSY flag is 0.
>
> Would modifying the staging register of an in-flight hardware operation cause
> undefined behavior, misconfiguration, or WPRI field corruption?

This is not a problem:

- cbqri_apply_bc_field() waits for BUSY = 0 before staging so there is
no corruption on the apply path.
- cbqri_bc_alloc_op() waits at entry, so a staging write only lands
during BUSY on an already-faulted controller.
- cbqri_set_bc_bw_alloc() stages both fields from the caches, so there
is no unmodified field to corrupt.

> > +static void cbqri_detach_cpu_from_one_bw_res(struct cbqri_controller *ctrl,
> > + enum resctrl_res_level rid,
> > + unsigned int cpu)
> > +{
> > + struct cbqri_resctrl_res *hw_res = &cbqri_resctrl_resources[rid];
> > + struct rdt_resource *res = &hw_res->resctrl_res;
> > + struct rdt_ctrl_domain *domain;
> > + int dom_id = ctrl->mem.prox_dom;
> > +
> > + lockdep_assert_held(&cbqri_domain_list_lock);
> > +
> > + if (!hw_res->ctrl)
> > + return;
> > +
> > + domain = cbqri_find_ctrl_domain(&res->ctrl_domains, dom_id);
> > + if (!domain || !cpumask_test_cpu(cpu, &domain->hdr.cpu_mask))
> > + return;
> > +
> > + cpumask_clear_cpu(cpu, &domain->hdr.cpu_mask);
> > + if (cpumask_empty(&domain->hdr.cpu_mask)) {
> > + resctrl_offline_ctrl_domain(res, domain);
>
> This isn't a bug introduced by this patch, but the newly added
> cbqri_detach_cpu_from_one_bw_res() replicates a broken lock ordering.
>
> Could this lead to an AB-BA deadlock between rdtgroup_mutex and
> cbqri_domain_list_lock?
>
> If a user reading a monitor data file triggers rdtgroup_mondata_show(), it
> acquires rdtgroup_mutex, then calls mon_event_read() ->
> resctrl_arch_rmid_read(), which acquires cbqri_domain_list_lock.
>
> Concurrently, when a CPU is taken offline, cbqri_resctrl_offline_cpu()
> acquires cbqri_domain_list_lock, then calls cbqri_detach_cpu_from_all_ctrls()
> -> cbqri_detach_cpu_from_one_bw_res(). If the domain becomes empty, it calls
> resctrl_offline_ctrl_domain(), which attempts to acquire rdtgroup_mutex.
>
> Would these two paths race and cause a permanent AB-BA deadlock?

This is not a problem. cpu_hotplug_lock serializes the two lock orderings:

- Every rdtgroup_mutex -> cbqri_domain_list_lock path runs under
cpus_read_lock
- The cbqri_domain_list_lock -> rdtgroup_mutex path is the cpuhp
online/offline callback under cpus_write_lock.

Those are mutually exclusive, so the orders never overlap.
>
> > + list_del(&domain->hdr.list);
> > + kfree(container_of(domain, struct cbqri_resctrl_dom,
> > + resctrl_ctrl_dom));
>
> [Severity: High]
> This is a pre-existing issue, but the new cbqri_detach_cpu_from_one_bw_res()
> replicates it here.
>
> Can this result in a Use-After-Free on the res->ctrl_domains list?
>
> VFS read paths in fs/resctrl (like rdtgroup_schemata_show()) traverse the
> res->ctrl_domains list using list_for_each_entry while holding only
> rdtgroup_mutex.
>
> When the CPU mask becomes empty, resctrl_offline_ctrl_domain() is called above,
> which takes and releases rdtgroup_mutex. After it returns, list_del() and
> kfree() are executed without holding rdtgroup_mutex or using RCU primitives.
>
> Could a concurrent list traversal under rdtgroup_mutex dereference the freed
> domain pointer and crash?

This is not a problem:

- The VFS readers of res->ctrl_domains hold cpus_read_lock in addition
to rdtgroup_mutex
- The detach that does list_del() and kfree() is the cpuhp offline
callback under cpus_write_lock.
- They cannot run concurrently so list_del()/kfree() is fine.

Drew