[PATCH v4 5/7] fs/resctrl: Continue counter allocation after failure
From: Ben Horgan
Date: Thu Mar 26 2026 - 13:29:07 EST
In mbm_event mode, with mbm_assign_on_mkdir set to 1, when a user creates a
new CTRL_MON or MON group, resctrl will attempt to allocate counters for
each of the supported mbm events on each resctrl domain. As counters are
limited, these allocations may fail. If an mbm_total_event counter
allocation fails then the mbm_total_event counter allocations for the
remaining domains are skipped and then the mbm_local_event counter
allocations are made. These failures don't cause the group allocation to
fail, but the user should still be aware of them. A message for each
attempted allocation is reported in last_cmd_status, but in order to fully
interpret that information the user needs to know what was skipped. This
is knowable as the domain list is sorted, but it is undesirable to rely on
such implementation details.
Writes to mbm_L3_assignments using the wildcard format, <event>:*=e, will
also skip counter allocation after any counter allocation failure, leading
once again to counters that are not allocated but have no corresponding
message in last_cmd_status to indicate that.
When a new group is created, always attempt to allocate all the counters
requested. Similarly, when a wildcard assign operation is written to
mbm_L3_assignments, attempt to allocate all counters requested by that
particular operation.
For mbm_L3_assignments, continue to return an error on counter allocation
failure and, for a write specifying multiple assign operations, continue to
abort after the first failing assign operation.
Signed-off-by: Ben Horgan <ben.horgan@xxxxxxx>
---
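For illustration only, the change in loop behaviour can be sketched as a
small userspace model. This is not the kernel code: struct demo_domain and
the demo_* functions are hypothetical stand-ins, a plain array replaces the
mon_domains list, and -ENOSPC is an assumed error value for "no free
counter in this domain".

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a monitoring domain. */
struct demo_domain {
	int free_cntrs;	/* counters still available in this domain */
	int assigned;	/* 1 once a counter has been assigned here */
};

/*
 * Hypothetical stand-in for rdtgroup_alloc_assign_cntr(): fails when the
 * domain has no free counter. -ENOSPC is an assumption for this sketch.
 */
static int demo_alloc_assign_cntr(struct demo_domain *d)
{
	if (d->free_cntrs == 0)
		return -ENOSPC;

	d->free_cntrs--;
	d->assigned = 1;
	return 0;
}

/* Behaviour before this patch: stop at the first failing domain. */
static int demo_assign_all_abort(struct demo_domain *doms, size_t n)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = demo_alloc_assign_cntr(&doms[i]);
		if (ret)
			return ret;
	}

	return 0;
}

/*
 * Behaviour after this patch: try every domain and return the error from
 * the last failing assignment, or 0 if none failed.
 */
static int demo_assign_all_continue(struct demo_domain *doms, size_t n)
{
	size_t i;
	int ret = 0;

	for (i = 0; i < n; i++) {
		int err = demo_alloc_assign_cntr(&doms[i]);

		if (err)
			ret = err;
	}

	return ret;
}
```

With three domains where only the middle one has no free counter, the
abort variant leaves the third domain without a counter, while the
continue variant assigns counters in the first and third domains. Both
still report the failure to the caller.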
fs/resctrl/monitor.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 6afa2af26ff7..3f33fff8eb7f 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -1209,9 +1209,10 @@ static int rdtgroup_alloc_assign_cntr(struct rdt_resource *r, struct rdt_l3_mon_
* NULL; otherwise, assign the counter to the specified domain @d.
*
* If all counters in a domain are already in use, rdtgroup_alloc_assign_cntr()
- * will fail. The assignment process will abort at the first failure encountered
- * during domain traversal, which may result in the event being only partially
- * assigned.
+ * will fail. When attempting to assign counters to all domains, carry on trying
+ * to assign counters after a failure since only some domains may have counters
+ * and the goal is to assign counters where possible. If any counter assignment
+ * fails, return the error from the last failing assignment.
*
* Return:
* 0 on success, < 0 on failure.
@@ -1224,9 +1225,11 @@ static int rdtgroup_assign_cntr_event(struct rdt_l3_mon_domain *d, struct rdtgro
if (!d) {
list_for_each_entry(d, &r->mon_domains, hdr.list) {
- ret = rdtgroup_alloc_assign_cntr(r, d, rdtgrp, mevt);
- if (ret)
- return ret;
+ int err;
+
+ err = rdtgroup_alloc_assign_cntr(r, d, rdtgrp, mevt);
+ if (err)
+ ret = err;
}
} else {
ret = rdtgroup_alloc_assign_cntr(r, d, rdtgrp, mevt);
--
2.43.0