[PATCH v4 04/10] fs/resctrl: Fix use-after-free during unmount
From: Reinette Chatre
Date: Tue Jun 02 2026 - 23:28:55 EST
From: Tony Luck <tony.luck@xxxxxxxxx>
During unmount or failure teardown all mon_data structures that contain
monitoring event file private data are freed after which kernfs nodes are
removed. However, the RDT_DELETED flag is never set for the statically
allocated default resource group.
A concurrent reader of an event file associated with the default resource
group may, after dropping kernfs active protection, block on rdtgroup_mutex
while unmount proceeds to free the file private data and destroy the kernfs
node without waiting for the reader.
When the mutex is released, the reader wakes up, observes that RDT_DELETED
is not set for the default group, and dereferences the already-freed
file private data.
The scenario can be depicted as follows:
CPU0 CPU1
/*
* Default resource group's
* monitoring data accessible via
* kernfs file with kernfs_node::priv
* pointing to a struct mon_data.
* User opens the file for reading.
*/
rdtgroup_mondata_show() /* arch encounters fatal error */
rdtgroup_kn_lock_live() resctrl_exit()
atomic_inc(&rdtgroup_default.waitcount) cpus_read_lock()
kernfs_break_active_protection(kn) mutex_lock(&rdtgroup_mutex)
cpus_read_lock() resctrl_fs_teardown()
mutex_lock(&rdtgroup_mutex) rmdir_all_sub()
mon_put_kn_priv()
/* Delete all mon_data structures */
rdtgroup_destroy_root()
kernfs_destroy_root()
rdtgroup_default.kn = NULL
mutex_unlock(&rdtgroup_mutex)
/*
* rdtgroup_default.flags is empty so
* rdtgroup_kn_lock_live() returns
* &rdtgroup_default
*/
md = of->kn->priv;
/* md points to freed mon_data */
Set RDT_DELETED for the default group unconditionally since the flag does
not lead to the freeing of this statically allocated group.
Do not allow a new resctrl mount if there are any waiters on default group
of previous mount. A new mount will re-initialize the default group that
would appear to waiters from previous mount as though the default group is
accessible causing them to access the mon_data structures from the previous
mount that have been removed.
Fixes: 2a6566038544 ("x86/resctrl: Expand the width of domid by replacing mon_data_bits")
Reported-by: Sashiko <sashiko-bot@xxxxxxxxxx>
Closes: https://sashiko.dev/#/patchset/20260508182143.14592-1-tony.luck%40intel.com?part=2 [1]
Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
Signed-off-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
Reviewed-by: Chen Yu <yu.c.chen@xxxxxxxxx>
---
Changes since V2:
- Rewrite changelog to not describe code as much.
- Rework changelog to switch to "Reported-by/Closes".
- Merge the duplicate rdtgroup_remove() comment with the function comment.
- Fix changelog to not mention that RDT_DELETED flag is set conditionally.
- Change "Fixes:" tag to point to commit that introduced dynamically
allocated mon_data this bug involves.
Changes since V3:
- Depict the race. (Chenyu)
- Add Chenyu's Reviewed-by tag.
- Changelog grammar fixes.
---
fs/resctrl/rdtgroup.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 31cfb54a5488..809f0965474c 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -585,14 +585,20 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
*
* On resource group creation via a mkdir, an extra kernfs_node reference is
* taken to ensure that the rdtgroup structure remains accessible for the
- * rdtgroup_kn_unlock() calls where it is removed.
+ * rdtgroup_kn_unlock() calls where it is removed. The default group is
+ * statically allocated: it does not have an extra reference but will have
+ * RDT_DELETED set on unmount to support safe access to its associated files
+ * via rdtgroup_kn_lock_live/rdtgroup_kn_unlock().
*
- * Drop the extra reference here, then free the rdtgroup structure.
+ * For all but the default group: drop the extra reference, then free the
+ * rdtgroup structure.
*
* Return: void
*/
static void rdtgroup_remove(struct rdtgroup *rdtgrp)
{
+ if (rdtgrp == &rdtgroup_default)
+ return;
kernfs_put(rdtgrp->kn);
kfree(rdtgrp);
}
@@ -2975,6 +2981,7 @@ static void resctrl_fs_teardown(void)
mon_put_kn_priv();
rdt_pseudo_lock_release();
rdtgroup_default.mode = RDT_MODE_SHAREABLE;
+ rdtgroup_default.flags = RDT_DELETED;
closid_exit();
schemata_list_destroy();
rdtgroup_destroy_root();
@@ -3000,6 +3007,12 @@ static int rdt_get_tree(struct fs_context *fc)
goto out;
}
+ /* Avoid races from pending operations from a previous mount */
+ if (atomic_read(&rdtgroup_default.waitcount) != 0) {
+ ret = -EBUSY;
+ goto out;
+ }
+
ret = setup_rmid_lru_list();
if (ret)
goto out;
@@ -4275,6 +4288,7 @@ static int rdtgroup_setup_root(struct rdt_fs_context *ctx)
ctx->kfc.root = rdt_root;
rdtgroup_default.kn = kernfs_root_to_node(rdt_root);
+ rdtgroup_default.flags = 0;
return 0;
}
--
2.50.1