Re: [PATCH 3/4] fs/resctrl: Fix deadlock for errors during mount

From: Reinette Chatre

Date: Mon May 11 2026 - 18:53:47 EST

Hi Tony,

On 5/8/26 11:21 AM, Tony Luck wrote:
> @@ -3173,26 +3200,8 @@ static int rdt_init_fs_context(struct fs_context *fc)
>
> static void rdt_kill_sb(struct super_block *sb)
> {
> - struct rdt_resource *r;
> -
> - cpus_read_lock();
> - mutex_lock(&rdtgroup_mutex);
> -
> - rdt_disable_ctx();
> -
> - /* Put everything back to default values. */
> - for_each_alloc_capable_rdt_resource(r)
> - resctrl_arch_reset_all_ctrls(r);
> -
> - resctrl_fs_teardown();
> - if (resctrl_arch_alloc_capable())
> - resctrl_arch_disable_alloc();
> - if (resctrl_arch_mon_capable())
> - resctrl_arch_disable_mon();
> - resctrl_mounted = false;
> + resctrl_unmount();
> kernfs_kill_sb(sb);
> - mutex_unlock(&rdtgroup_mutex);
> - cpus_read_unlock();
> }
>
> static struct file_system_type rdt_fs_type = {

sashiko reports [1] that this fix unmasks a use-after-free. This new issue looks to be
legitimate. I am not comfortable with sashiko's proposed fix since it flips the current
teardown ordering that is based on kernfs guidance per kernfs_kill_sb() function comments:
* This can be used directly for file_system_type->kill_sb(). If a kernfs
* user needs extra cleanup, it can implement its own kill_sb() and call
* this function at the end.

Something else to also consider is how interference from resctrl_exit() may occur now
that its caller has been merged. What do you instead think of something like below? The
main idea is that it takes an extra reference to the root kn before calling
kernfs_get_tree(), which it releases upon return. I find such a symmetric solution
relying on reference counts instead of code ordering easier to reason about and by
taking the extra reference with rdtgroup_mutex held it can handle interference from
resctrl_exit(). What do you think?

diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 1ec7ff0389d2..4c4f98ebf4a9 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -3035,6 +3035,7 @@ static int rdt_get_tree(struct fs_context *fc)
{
struct rdt_fs_context *ctx = rdt_fc2context(fc);
unsigned long flags = RFTYPE_CTRL_BASE;
+ struct kernfs_node *rdt_root_kn;
struct rdt_l3_mon_domain *dom;
struct rdt_resource *r;
int ret;
@@ -3119,6 +3120,19 @@ static int rdt_get_tree(struct fs_context *fc)
RESCTRL_PICK_ANY_CPU);
}

+ rdt_root_kn = rdtgroup_default.kn;
+ /*
+ * Keep root kn alive across kernfs_get_tree(). If kernfs_get_tree()
+ * fails after superblock is created (ctx->kfc.new_sb_created is true)
+ * kernfs_get_tree() will call rdt_kill_sb() itself.
+ * rdt_kill_sb()->resctrl_unmount()->kernfs_destroy_root() is followed
+ * by kernfs_kill_sb() that still needs to dereference root kn
+ * after kernfs_destroy_root().
+ *
+ * Obtain reference with locks held to protect against interference
+ * from resctrl_exit().
+ */
+ kernfs_get(rdt_root_kn);
rdt_last_cmd_clear();
mutex_unlock(&rdtgroup_mutex);
cpus_read_unlock();
@@ -3130,6 +3144,7 @@ static int rdt_get_tree(struct fs_context *fc)
*/
if (!ctx->kfc.new_sb_created)
resctrl_unmount();
+ kernfs_put(rdt_root_kn);
return ret;

out_mondata:

Reinette

[1] https://sashiko.dev/#/patchset/20260508182143.14592-1-tony.luck%40intel.com?part=3