Re: [PATCH v2] fs/resctrl: Fix deadlock for errors during mount
From: Reinette Chatre
Date: Thu May 07 2026 - 20:23:58 EST
Hi Tony,
On 5/7/26 4:45 PM, Luck, Tony wrote:
> On Thu, May 07, 2026 at 11:30:53AM -0700, Reinette Chatre wrote:
>> On 5/7/26 10:47 AM, Luck, Tony wrote:
>>> Reinette,
>>>
>>> This looks promising.
>>>
>>> I'll split out the missing call to mon_put_kn_priv(); into its own
>>> patch before the deadlock fix.
>>>
>>> Going to re-run the tests I did before forcing kernfs_get_tree()
>>> to fail early without setting new_sb_created, and also late.
>>
>> Thank you very much.
>
> Tests pass.
Great. Thank you very much.
>
>>>
>>>> +/*
>>>> + * Temporary forward declaration for testing only. Move functions instead.
>>>> + */
>>>> +static void resctrl_unmount(void);
>>>> +static void mon_put_kn_priv(void);
>>>
>>> Question: How much are forward declarations hated? And how to handle
>>> this?
>>
>> I am not aware of forward declarations being hated and I am not aware of
>> documented tip rules about this. Even so, I do find it cleaner if they
>> can be avoided. To handle this a prep patch that just moves the code
>> without any functional change should work?
>>
>>> Moving the functions around in the same patch really obscures the
>>> actual change. Is it OK to have a patch to make the functional
>>> change including the forward declarations. Then a separate commit
>>> that does the re-order (where it is obvious that functions are
>>> being picked up and moved without any code changes)?
>>
>> My preference would be a prep patch that does the move with new
>> capability built on top. It seems unnecessary to me to add a forward
>> declaration in one patch just to remove it in a following patch.
>> I agree that moving code as part of functional change should be avoided.
>
> OK. I have a three patch series:
>
> 1) Pure code move, no changes [while I picked up a block of functions
> and moved them earlier, "git show" presents the move in terms of a
> different set of functions moving]
If the patches seem awkward it may be worthwhile to try --patience or
--histogram to see if readability improves.
>
> 2) Add missing mon_put_kn_priv()
>
> 3) Fix the mount error deadlock
>
> Internal AI scans have only two complaints. Both appear spurious;
>
> First it says that:
>
> if (!ctx->kfc.new_sb_created)
> resctrl_unmount();
> should be:
>
> if (ret && !ctx->kfc.new_sb_created)
> resctrl_unmount();
>
> It admits that current code can't return ret == 0 (success) while
> failing to create a super block. But argues in some twisted future
> that might be possible.
I think kernfs_get_tree() can indeed return 0 without creating a superblock
when this is a re-mount but since resctrl does not call kernfs_get_tree()
on a remount (instead returning -EBUSY) this should not happen. If indeed
something changes with kernfs API to cause this to happen then it would be
safer to return success with state than return success with all state removed.
While the change is not necessary in current flow it does improve robustness
while making the flow easier to understand so I am ok with it. I think a comment
above the check will be helpful though.
>
> Second:
>
> Code used to call rdt_last_cmd_clear() unconditionally on success or
> failure of the mount. Now the call for the end-of-function error path
> is gone.
>
> This is true, but it doesn't matter. If the filesystem did not mount
> then there is no /sys/fs/resctrl/info/last_cmd_status file to leak an
> old error string.
ack. I see this similar to unmount where this state is not cleared
either.
>
> I just need to write up the cover letter.
Thank you very much.
Reinette