Re: [PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
From: James Morse
Date: Fri Aug 02 2024 - 13:28:48 EST
Hi Reinette,
On 08/07/2024 18:47, Reinette Chatre wrote:
> On 7/4/24 9:41 AM, James Morse wrote:
>> On 28/06/2024 17:53, Reinette Chatre wrote:
>>> On 6/14/24 8:00 AM, James Morse wrote:
>>>> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
>>>> resctrl can't be built as a module, and the kernfs helpers are not exported
>>>> so this is unlikely to change. MPAM has an error interrupt which indicates
>>>> the MPAM driver has gone haywire. Should this occur tasks could run with
>>>> the wrong control values, leading to bad performance for impoartant tasks.
>>>
>>> impoartant -> important
>>>
>>>> The MPAM driver needs a way to tell resctrl that no further configuration
>>>> should be attempted.
>>>>
>>>> Using resctrl_exit() for this leaves the system in a funny state as
>>>> resctrl is still mounted, but cannot be un-mounted because the sysfs
>>>> directory that is typically used has been removed. Dave Martin suggests
>>>> this may cause systemd trouble in the future as not all filesystems
>>>> can be unmounted.
>>>>
>>>> Add calls to remove all the files and directories in resctrl, and
>>>> remove the sysfs_remove_mount_point() call that leaves the system
>>>> in a funny state. When triggered, this causes all the resctrl files
>>>> to disappear. resctrl can be unmounted, but not mounted again.
>>
>>> I am not familiar with these flows so I would like to confirm ...
>>> In this scenario the resctrl filesystem will be unregistered, are
>>> you saying that it is possible to unmount a filesystem after it has
>>> been unregistered?
>>
>> Counter-intuitively: yes.
>>
>> The rules are described in fs/filesystems.c: We can access the members of the struct
>> file_system_type if the list lock is held, or a reference is held to the module. This is
>> how /proc/mounts is able to print the filesystem name from struct file_system_type without
>> taking the lock - it holds a reference to any module to prevent the structure from being
>
> hmmm ... does this mean I am supposed to find calls to try_module_get() in the flow from
> mounts_open_common()?
There may be, but when a filesystem is mounted the code in super.c holds a reference to
the filesystem - which translates to a reference on the module/filesystem->owner.
My point was only that its possible to unregister a filesystem while its mounted. The
reference counting takes care of this - and is unnecessary in our case.
>> freed. Because resctrl can't be built as a module, we can say there is always a reference
>> held, and we can never free struct file_system_type.
>
> unregister_filesystem() continues to be called and as I understand in new MPAM usages will be
> called during runtime. unregister_filesystem() comments state "Once this function has
> returned
> the &struct file_system_type structure may be freed or reused.". Could you please
> highlight to me
> what gives the confidence of "we can say there is always a reference held"? Could you please
> point to me where that reference is obtained that will prevent the structure from being
> freed?
I think we are rat-holing on something that doesn't matter:
* resctrl can't be built as a module - it is always built in.
* rdt_fs_type is therefore part of the kernel data section - it can't be freed.
* likewise the code that is part of resctrl can't be freed either.
Thanks,
James