Re: [PATCH v7 06/14] fs/resctrl: Add interface to disable a monitor event

From: Reinette Chatre

Date: Wed Jun 10 2026 - 18:27:16 EST


Hi Tony,

On 6/10/26 1:56 PM, Luck, Tony wrote:
> On Tue, Jun 09, 2026 at 04:02:55PM -0700, Reinette Chatre wrote:
>> On 6/9/26 10:21 AM, Luck, Tony wrote:
>>> On Mon, Jun 08, 2026 at 04:18:23PM -0700, Reinette Chatre wrote:
>>>> On 6/1/26 12:56 PM, Tony Luck wrote:
>>>>> In preparation for re-running AET enumeration on every mount, AET code must be
>>>>> able to disable events on unmount so the next mount starts from a clean slate.
>>>>>
>>>>> Add a file system interface for architecture to clear the enabled flag for
>>>>> a given event.
>>>>
>>>> This just verbatim describes what the patch does. It would be helpful to describe
>>>> how the flag is used by resctrl to support the first paragraph's implicit claim
>>>> that the enabled flag's value is not relevant when resctrl is unmounted.
>>>
>>> Revised commit:
>>> ---
>>> Subject: fs/resctrl: Add interface to disable a monitor event
>>>
>>> In preparation for re-running AET enumeration on every mount, AET code must be
>>> able to disable events on unmount so the next mount starts from a clean slate.
>>>
>>> Add a file system interface for architecture to reset architecture
>>> controlled fields of the given event. mon_event::enabled is only used
>>> during mount and at run time to check which events to include in file
>>> system objects. It is not used during unmount, so it is safe to clear
>>> it as part of the unmount flow.
>>
>> This introduces a general resctrl fs interface and makes some powerful
>> generalized statements in support of the interface but these statements
>> are only true for the AET events. Surely mon_event::enabled is used
>> during unmount since domains can come and go while resctrl is not mounted
>> and as the new comments explain there is significant state coordination that
>> needs to be done between these event callbacks and hotplug handlers.
>>
>> Similarly, the resctrl LLC occupancy worker keeps running while resctrl
>> is unmounted and depends on LLC occupancy event being enabled.
>>
>> Creating a resctrl fs generalized interface but motivating it with a
>> highly customized lens of usage without making that clear in the changelog
>> but instead just making grand claims of how safe this is seems underhanded.
>
> I can rewrite this commit comment to call out the limitations on event
> removal. Those are listed in the new kerneldoc comments that I added to
> the resctrl_disable_mon_event() declaration in <linux/resctrl.h> based on
> your feedback on previous version of this patch.

It is not necessary to duplicate the documentation added by the patch but
really should not make false statements like "It is not used during unmount"

>
> Is that what you are looking for here? Or are you suggesting that the
> new interface be less general?

I do not think it is necessary to make the interface less general but if you have
ideas then please share. My concern is just that the changelog should accurately
describe what the patch does.

Consider, for example, something like below. Please do not copy&paste what I write since
I already acknowledge it is not ideal. Please consider it just as a draft of how I think
this change can be described:
Without enforcement resctrl assumes that all events are enabled before any
domain is created. This assumption supports events that requires per-domain
state that is created with the domain via the architecture's CPU hotplug handlers.
resctrl does not support disabling of events since, in addition to coordination with
the domain management done by the architecture's CPU hotplug handlers, disabling
of events needs to be coordinated with the resctrl filesystem that may be mounted
and exposing enabled events.

The AET telemetry resources are enumerated and managed by the INTEL_PMT_TELEMETRY
driver. In preparation for INTEL_PMT_TELEMETRY to be loaded as module resctrl should
handle the scenario where INTEL_PMT_TELEMETRY is unloaded after resctrl
discovered the telemetry resources and enabled the associated events. This means
that resctrl needs to support disabling of events.

The architecture manages the domain and knows the resctrl mount state via
the resctrl_arch_pre_mount() callback. The architecture is thus in the best position
to know when it is safe to disable an event.

Allow the architecture to disable an event and trust it to only do so when resctrl
fs is not mounted and that it will ensure the event's state is cleaned up. Without being
able to do enforcement of safe event disabling, document what an architecture needs to
consider before using this new capability.

Reinette