Re: [PATCH v7 05/14] x86/resctrl: Stop setting event_group::force_off on RMID shortage
From: Luck, Tony
Date: Wed Jun 10 2026 - 16:05:49 EST
On Tue, Jun 09, 2026 at 04:02:11PM -0700, Reinette Chatre wrote:
> Hi Tony,
>
> On 6/9/26 9:51 AM, Luck, Tony wrote:
> > On Mon, Jun 08, 2026 at 04:16:58PM -0700, Reinette Chatre wrote:
> >> Hi Tony,
> >>
> >> On 6/1/26 12:56 PM, Tony Luck wrote:
> >>> Drop the force_off assignment from all_regions_have_sufficient_rmid().
> >>
> >> Apart from this being obvious from the patch, this is not how an x86 changelog
> >> should be structured. Please see "Changelog" in
> >> Documentation/process/maintainer-tip.rst
> >
> > Revised version:
> > ---
> > Subject: x86/resctrl: Fix usage of event_group::force_off in AET
> >
> > The kernel command line option "rdt=" is used to force enable or disable
> > individual resctrl features.
> >
> > There are two problems with usage in the AET (Application Energy
> > Telemetry) feature:
>
> Above seems incomplete. Could it be "two problems with usage of the rdt= kernel
> command line ..." or "two problems with the "rdt=" usage ..." or ...?
>
> > 1) If the user specifies both enabling and disabling of the same feature
> > (e.g. "rdt=energy,!energy") the request to enable should override the
> > request to disable (for consistency with other features).
> > 2) event_group::force_off is set true in all_regions_have_sufficient_rmid()
> > which will cause problems when AET features are enumerated on each mount
> > of the resctrl file system
>
> Two comments:
> - In general we should aim to make the changelog as specific as possible to
> prevent reader needing to do any deciphering. This makes the review easier
> and faster. So, please avoid general things like "will cause problems" that
> requires reader to figure out the problems (plural!) on their own. Instead
> just be specific about what the problems are.
> - Is (2) still a problem when (1) is fixed? An underlying question is: if an
> event group is successfully enumerated during resctrl mount, will it always
> enumerate with identical properties on every subsequent mount?
>
> >
> > Fix the first issue by checking event_group::force_on in enable_events().
>
> "the first issue" seems vague to me while the rest describes the code that can be
> seen from the patch. How about something like below to help describe what the code
> change accomplishes:
> Enumerate the event group if user space overrides any disable of the event group.
>
> >
> > Fix the other issue by dropping the assignment to event_group::force_off.
>
> "the other issue" is very vague. Here it will also help to be specific. Although
> per earlier comment the need for this fix is no longer clear to me?
>
> >
> > ---
> >
> >>
> >> Please note that, after the preparatory fixes that are expected to land separately,
> >> this will be the first patch of this series ... having this first patch just
> >> kick off with "what changes" without any context is a difficult way for a
> >> reviewer to start considering this piece of work.
> >>
> >>> This preserves current single-enumeration behaviour while preparing for
> >>> the upcoming per-mount enumeration, where latching force_off would
> >>> incorrectly suppress re-enumeration on subsequent mounts - even when the
> >>> user explicitly requested the feature via "rdt={feature}".
> >>
> >> I believe that user space should still expect that rdt= options behave
> >> consistently. So if user space for some reason provides: rdt=!mbm,mbm,!energy,energy
> >> on a system that supports both then it should not be the case that one is enabled and the
> >> other not.
> >>
> >> I this think that, for example, enable_events() should start with:
> >>
> >> if (e->force_off && !e->force_on)
> >> return false;
> >
> > OK.
> >
> >>
> >>> Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
> >>> ---
> >>> arch/x86/kernel/cpu/resctrl/intel_aet.c | 8 +++-----
> >>> 1 file changed, 3 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> >>> index 89b8b619d5d5..e2af700bca04 100644
> >>> --- a/arch/x86/kernel/cpu/resctrl/intel_aet.c
> >>> +++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> >>> @@ -60,8 +60,8 @@ struct pmt_event {
> >>> * data for all telemetry regions of type @pfname.
> >>> * Valid if the system supports the event group,
> >>> * NULL otherwise.
> >>> - * @force_off: True when "rdt" command line or architecture code disables
> >>> - * this event group due to insufficient RMIDs.
> >>> + * @force_off: True when "rdt" command line disables this event group
> >>> + * to avoid system limitations due to insufficient RMIDs.
> >>
> >> >From what I understand it is only architecture that can disable an event group
> >> "to avoid system limitations due to insufficient RMIDs" and this capability is removed
> >> in this patch. Above implies that this capability now needs to be implemented
> >> by userspace, which I do not think is accurate?
> >
> > Should I just drop the second line? The user may have various other reasons
> > why they want to disable this event group.
>
> Now that enable_events() starts with a check of event_group::force_on it seems that
> the existing behavior of event_group::force_off to also reflect architecture
> disable can be maintained?
OK. I'll drop this kerneldoc change and keep the setting of force_off = true
when there are insufficient RMIDs.
>
> > Combined with the code change you suggest above, this would describe the
> > use of these two fields. "force_off" disables, "force_on" enables (and
> > overrides any "force_off").
>
> I believe the "force_on" description below already accurately describes how it
> overrides an earlier disable.
>
> >
> >>
> >>> * @force_on: True when "rdt" command line overrides disable of this
> >>> * event group.
> >>> * @guid: Unique number per XML description file.
> >>> @@ -214,10 +214,8 @@ static bool all_regions_have_sufficient_rmid(struct event_group *e, struct pmt_f
> >>> if (!p->regions[i].addr)
> >>> continue;
> >>> tr = &p->regions[i];
> >>> - if (tr->num_rmids < e->num_rmid) {
> >>> - e->force_off = true;
> >>> + if (tr->num_rmids < e->num_rmid)
> >>> return false;
> >>> - }
> >>> }
> >>>
> >>> return true;
>
> Reinette