Re: [PATCH v4 6/6] x86/resctrl: Add support for L3 occupancy monitoring via RMID MMIO read

From: Chen, Yu C

Date: Wed Jun 24 2026 - 05:07:40 EST


Hi Reinette,

On 6/24/2026 12:48 AM, Reinette Chatre wrote:
Hi Chenyu,

On 6/22/26 10:00 PM, Chen, Yu C wrote:
Hi Reinette,

On 6/23/2026 5:30 AM, Reinette Chatre wrote:
Hi Chenyu,
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 97c2f6bc7a5f..9b3b03279dd8 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -41,6 +41,8 @@ struct resctrl_pqr_state {
   };
     bool erdt_enabled(void);
+struct rdt_domain_hdr;
+int erdt_mon_read(struct rdt_domain_hdr *hdr, int ev_id, int rmid, u64 *val);
     DECLARE_PER_CPU(struct resctrl_pqr_state, pqr_state);
   diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 90730f0851fa..fe812f7190fc 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -965,7 +965,7 @@ static __init bool get_rdt_mon_resources(void)
       bool ret = false;
         if (rdt_cpu_has(X86_FEATURE_CQM_OCCUP_LLC)) {
-        resctrl_enable_mon_event(QOS_L3_OCCUP_EVENT_ID, false, 0, NULL);
+        resctrl_enable_mon_event(QOS_L3_OCCUP_EVENT_ID, erdt_enabled(), 0, NULL);
           ret = true;
       }
       if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL)) {

As mentioned in patch #1, when erdt_enabled() is true the enumeration still proceeds to
enumerate the monitoring properties via CPUID to discover the number of RMIDs that the
*MSR* supports and use it as the maximum RMID (and thus the maximum number of registers)
that MMIO supports?


OK, I will switch to the maximum RMID exposed by the ACPI table when erdt_enabled() is true.

I believe the issue is larger than just the RMID enumeration. The CPUID and ACPI enumeration
appears to be fully intertwined. Taking a closer look at what above code does:
it checks *CPUID* whether CMT is enabled and then enables the LLC occupancy event to blindly use
MMIO if ERDT is enabled, irrespective of whether the ERDT tables include a cache monitoring table
or not. How is it guaranteed that if ERDT is enabled that there is a cache monitoring table?
Should it not be the existence of the ACPI cache monitoring table and its properties that
determines whether the LLC occupancy counter using MMIO registers should be enabled?


I see. How about replacing erdt_enabled() with fine-grained helper functions such as
erdt_has_cmrc(), erdt_has_mmrc(), and erdt_has_marc()? The latter two will be added
later for region-aware MBM/MBA. CMRC, MMRC and MARC are not guaranteed to coexist,
so splitting them into separate helpers would offer finer control.
I am concerned where this is headed since it looks to me as though the plan is to
sprinkle these finer grained checks throughout resctrl.

The erdt_has_* helper functions would ideally live under arch/x86/kernel/cpu/resctrl/
rather than the generic fs/resctrl directory, since I suppose only the architecture-specific
code in the former path is allowed to see the ERDT logic? That said, since there would be a
large number of such routines, introducing a dedicated helper to handle this uniformly
(using arch_priv) would be better.

To me this sounds complicated
and error prone. Consider that resctrl_enable_mon_event() has an arch_priv parameter. To me
this seems to be the appropriate place for the architecture to give itself the needed
information about how to read the event.


OK, if I understand correctly, we can use the following logic to hide
ERDT from the monitor read path:

/*
 * Helper to get the ERDT monitor arch_priv for an event.
 * Defined in erdt.c; returns NULL everywhere else.
 *
 * The caller does not need to know about CMRC.
 */
void *get_evt_priv(enum resctrl_event_id eventid)
{
	if (!erdt_enabled())
		return NULL;

	switch (eventid) {
	case QOS_L3_OCCUP_EVENT_ID:
		return cmrc_priv_valid ? &cmrc_priv : NULL;
	default:
		return NULL;
	}
}

arch/x86/kernel/cpu/resctrl/core.c
get_rdt_mon_resources():
	if (rdt_cpu_has(X86_FEATURE_CQM_OCCUP_LLC)) {
		void *priv = get_evt_priv(QOS_L3_OCCUP_EVENT_ID);

		resctrl_enable_mon_event(QOS_L3_OCCUP_EVENT_ID,
					 priv != NULL, 0, priv);
	}

arch/x86/kernel/cpu/resctrl/monitor.c
resctrl_arch_rmid_read():
	if (arch_priv)
		return erdt_mon_read(hdr, eventid, rmid, val);
	return arch_l3_read_event(hdr, rmid, eventid, val, r);

This way, there are no event-type checks in the read
path. Adding MMRC later only extends the switch in get_evt_priv().
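
For illustration only, the flow above can be modelled as a small user-space
mock: the private pointer is looked up once at enable time, stored with the
event, and the read path branches only on whether that pointer is set. All
mock_* names, the struct layout and the fake counter values below are
hypothetical stand-ins, not kernel APIs; this is a sketch of the dispatch
idea under those assumptions, not the actual patch.

```c
/* User-space mock of the arch_priv dispatch idea; not kernel code. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event IDs standing in for enum resctrl_event_id. */
enum mock_event_id {
	MOCK_L3_OCCUP_EVENT_ID = 1,
	MOCK_MBM_TOTAL_EVENT_ID = 2,
	MOCK_NUM_EVENTS
};

/* Hypothetical private data describing an MMIO counter block. */
struct mock_cmrc_priv {
	const uint64_t *regs;	/* stands in for a mapped register array */
	unsigned int num_rmids;	/* number of valid entries in regs */
};

/* Per-event state recorded when the event is enabled. */
struct mock_event {
	bool enabled;
	void *arch_priv;
};

static struct mock_event mock_events[MOCK_NUM_EVENTS];
static bool mock_erdt_on;		/* stands in for erdt_enabled() */
static bool mock_cmrc_valid;		/* stands in for cmrc_priv_valid */
static struct mock_cmrc_priv mock_cmrc;

/* Only this helper knows which table backs which event. */
static void *mock_get_evt_priv(enum mock_event_id id)
{
	if (!mock_erdt_on)
		return NULL;

	switch (id) {
	case MOCK_L3_OCCUP_EVENT_ID:
		return mock_cmrc_valid ? &mock_cmrc : NULL;
	default:
		return NULL;
	}
}

/* Record the private pointer alongside the event at enable time. */
static void mock_enable_event(enum mock_event_id id, void *priv)
{
	mock_events[id].enabled = true;
	mock_events[id].arch_priv = priv;
}

/* Fake MSR path: returns a recognisable value derived from the RMID. */
static int mock_msr_read(enum mock_event_id id, unsigned int rmid, uint64_t *val)
{
	(void)id;
	*val = 1000 + rmid;
	return 0;
}

/* Fake MMIO path: bounds-check the RMID against the table's own limit. */
static int mock_mmio_read(const struct mock_cmrc_priv *p, unsigned int rmid,
			  uint64_t *val)
{
	if (rmid >= p->num_rmids)
		return -1;
	*val = p->regs[rmid];
	return 0;
}

/* Read path: no event-type checks, only "is there private data?". */
static int mock_rmid_read(enum mock_event_id id, unsigned int rmid, uint64_t *val)
{
	void *priv = mock_events[id].arch_priv;

	if (priv)
		return mock_mmio_read(priv, rmid, val);
	return mock_msr_read(id, rmid, val);
}
```

In this mock, supporting another table would only add a case to
mock_get_evt_priv(); mock_rmid_read() stays unchanged.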

thanks,
Chenyu