Re: [PATCH v4 5/7] x86/resctrl: Resolve PMT and TPMI symbols at runtime
From: Reinette Chatre
Date: Fri Apr 03 2026 - 20:56:46 EST
Hi Tony,
On 3/30/26 2:43 PM, Tony Luck wrote:
> The resctrl subsystem is always built-in. Currently, it has a direct
> dependency on INTEL_PMT_TELEMETRY and INTEL_TPMI for Application Event
> Telemetry (AET) features. This forces these drivers to be built-in as well,
> even though they are logically separate.
>
> Switch to using symbol_request() to resolve AET-related symbols at
> runtime. This allows PMT and TPMI to be built as loadable modules.
>
> Update the mount/unmount logic to manage these references:
> - Acquire symbol references and pin the modules during resctrl mount.
> - Ensure AET structures are cleaned up and events disabled on unmount.
> - Release symbol references and unpin modules on unmount or if
> mounting fails.
If I remember correctly resctrl depends on the mount being done much later
than driver initialization to ensure all PMT load and enumeration is
complete by the time resctrl fs is mounted.
Now it looks like these modules can be loaded dynamically on resctrl mount
with the functions learning about PMT feature groups called right after.
Considering the change in timing, is it guaranteed that if
intel_pmt_get_regions_by_feature() is called right after module probe that
the telemetry driver would be done with its enumeration?
>
> Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
> ---
> arch/x86/kernel/cpu/resctrl/intel_aet.c | 93 ++++++++++++++++++++++---
> 1 file changed, 83 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> index 743a1894fe9a..14b472106f52 100644
> --- a/arch/x86/kernel/cpu/resctrl/intel_aet.c
> +++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> @@ -23,6 +23,7 @@
> #include <linux/intel_vsec.h>
> #include <linux/io.h>
> #include <linux/minmax.h>
> +#include <linux/module.h>
> #include <linux/printk.h>
> #include <linux/rculist.h>
> #include <linux/rcupdate.h>
> @@ -289,6 +290,9 @@ static enum pmt_feature_id lookup_pfid(const char *pfname)
> return FEATURE_INVALID;
> }
>
> +static struct pmt_feature_group *(*get_feature)(enum pmt_feature_id id);
> +static void (*put_feature)(struct pmt_feature_group *p);
> +
> /*
> * Request a copy of struct pmt_feature_group for each event group. If there is
> * one, the returned structure has an array of telemetry_region structures,
> @@ -300,7 +304,7 @@ static enum pmt_feature_id lookup_pfid(const char *pfname)
> * struct pmt_feature_group to indicate that its events are successfully
> * enabled.
> */
> -bool intel_aet_get_events(void)
> +static bool aet_get_events(void)
> {
> struct pmt_feature_group *p;
> enum pmt_feature_id pfid;
> @@ -309,14 +313,14 @@ bool intel_aet_get_events(void)
>
> for_each_event_group(peg) {
> pfid = lookup_pfid((*peg)->pfname);
> - p = intel_pmt_get_regions_by_feature(pfid);
> + p = get_feature(pfid);
> if (IS_ERR_OR_NULL(p))
> continue;
> if (enable_events(*peg, p)) {
> (*peg)->pfg = p;
> ret = true;
> } else {
> - intel_pmt_put_feature_group(p);
> + put_feature(p);
> }
> }
>
> @@ -325,27 +329,96 @@ bool intel_aet_get_events(void)
>
> void __exit intel_aet_exit(void)
> {
> - struct event_group **peg;
> +}
>
> - for_each_event_group(peg) {
> - if ((*peg)->pfg) {
> - intel_pmt_put_feature_group((*peg)->pfg);
> - (*peg)->pfg = NULL;
> - }
> +static bool get_pmt_references(void)
> +{
> + get_feature = symbol_request(intel_pmt_get_regions_by_feature);
> + if (!get_feature)
> + return false;
> + put_feature = symbol_request(intel_pmt_put_feature_group);
> + if (!put_feature) {
> + symbol_put(intel_pmt_get_regions_by_feature);
> + get_feature = NULL;
> + return false;
> + }
> +
> + return true;
> +}
> +
> +static void put_pmt_references(void)
> +{
> + if (get_feature) {
> + symbol_put(intel_pmt_get_regions_by_feature);
> + get_feature = NULL;
> + }
> + if (put_feature) {
> + symbol_put(intel_pmt_put_feature_group);
> + put_feature = NULL;
> }
> }
>
> +static enum {
> + AET_UNINITIALIZED,
> + AET_PRESENT,
> + AET_NOT_PRESENT
> +} aet_state;
> +
> bool intel_aet_pre_mount(void)
> {
> - return false; // Temporary stub
> + bool ret;
> +
> + if (aet_state == AET_PRESENT)
> + return true;
> +
> + if (aet_state == AET_NOT_PRESENT || !get_pmt_references())
> + return false;
> +
> + ret = aet_get_events();
> +
> + if (ret) {
> + aet_state = AET_PRESENT;
> + } else {
> + aet_state = AET_NOT_PRESENT;
> + put_pmt_references();
> + }
> +
> + return ret;
> +}
I think I am missing something here. This seems to build a new state
machine on top of existing state tracking.
What if, instead, just do:
aet_get_events()
{
for_each_event_group(peg) {
if ((*peg)->pfg) /* This means "AET_PRESENT" ? */
continue;
...
}
}
Similarly, get_feature and put_feature maintains state about
the symbols.
> +
> +static void aet_cleanup(void)
> +{
> + struct event_group **peg;
> +
> + if (aet_state == AET_PRESENT) {
I do not see why this check/state machine is needed, if enumeration was not
done the loop should just do nothing because (*peg)->pfg will always be NULL.
> + for_each_event_group(peg) {
> + if ((*peg)->pfg) {
> + struct event_group *e = *peg;
> +
> + for (int j = 0; j < e->num_events; j++)
> + resctrl_disable_mon_event(e->evts[j].id);
Can this be expected to clean up/reset all changes from enable_events()? For example,
how about event_group::num_rmid and rdt_resource::resctrl_mon::num_rmid?
> + put_feature((*peg)->pfg);
> + (*peg)->pfg = NULL;
> + }
> + }
> + put_pmt_references();
Similarly here, put_pmt_references() could just always be called and it will not
do anything if there aren't references?
> + aet_state = AET_UNINITIALIZED;
> + }
> }
>
> void intel_aet_mount_result(int ret)
> {
> + struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_PERF_PKG].r_resctrl;
> +
> + if (ret) {
> + r->mon_capable = false;
> + aet_cleanup();
> + }
> }
>
> void intel_aet_unmount(void)
> {
> + aet_cleanup();
> }
I am expecting intel_aet_mount_result() and intel_aet_unmount() to look the same and that
intel_aet_mount_result() thus becomes unnecessary.
Just setting r->mon_capable to false does not seem enough though ... how about all those
monitoring domains that were created? This flow looks unexpected to me, what am I missing?
Reinette