Re: [PATCH v13 11/32] x86,fs/resctrl: Handle events that can be read from any CPU
From: Chen, Yu C
Date: Thu Oct 30 2025 - 12:18:30 EST
On 10/30/2025 11:54 PM, Luck, Tony wrote:
On Thu, Oct 30, 2025 at 02:14:27PM +0800, Chen, Yu C wrote:
Hi Tony,
On 10/30/2025 12:20 AM, Tony Luck wrote:
resctrl assumes that monitor events can only be read from a CPU in the
cpumask_t set of each domain. This is true for x86 events accessed
with an MSR interface, but may not be true for other access methods such
as MMIO.
Introduce and use flag mon_evt::any_cpu, settable by architecture, that
indicates there are no restrictions on which CPU can read that event.
Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
[snip]
-void resctrl_enable_mon_event(enum resctrl_event_id eventid)
+void resctrl_enable_mon_event(enum resctrl_event_id eventid, bool any_cpu)
{
if (WARN_ON_ONCE(eventid < QOS_FIRST_EVENT || eventid >= QOS_NUM_EVENTS))
return;
@@ -984,6 +984,7 @@ void resctrl_enable_mon_event(enum resctrl_event_id eventid)
return;
}
+ mon_event_all[eventid].any_cpu = any_cpu;
mon_event_all[eventid].enabled = true;
}
It seems that cpu_on_correct_domain() was dropped because the refactor
of __mon_event_count() in patch 0006 means it is no longer needed. But
we still invoke smp_processor_id() in preemptible context in
__l3_mon_event_count() before any further checks, which triggers a
warning:
[ 4266.361951] BUG: using smp_processor_id() in preemptible [00000000] code: grep/1603
[ 4266.363231] caller is __l3_mon_event_count+0x30/0x2a0
[ 4266.364250] Call Trace:
[ 4266.364262] <TASK>
[ 4266.364273] dump_stack_lvl+0x53/0x70
[ 4266.364289] check_preemption_disabled+0xca/0xe0
[ 4266.364303] __l3_mon_event_count+0x30/0x2a0
[ 4266.364320] mon_event_count+0x22/0x90
[ 4266.364334] rdtgroup_mondata_show+0x108/0x390
[ 4266.364353] seq_read_iter+0x10d/0x450
[ 4266.364368] vfs_read+0x215/0x330
[ 4266.364386] ksys_read+0x6b/0xe0
[ 4266.364401] do_syscall_64+0x57/0xd70
I didn't notice this in my testing. Is this in your region aware
tree? If you are still using RDT_RESOURCE_L3 then I can see how
you got this call trace.
Yes, it was tested on the region aware tree.
Maybe you need to dig cpu_on_correct_domain() back up and apply
it to __l3_mon_event_count()?
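For illustration only, here is a user-space model of the ordering such a check
would need: test the any_cpu flag first, and ask for the current CPU only when
the event is restricted to a domain. This is a sketch under stated assumptions,
not the kernel helper. Every model_* name is hypothetical, a plain 64-bit mask
stands in for cpumask_t, and model_processor_id() stands in for
smp_processor_id().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not the kernel's types. */
struct model_event {
	bool any_cpu;		/* models mon_evt::any_cpu from the patch */
};

struct model_domain {
	uint64_t cpu_mask;	/* bit N set => CPU N belongs to this domain */
};

/* The "current CPU" of the model, plus a counter of how often it was queried. */
static int model_current_cpu;
static int model_cpu_queries;

/* Stand-in for smp_processor_id(); counts calls so the ordering is observable. */
static int model_processor_id(void)
{
	model_cpu_queries++;
	return model_current_cpu;
}

/*
 * Return true if the event may be read from the current CPU.
 * The any_cpu test comes first, so an unrestricted event never
 * queries the current CPU at all.
 */
static bool model_cpu_on_correct_domain(const struct model_event *evt,
					const struct model_domain *dom)
{
	int cpu;

	assert(evt && dom);

	if (evt->any_cpu)
		return true;

	cpu = model_processor_id();
	if (cpu < 0 || cpu >= 64)
		return false;

	return (dom->cpu_mask >> cpu) & 1u;
}
```

In the model, an event with any_cpu set returns true without ever calling
model_processor_id(), which is the property that would avoid the preemptible
smp_processor_id() splat in the trace above.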
Got it, will do.
Thanks,
Chenyu