Re: [PATCH v20 14/18] x86/resctrl: Fill out rmid_read structure for smp_call*() to read a counter

From: Reinette Chatre
Date: Thu Jun 20 2024 - 17:31:35 EST


Hi Tony,

On 6/10/24 11:35 AM, Tony Luck wrote:
mon_event_read() fills out most fields of the struct rmid_read that is
passed via an smp_call*() function to a CPU that is part of the correct
domain to read the monitor counters.

With Sub-NUMA Cluster (SNC) mode there are now two cases to handle:

1) Reading a file that returns a value for a single domain.
+ Choose the CPU to execute from the domain cpu_mask

2) Reading a file that must sum across domains sharing an L3 cache
instance.
+ Indicate to called code that a sum is needed by passing a NULL
rdt_mon_domain pointer.
+ Choose the CPU from the L3 shared_cpu_map.

Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>

The kerneldoc in patch #9 introduces new requirements related to
struct rmid_read wrt how NULL values are interpreted and
used. This makes it essential that struct rmid_read is always initialized
correctly and should no longer consist of whatever is on the stack. I
mentioned in response to v19 that static checkers found issues here.
I understand that mbm_update() always sets the domain in
struct rmid_read, but I do not find it acceptable that it
passes garbage as the cacheinfo pointer based on subtle assumptions on
when/how __mon_event_count() uses this field.

---
arch/x86/kernel/cpu/resctrl/internal.h | 2 +-
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 39 ++++++++++++++++++-----
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
3 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 77da29ced7eb..75bb1afc4842 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -627,7 +627,7 @@ void mon_event_count(void *info);
int rdtgroup_mondata_show(struct seq_file *m, void *arg);
void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
- int evtid, int first);
+ cpumask_t *cpumask, int evtid, int first);
void mbm_setup_overflow_handler(struct rdt_mon_domain *dom,
unsigned long delay_ms,
int exclude_cpu);
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 3b9383612c35..5a43931fd423 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -515,7 +515,7 @@ static int smp_mon_event_count(void *arg)
void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
- int evtid, int first)
+ cpumask_t *cpumask, int evtid, int first)
{
int cpu;
@@ -537,7 +537,7 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
return;
}
- cpu = cpumask_any_housekeeping(&d->hdr.cpu_mask, RESCTRL_PICK_ANY_CPU);
+ cpu = cpumask_any_housekeeping(cpumask, RESCTRL_PICK_ANY_CPU);
/*
* cpumask_any_housekeeping() prefers housekeeping CPUs, but
@@ -546,7 +546,7 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
* counters on some platforms if its called in IRQ context.
*/
if (tick_nohz_full_cpu(cpu))
- smp_call_function_any(&d->hdr.cpu_mask, mon_event_count, rr, 1);
+ smp_call_function_any(cpumask, mon_event_count, rr, 1);
else
smp_call_on_cpu(cpu, smp_mon_event_count, rr, false);
@@ -575,16 +575,39 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
resid = md.u.rid;
domid = md.u.domid;
evtid = md.u.evtid;
-
r = &rdt_resources_all[resid].r_resctrl;
- hdr = rdt_find_domain(&r->mon_domains, domid, NULL);
- if (!hdr || WARN_ON_ONCE(hdr->type != RESCTRL_MON_DOMAIN)) {
+
+ if (md.u.sum) {
+ /*
+ * This file requires summing across all SNC domains that share
+ * the L3 cache id that was provided in the "domid" field of the
+ * mon_data_bits union. Search all domains in the resource for
+ * one that matches this cache id.
+ */
+ list_for_each_entry(d, &r->mon_domains, hdr.list) {
+ if (d->ci->id == domid) {
+ rr.ci = d->ci;
+ mon_event_read(&rr, r, NULL, rdtgrp, &d->ci->shared_cpu_map, evtid, false);

Please split this line, it is over 100

Reinette