Re: [PATCH v19 14/20] x86/resctrl: Fill out rmid_read structure for smp_call*() to read a counter

From: Reinette Chatre
Date: Thu May 30 2024 - 16:25:59 EST


Hi Tony,

On 5/28/24 3:19 PM, Tony Luck wrote:
mon_event_read() fills out most fields of the struct rmid_read that is
passed via an smp_call*() function to a CPU that is part of the correct
domain to read the monitor counters.

The one exception is the sumdomains field that is set by the caller
(either rdtgroup_mondata_show() or mon_add_all_files()).

When rmid_read.sumdomains is false:
The domain field "d" specifies the only domain to read
CPU to execute is chosen from d->hdr.cpu_mask

When rmid_read.sumdomains is true:
The domain field is NULL.
The cache_info field indicates that all domains
that are part of that cache instance should be
summed.
CPU to execute is chosen from ci->shared_cpu_mask

Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 28 ++++++++++++++++++-----
1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 3b9383612c35..4e394400e575 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -517,6 +517,7 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
int evtid, int first)
{
+ cpumask_t *cpumask;
int cpu;
/* When picking a CPU from cpu_mask, ensure it can't race with cpuhp */
@@ -537,7 +538,8 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
return;
}
- cpu = cpumask_any_housekeeping(&d->hdr.cpu_mask, RESCTRL_PICK_ANY_CPU);
+ cpumask = rr->sumdomains ? &rr->ci->shared_cpu_map : &d->hdr.cpu_mask;
+ cpu = cpumask_any_housekeeping(cpumask, RESCTRL_PICK_ANY_CPU);
/*
* cpumask_any_housekeeping() prefers housekeeping CPUs, but
@@ -546,7 +548,7 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
* counters on some platforms if its called in IRQ context.
*/
if (tick_nohz_full_cpu(cpu))
- smp_call_function_any(&d->hdr.cpu_mask, mon_event_count, rr, 1);
+ smp_call_function_any(cpumask, mon_event_count, rr, 1);
else
smp_call_on_cpu(cpu, smp_mon_event_count, rr, false);

Why not provide the cpumask as parameter to mon_event_read() as I asked
about in previous version (again feedback that was totally ignored)? With
this implementation there is a portion of SNC logic in rdtgroup_mondata_show()
and another portion of logic in mon_event_read(). Scattering SNC quirk logic like
this makes this very hard to understand.

To help with understanding this code I asked _twice_ ([1] and [2]) to
_at_least_ provide comments for these SNC branches but even a request for comments
is totally ignored. I even provided some comments based on my understanding
that could have just been copied&pasted (if it was correct), but that was ignored
also. I understand this work has been taking a while and to support this I am
trying to spend more time to review and provide more detailed feedback but to
find it just to be ignored over and over is extremely frustrating and wasting
so much of my time. I do not expect that you do everything as I propose but if
I propose something silly then please point it out so that I can learn? At this
point I am convinced that you find my feedback not worth responding
to leaving us stuck with me who keep trying to review your work and getting
ignored over and over in every new version.

What should I do Tony? Why should I keep reviewing this work? Asking to address
review feedback should not be necessary yet I seem to keep doing it. An attempt
at an ultimatum was futile since it was just dodged [3] with burden placed right
back on me.

@@ -575,15 +577,29 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
resid = md.u.rid;
domid = md.u.domid;
evtid = md.u.evtid;
-
+ rr.sumdomains = md.u.sum;
r = &rdt_resources_all[resid].r_resctrl;
- hdr = rdt_find_domain(&r->mon_domains, domid, NULL);
- if (!hdr || WARN_ON_ONCE(hdr->type != RESCTRL_MON_DOMAIN)) {
+
+ if (rr.sumdomains) {

/* Explain what this does and why */

+ list_for_each_entry(d, &r->mon_domains, hdr.list) {
+ if (d->ci->id == domid) {
+ rr.ci = d->ci;
+ d = NULL;
+ goto got_cacheinfo;
+ }
+ }
ret = -ENOENT;
goto out;
+ } else {
+ hdr = rdt_find_domain(&r->mon_domains, domid, NULL);
+ if (!hdr || WARN_ON_ONCE(hdr->type != RESCTRL_MON_DOMAIN)) {
+ ret = -ENOENT;
+ goto out;
+ }
+ d = container_of(hdr, struct rdt_mon_domain, hdr);
}
- d = container_of(hdr, struct rdt_mon_domain, hdr);
+got_cacheinfo:

Something like "read_event" would be more appropriate since "got_cacheinfo"
has no relevance to non-SNC.

mon_event_read(&rr, r, d, rdtgrp, evtid, false);
if (rr.err == -EIO)

Reinette

[1] https://lore.kernel.org/lkml/2a7cc72a-99fb-4862-b7eb-da3d515f0453@xxxxxxxxx/
[2] https://lore.kernel.org/lkml/ccd5df99-4d7e-4224-a07e-3897e370b53e@xxxxxxxxx/
[3] https://lore.kernel.org/lkml/41cc8a35-81fb-b890-f963-8dc9524a54b0@xxxxxxxxx/