Re: [PATCH v19 16/20] x86/resctrl: Make resctrl_arch_rmid_read() handle sum over domains

From: Reinette Chatre
Date: Fri Jun 07 2024 - 17:08:56 EST


Hi Tony,

On 6/7/24 12:51 PM, Luck, Tony wrote:
Hi Tony,

On 6/3/24 4:15 PM, Tony Luck wrote:
On Thu, May 30, 2024 at 01:24:57PM -0700, Reinette Chatre wrote:
On 5/28/24 3:20 PM, Tony Luck wrote:
...

+
+ return 0;
+}
+
+int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
+ u32 unused, u32 rmid, enum resctrl_event_id eventid,
+ u64 *val, bool sum, struct cacheinfo *ci, void *ignored)

This is not architecture specific code.

Can you explain further? I've dropped the "sum" argument. As you pointed
out elsewhere this can be inferred from "d == NULL". But I do need the
cacheinfo information in resctrl_arch_rmid_read() to:
1) determine which domains to sum (those that match ci->id).
2) sanity check code is executing on a CPU in ci->shared_cpu_map.


"resctrl_arch_*" is the prefix of functions needed to be implemented
by every architecture. As I understand there is nothing architecture
specific about what this function does and every architecture's function
would thus end up looking identical. I expected the cacheinfo
information to be available from all architectures. If this is not
the case then it does not belong in struct rdt_mon_domain but should
instead be moved to struct rdt_hw_mon_domain ... but since cacheinfo
has already made its way into the filesystem code it is not clear
to me how you envision the arch/fs split.

Hi Reinette,

Files in resctrl that sum over resources are going to be a necessary feature

Sum over resources? That is something entirely different from SNC, no?

for backwards compatibility. I'm doing it for the first time here for SNC, but
I know of another platform topology change on the horizon that could also
benefit from this.

Looking at the end-point of the James Morse/Dave Martin patch series to
split out the arch independent layer to fs/resctrl/ I see that fs/monitor.c
makes calls to resctrl_arch_rmid_read(). The x86 version of this remains
in arch/x86/kernel/cpu/resctrl/monitor.c (I don't see an MPAM version).

James already added two arguments that MPAM needs and x86 doesn't
(hence "u32 unused" and "void *ignored" in the argument list). I confess
that my thought had been "If he can pad out the argument list for MPAM,
then I can do it too for x86". But that leads to madness, so time to reconsider.

I can see a couple of paths.

1) MPAM/others will also want to have files that sum things, so maybe they want
an extra argument that shows what to add up. Though even if they do, their
requirement may not be met by a "cacheinfo" pointer.

2) Only x86 (Intel) will use this. Maybe in this case the answer is to do some
renaming so the "void *unused" argument can be used to pass architecture
specific information.

By creating the new "sub" monitor files the sum of data has become a feature
of resctrl fs.

By including a pointer to struct cacheinfo in struct rdt_mon_domain as well as
struct rmid_read it surely is not just for Intel. If you want to make this just for
Intel then the whole solution needs to be changed.


Sketch for option #2. Currently the code does:
---------------------

That void *argument is currently supplied by a call. E.g.

void *arch_mon_ctx;

arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, QOS_L3_OCCUP_EVENT_ID);


resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid,
QOS_L3_OCCUP_EVENT_ID, &val,
arch_mon_ctx);

resctrl_arch_mon_ctx_free(r, QOS_L3_OCCUP_EVENT_ID, arch_mon_ctx);

The x86 version of resctrl_arch_mon_ctx_alloc() just does "might_sleep(); return NULL;"
and resctrl_arch_mon_ctx_free() does nothing.

New version makes these changes:
---------------------

Add rdt_mon_domain * as a new argument to resctrl_arch_mon_ctx_alloc() (which MPAM can
ignore).

x86 alloc function becomes:

void *resctrl_arch_mon_ctx_alloc(struct rdt_resource *r, struct rdt_mon_domain *d, int evtid)
{
might_sleep();

return d->ci;
}

resctrl_arch_mon_ctx_free() remains an empty stub.


Is this a reasonable way to split the independent fs layer code from the architecture specific?

Why is this necessary at all? The new variant of resctrl_arch_rmid_read() introduced in this
patch does not contain anything that is architecture specific and thus it is filesystem code.
It is this code that uses the information in struct cacheinfo to set the correct domain, if
needed. In this patch you rewrite resctrl_arch_rmid_read() as something architecture specific
but I cannot see what is architecture specific about it at all. Why not just call this new
function resctrl_rmid_read() that stays in filesystem code and then what you have in this
patch as resctrl_arch_rmid_read_one() should be what is already known as the architecture
specific resctrl_arch_rmid_read(). It is the architecture specific RMID read function that
does not need struct cacheinfo.

Reinette