Re: [PATCH v1 06/20] x86/resctrl: Switch over to the resctrl mbps_val list

From: Reinette Chatre
Date: Fri Sep 17 2021 - 14:20:30 EST


Hi James,

On 9/17/2021 9:57 AM, James Morse wrote:
Hi Reinette,

On 01/09/2021 22:25, Reinette Chatre wrote:
On 7/29/2021 3:35 PM, James Morse wrote:
Updates to resctrl's software controller follow the same path as
other configuration updates, but they don't modify the hardware state.
rdtgroup_schemata_write() uses parse_line() and the resource's
ctrlval_parse function to stage the configuration.
resctrl_arch_update_domains() then updates the mbps_val[] array
instead, and resctrl_arch_update_domains() skips the rdt_ctrl_update()
call that would update hardware.

This complicates the interface between resctrl's filesystem parts
and architecture specific code. It should be possible for mba_sc
to be completely implemented by the filesystem parts of resctrl. This
would allow it to work on a second architecture with no additional code.

Change parse_bw() to write the configuration value directly to the
mba_sc[] array in the domain structure. Change rdtgroup_schemata_write()
to skip the call to resctrl_arch_update_domains(), meaning all the
mba_sc specific code in resctrl_arch_update_domains() can be removed.
On the read-side, show_doms() and update_mba_bw() are changed to read
the mba_sc[] array from the domain structure. With this,
resctrl_arch_get_config() no longer needs to consider mba_sc resources.

Change parse_bw() to write these values directly, meaning
rdtgroup_schemata_write() never needs to call update_domains()
for mba_sc resources.

The above paragraph seems to contain duplicate information from the paragraph that
precedes it.

Looks like two commit messages got combined. I've removed this and the paragraphs below,
as it's already covered.


Get show_doms() to test is_mba_sc() and retrieve the value
directly, instead of using get_config() for the hardware value.

This means the arch code's resctrl_arch_get_config() and
resctrl_arch_update_domains() no longer need to be aware of
mba_sc, and we can get rid of the update_mba_bw() code that
reaches into the hw_dom to get the msr value.

@@ -406,6 +406,14 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
        list_for_each_entry(s, &resctrl_schema_all, list) {
          r = s->res;
+
+        /*
+         * Writes to mba_sc resources update the software controller,
+         * not the control msr.
+         */
+        if (is_mba_sc(r))
+            continue;
+

A few resources can be updated in a single write to the schemata file. It is thus possible
to update the cache allocation resource as well as memory bandwidth allocation in a single
write.

i.e. echo "L3:0=7ff;1=7ff\nMB:0=100;1=50" > schemata

I do not think something like the above would show the issue. If you want to test this via the shell you need to use ANSI-C quoting. Adjusting what you show to something like:

echo -n $'L3:0=7ff;1=7ff\nMB:0=100;1=50\n'
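The quoting difference can be checked without touching resctrl at all. A minimal sketch, assuming bash and a scratch file from mktemp (the variable names and the scratch file are illustrative only, not part of the patch or of resctrl):

```shell
#!/usr/bin/env bash
# Sketch only: writes to a scratch file, never to a real schemata file.
tmp=$(mktemp)

# Double quotes: the backslash and the 'n' stay two literal characters,
# so the whole string is a single line.
printf '%s' "L3:0=7ff;1=7ff\nMB:0=100;1=50" > "$tmp"
plain_records=$(grep -c '' "$tmp")

# ANSI-C quoting: bash turns \n into a real newline before the write,
# so the buffer holds two separate lines.
printf '%s' $'L3:0=7ff;1=7ff\nMB:0=100;1=50\n' > "$tmp"
ansi_records=$(grep -c '' "$tmp")

echo "double-quoted: $plain_records line(s), ANSI-C quoted: $ansi_records line(s)"
rm -f "$tmp"
```

Only the ANSI-C quoted form produces a buffer with one line per resource, which is what a single multi-resource write needs.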

As I understand this change, in this scenario all configuration updates will be
skipped, not just the memory bandwidth allocation ones.

The loop is per-schema, so it's not a problem for L2/L3. This would only be a problem if
the is_mba_sc() resource had multiple schema. Only CDP does this, which the MBA controls
don't support.

The loop iterates through the entire buffer provided to the schemata file, and the buffer could contain multiple schema. This is perhaps more typical when interacting with the schemata file through an SDK.

Reinette