Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem

From: Reinette Chatre

Date: Mon Apr 20 2026 - 23:24:10 EST


Hi Babu,

On 4/20/26 5:40 PM, Moger, Babu wrote:
>
> We already discussed moving back to the default group on every mode
> switch. Doing so here would once again cause extra MSR writes on
> each mode transition, which is undesirable.
>

Needing to avoid extra MSR writes in resctrl is not so absolute. Consider, for
example, how resctrl initializes default allocations when a new resource group is
created. resctrl aims to initialize with sane defaults and the user is expected to
follow with desired allocations.

I am not against optimizing; I just want to be careful with such general statements.

Considering your proposal in https://lore.kernel.org/lkml/39e0c786-cc35-4555-bfb9-ff7cd758c423@xxxxxxx/:

I do not think we should make info/kernel_mode read-only. If I understand correctly,
doing so would accommodate AMD PLZA but it ignores the discussions on how resctrl could
support MPAM ... or do you perhaps have a suggestion for how MPAM can be supported under
your proposal? Even if you do not want to consider MPAM - what if the PLZA_PQR register's
scope becomes per-CPU in the next version of AMD PLZA?

The idea behind info/kernel_mode is that the active mode it identifies indicates which
configuration files exist to configure the active mode. Since the mode may not always
depend on global configuration, for which info/kernel_mode_assignment was created, but may
instead rely on per-resource group files, I do not see how resctrl can build on a read-only
info/kernel_mode backed by a mode and group change via info/kernel_mode_assignment.
Specifically, MPAM support may not use info/kernel_mode_assignment at all.
Instead, MPAM may use something like what is described in https://lore.kernel.org/lkml/aYyxAPdTFejzsE42@xxxxxxxxxxxxxxx/

Could we perhaps consider dropping info/kernel_mode_assignment entirely for
AMD PLZA's global allocations? Similar to what you suggest, the mode and
group assignment could be done via the info/kernel_mode file instead?

Thinking about this more: since the CPU allocation is global, the CPUs could *theoretically*
be included as well (but see later).
This could mean that "kernel_mode_cpus" and "kernel_mode_cpus_list" could be dropped?
Although, this may complicate the interface since user space may want a convenient way
to modify just CPUs independently from needing to repeat the mode and group every time.

Consider, for example:

# echo "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/;cpus_list=5-8" > info/kernel_mode

Having named fields (a) makes this extensible, (b) means output does not need to be split
among files, and (c) allows "inherit_ctrl_and_mon" to continue to be supported.
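
To make the syntax above concrete, here is a rough user-space sketch in Python (not
kernel code) of how such a string could be split into a mode and its named fields.
The function name parse_kernel_mode() and the error handling are assumptions made
for illustration; only the example string comes from the text above.

```python
def parse_kernel_mode(line):
    """Split '<mode>[:key=value[;key=value...]]' into (mode, fields).

    Hypothetical helper: illustrates the named-field idea only, it is not
    how resctrl parses anything today.
    """
    mode, _, rest = line.strip().partition(":")
    fields = {}
    # Skip empty items so a mode without fields, or a trailing ';', is accepted.
    for item in filter(None, rest.split(";")):
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed field: {item!r}")
        if key in fields:
            raise ValueError(f"duplicate field: {key!r}")
        fields[key] = value
    return mode, fields


mode, fields = parse_kernel_mode(
    "global_assign_ctrl_assign_mon_per_cpu:group=ctrl1/mon1/;cpus_list=5-8")
```

A mode written without any fields, for example "inherit_ctrl_and_mon", parses to an
empty field set, so the same format can cover modes that take no properties.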

The named fields could be made optional: if group is omitted then it defaults to the
default resource group, and if cpus/cpus_list is omitted then it defaults to all CPUs.
This may not be intuitive since a user may expect that not mentioning a field means
that the field is left untouched. Have you considered this scenario in your proposal?
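
The difference between the two readings of an omitted field can be sketched as
follows. This is only an illustration in Python: the placeholder values
DEFAULT_GROUP and ALL_CPUS and both function names are hypothetical and do not
correspond to anything in resctrl.

```python
# Hypothetical placeholders, not real resctrl names or values.
DEFAULT_GROUP = "<default group>"
ALL_CPUS = "<all cpus>"


def apply_reset_on_omit(current, written):
    """Omitted fields fall back to defaults; the current state is ignored."""
    return {
        "group": written.get("group", DEFAULT_GROUP),
        "cpus_list": written.get("cpus_list", ALL_CPUS),
    }


def apply_keep_on_omit(current, written):
    """Omitted fields keep their current value (what a user may expect)."""
    return {**current, **written}


current = {"group": "ctrl1/mon1/", "cpus_list": "5-8"}
written = {"cpus_list": "0-3"}  # user only intends to change the CPUs

reset = apply_reset_on_omit(current, written)
keep = apply_keep_on_omit(current, written)
```

With the first reading the group silently moves to the default resource group even
though the user only wrote the CPUs; with the second the group stays "ctrl1/mon1/".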

As an alternative, the group could be made a required field and "kernel_mode_cpus"/"kernel_mode_cpus_list"
can stay? This may be the simplest approach.

Output could still use [] to indicate the active mode that includes its properties.
I find an interface where output more closely matches input to be more intuitive.

Reinette