Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem

From: Moger, Babu

Date: Wed Apr 08 2026 - 19:09:00 EST


Hi Reinette,

On 4/8/2026 4:24 PM, Reinette Chatre wrote:
Hi Babu,

On 4/8/26 1:45 PM, Babu Moger wrote:
On 4/7/26 23:45, Reinette Chatre wrote:
On 4/7/26 6:01 PM, Babu Moger wrote:

That said, I’m open to not having a dedicated group if we can still support all the features that PLZA provides without it.

I find that enabling user space to share CLOSID/RMID between user space
and kernel space does indeed support what PLZA provides. I think I am missing
something here since the proposal below again attempts to isolate a resource group
(CLOSID) for kernel work.

No. I don't want to isolate a group just for PLZA. All I am saying
is, we should provide an option to create a dedicated group if the user
wants to do it.
I agree. I do not see resctrl needing to do anything to accomplish this though. If
the user wants a group dedicated to kernel mode/PLZA then all that is needed is for the
user not to assign any tasks to this group, either via changes to the group's tasks file
or via the group's cpus/cpus_list files.


The mode can simply be determined on a per-group basis. We can
introduce two new files—kernel_mode_cpus and
kernel_mode_cpus_list—within each resctrl group when kmode (or
PLZA) is supported.

I think having these files in every resource group is confusing since the user can only interact
with these files in one resource group for current PLZA. Why not *just* have the files in the
resource group that matches the group in info/kernel_mode_assignment?

The default group can also serve as the PLZA group.

#cat info/kernel_mode_assignment
//

At this point, the kmode_cpus/kmode_cpus_list files will exist in the default group.

Then user changes the PLZA group to "test".

#echo "test//" > info/kernel_mode_assignment

At this point, we expect the files "(kmode_cpus/kmode_cpus_list)" to be visible in "test//" group.

One open question is whether we should remove the visibility of these files from the default group. It’s unclear if we can safely do this dynamically.

An alternative approach would be to always keep the files present, but allow access to them only for groups that are listed in "info/kernel_mode_assignment".

The files appearing/disappearing is just how the user experiences the resctrl fs interface.
Within resctrl the files could indeed always exist but resctrl can use the kernfs_show()
API to show/hide them as needed. Similar to resctrl_bmec_files_show() that you created.
Allowing/removing access becomes complicated because user space can always do a chmod
to change permissions that resctrl would need to handle.

I do not know if there are sharp corners here when thinking about strange scenarios where
user opens a file before resctrl changes visibility or permissions and then user space
interacts with the file. This may be worthwhile to test no matter which mechanism is used.
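
The show/hide behaviour discussed above can be pictured with a small user space model. This is only a sketch and not kernel code: it does not call kernfs_show(), the class and method names are hypothetical, and it assumes the still-open question is settled so that the files are hidden in every group other than the one named in info/kernel_mode_assignment.

```python
# Toy model only: nothing here touches resctrl or kernfs.
KMODE_FILES = ("kernel_mode_cpus", "kernel_mode_cpus_list")


class ResctrlModel:
    """Tracks which group currently exposes the kernel-mode CPU files."""

    def __init__(self, groups):
        # "//" names the default group, as in the examples in this thread.
        self.groups = set(groups) | {"//"}
        self.assignment = "//"

    def write_kernel_mode_assignment(self, group):
        """Model of: echo <group> > info/kernel_mode_assignment"""
        if group not in self.groups:
            raise ValueError(f"unknown resource group: {group}")
        self.assignment = group

    def visible_files(self, group):
        """The files always exist in the model; only visibility changes."""
        return KMODE_FILES if group == self.assignment else ()


model = ResctrlModel(["test//"])
print(model.visible_files("//"))        # visibility in the default group
model.write_kernel_mode_assignment("test//")
print(model.visible_files("//"))        # default group after the switch
print(model.visible_files("test//"))    # "test//" after the switch
```

The model says nothing about a file that user space opened before the switch, which is the corner case raised above.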

Files and behavior:
- cpus / cpus_list:

CPUs listed here use the same allocation for both user and kernel space.

Both user and kernel space?

As it stands today, the CPU list is written to MSR_PQR_ASSOC, resulting in the same allocation for both user and kernel within a given CLOS.

Kernel-mode allocation changes only if specific CPUs are included in the kmode_cpus list.

ack.

There is no change to the current semantics of these files.
If these files are empty, the group effectively becomes a PLZA-dedicated group.

I do not see it this way. If the cpus/cpus_list files are empty then it means that the
tasks in the group will use their own CLOSID/RMID for user space allocation and
monitoring. What allocations/monitoring is used by tasks when in kernel mode depends
on whether the CPU the task is running on can be found in a kernel_mode_cpus/kernel_mode_cpus_list
file. If the CPU the task is running on can be found in a kernel_mode_cpus/kernel_mode_cpus_list
file then it will inherit whatever the PQR_PLZA setting of that CPU is, which is the allocation
associated with the resource group to which that kernel_mode_cpus/kernel_mode_cpus_list belongs.
If the CPU the task is running on cannot be found in kernel_mode_cpus/kernel_mode_cpus_list
then its kernel work will inherit its user space allocations and monitoring.
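
The lookup described in the paragraph above can be sketched as a small function. This is an illustration under stated assumptions, not the proposed implementation: the function name and the dictionary standing in for the per-group kernel_mode_cpus files are hypothetical, and group names such as "app" and "kgrp" are made up.

```python
def kernel_allocation_group(task_group, cpu, kernel_mode_cpus):
    """Return the name of the group whose allocation applies to kernel work.

    task_group:       group supplying the task's user space CLOSID/RMID
    cpu:              CPU the task is currently running on
    kernel_mode_cpus: dict mapping a group name to the set of CPUs listed in
                      that group's kernel_mode_cpus file (in-memory stand-in)
    """
    for group, cpus in kernel_mode_cpus.items():
        if cpu in cpus:
            # The CPU carries a kernel-mode setting: that group's allocation wins.
            return group
    # No kernel-mode setting for this CPU: kernel work inherits from user space.
    return task_group


files = {"kgrp": {0, 1}}
print(kernel_allocation_group("app", 1, files))  # CPU 1 is listed under "kgrp"
print(kernel_allocation_group("app", 2, files))  # CPU 2 is not listed anywhere
```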


Yes, that is correct. I think our understanding is correct, but it seems our implementation ideas are different.

While we have been sharing different ideas I have tried to be clear on *why* I made
certain choices and attempted to provide specific feedback to your ideas. If you find
your plan to be better then please respond to my feedback about it to help me understand
why that may be the better solution. If you find your solution is better then could you please
describe it with detail? At this time I do not have a clear understanding of what you propose.

...

Let me make sure I understand what you mentioned earlier. I copied the text below from the thread for context:

https://lore.kernel.org/lkml/3305c18e-9e50-4df0-b9f1-c61028628967@xxxxxxxxx/
=====================================================================

Please consider the intent of this file when thinking about names. The idea is that "info/kernel_mode"
specifies the "mode" of how kernel work is handled and it determines the configuration files used in that
mode as well as the syntax when interacting with those files. By renaming "kernel_mode_assignment" to
"kmode_groups" it implicitly requires all future kernel mode enhancements to need some data related to "groups".

In summary, I think this can be simplified by introducing just two new files in info/ that enable the
user to (a) select and (b) configure the "kernel mode". To start there can be just two modes,
global_assign_ctrl_inherit_mon_per_cpu and global_assign_ctrl_assign_mon_per_cpu.
global_assign_ctrl_inherit_mon_per_cpu mode requires a control group in kernel_mode_assignment while
global_assign_ctrl_assign_mon_per_cpu requires a control and monitoring group.

The resource group in info/kernel_mode_assignment gets two additional files "kernel_mode_cpus" and
"kernel_mode_cpus_list" that contain the CPUs enabled with the kernel mode configuration; by default
this will be all online CPUs. The resource group can continue to be used to manage allocations of and
monitor user space tasks. Specifically, the "cpus", "cpus_list", and "tasks" files remain.

A user wanting just "global" settings will get just that when writing the group to
info/kernel_mode_assignment. A user wanting "per CPU" settings can follow the
info/kernel_mode_assignment setting with changes to that resource group's kernel_mode_cpus/kernel_mode_cpus_list
files. Any task running on a CPU that is *not* in kernel_mode_cpus/kernel_mode_cpus_list can be
expected to inherit both CLOSID and RMID from user space for all kernel work.

======================================================================
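
One way to read the two modes quoted above is sketched below. This is an interpretation rather than a confirmed design: it assumes, from the mode names, that "assign_ctrl" takes the CLOSID from the group in info/kernel_mode_assignment, that "inherit_mon" keeps the task's user space RMID, and that "assign_mon" takes the RMID from that group as well. The Ids type, the function name, and the numeric IDs are hypothetical.

```python
from typing import NamedTuple


class Ids(NamedTuple):
    closid: int
    rmid: int


ASSIGN_CTRL = "global_assign_ctrl_inherit_mon_per_cpu"
ASSIGN_CTRL_MON = "global_assign_ctrl_assign_mon_per_cpu"


def kernel_mode_ids(mode, user, assigned, cpu, kernel_mode_cpus):
    """Return the IDs used for kernel work on `cpu` under this reading.

    user:             IDs the task uses in user space
    assigned:         IDs of the group in info/kernel_mode_assignment
    kernel_mode_cpus: set of CPUs listed in that group's kernel_mode_cpus
    """
    if cpu not in kernel_mode_cpus:
        # Per the quoted text: inherit both CLOSID and RMID from user space.
        return user
    if mode == ASSIGN_CTRL:
        # Assumed reading: control is assigned, monitoring is inherited.
        return Ids(assigned.closid, user.rmid)
    if mode == ASSIGN_CTRL_MON:
        # Assumed reading: both control and monitoring are assigned.
        return Ids(assigned.closid, assigned.rmid)
    raise ValueError(f"unsupported kernel mode: {mode}")


user = Ids(closid=3, rmid=7)        # made-up values
assigned = Ids(closid=1, rmid=2)    # made-up values
print(kernel_mode_ids(ASSIGN_CTRL, user, assigned, 0, {0, 1}))
print(kernel_mode_ids(ASSIGN_CTRL_MON, user, assigned, 0, {0, 1}))
print(kernel_mode_ids(ASSIGN_CTRL, user, assigned, 5, {0, 1}))
```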

Let me try to get a few clarifications here.

# cat info/kernel_mode
  [inherit_ctrl_and_mon]
  global_assign_ctrl_inherit_mon_per_cpu
  global_assign_ctrl_assign_mon_per_cpu

My understanding of "inherit_ctrl_and_mon" is that the kernel
inherits both the CLOS and the RMID from user space. Basically, both
user and kernel use the same CLOSID and RMID. This reflects the current
behavior (without PLZA), correct? This would correspond to the

Correct.

default group when resctrl is mounted.


The modes "global_assign_ctrl_inherit_mon_per_cpu" and "global_assign_ctrl_assign_mon_per_cpu" represent the actual PLZA modes.

Both of these modes introduce new files kernel_mode_cpus and kernel_mode_cpus_list in the resctrl group.

Right. To be specific, when the user changes the mode to either "global_assign_ctrl_inherit_mon_per_cpu" or
"global_assign_ctrl_assign_mon_per_cpu" the new files will be created in the default resource group with
the associated setting applied globally at that time.

That would be the case if, at that point, "info/kernel_mode_assignment" points to // (the default group). Is that correct?

And if "info/kernel_mode_assignment" points to a different group (for example, test//), then the kernel_mode_cpus and kernel_mode_cpus_list files will be created only under the test// group. Is that correct?

Thanks
Babu