Re: [PATCH v2 00/16] fs,x86/resctrl: Add kernel-mode (e.g., PLZA) support to the resctrl subsystem
From: Babu Moger
Date: Thu Mar 26 2026 - 13:16:13 EST
Hi Reinette,
Thanks for the review comments. Will address one by one.
On 3/24/26 17:51, Reinette Chatre wrote:
Hi Babu,
On 3/12/26 1:36 PM, Babu Moger wrote:
This series adds support for Privilege-Level Zero Association (PLZA) to the
resctrl subsystem. PLZA is an AMD feature that allows specifying a CLOSID
and/or RMID for execution in kernel mode (privilege level zero), so that
kernel work is not subject to the same resource constraints as the current
user-space task. This avoids kernel operations being aggressively throttled
when a task's memory bandwidth is heavily limited.
The feature documentation is not yet publicly available, but it is expected
to be released in the next few weeks. In the meantime, a brief description
of the features is provided below.
Privilege Level Zero Association (PLZA)
Privilege Level Zero Association (PLZA) allows the hardware to
automatically associate execution in Privilege Level Zero (CPL=0) with a
specific COS (Class of Service) and/or RMID (Resource Monitoring
Identifier). The QoS feature set already has a mechanism to associate
execution on each logical processor with an RMID or COS. PLZA allows the
system to override this per-thread association for a thread that is
executing with CPL=0.
------------------------------------------------------------------------
The series introduces the feature in a way that supports the interface in
a generic manner to accommodate MPAM or other vendor-specific implementations.
Below are the detailed requirements provided by Reinette:
https://lore.kernel.org/lkml/2ab556af-095b-422b-9396-f845c6fd0342@xxxxxxxxx/
Our discussion considered how resctrl could support PLZA in a generic way while
also preparing to support MPAM's variants and how PLZA may evolve to have similar
capabilities when considering the capabilities of its registers.
This does not mean that your work needs to implement everything that was discussed.
Instead, this work is expected to just support what PLZA is capable of today but
do so in a way that the future enhancements could be added to.
This series is quite difficult to follow since it appears to implement a full
featured generic interface while PLZA cannot take advantage of it.
Could you please simplify this work to focus on just enabling PLZA and only
add interfaces needed to do so?
Sure. Will try. Let's continue the discussion.
Summary:
1. Kernel-mode/PLZA controls and status should be exposed under the resctrl
info directory: /sys/fs/resctrl/info/, not as a separate or arch-specific path.
2. Add two info files
a. kernel_mode
Purpose: Control how resource allocation and monitoring apply in kernel mode
(e.g. inherit from task vs global assign).
Read: List supported modes and show current one (e.g. with [brackets]).
Write: Set current mode by name (e.g. inherit_ctrl_and_mon, global_assign_ctrl_assign_mon).
b. kernel_mode_assignment
Purpose: When a “global assign” kernel mode is active, specify which resctrl group
(CLOSID/RMID) is used for kernel work.
Read: Show the assigned group in a path-like form (e.g. //, ctrl1//, ctrl1/mon1/).
Write: Assign or clear the group used for kernel mode (and optionally clear with an empty write).
The patches are based on top of commit (v7.0.0-rc3)
839e91ce3f41b (tip/master) Merge branch into tip/master: 'x86/tdx'
------------------------------------------------------------------------
Examples: kernel_mode and kernel_mode_assignment
All paths below are under /sys/fs/resctrl/ (e.g. info/kernel_mode means
/sys/fs/resctrl/info/kernel_mode). Resctrl must be mounted and the platform
must support the relevant modes (e.g. AMD with PLZA).
1) kernel_mode - show and set the current kernel mode
Read supported modes and which one is active (current in brackets):
$ cat info/kernel_mode
[inherit_ctrl_and_mon]
global_assign_ctrl_inherit_mon
global_assign_ctrl_assign_mon
Set the active mode (e.g. use one CLOSID+RMID for all kernel work):
$ echo "global_assign_ctrl_assign_mon" > info/kernel_mode
$ cat info/kernel_mode
inherit_ctrl_and_mon
global_assign_ctrl_inherit_mon
[global_assign_ctrl_assign_mon]
Mode meanings:
- inherit_ctrl_and_mon: kernel uses same CLOSID/RMID as the current task (default).
- global_assign_ctrl_inherit_mon: one CLOSID for all kernel work; RMID inherited from user.
- global_assign_ctrl_assign_mon: one resource group (CLOSID+RMID) for all kernel work.
2) kernel_mode_assignment - show and set which group is used for kernel work
Only relevant when kernel_mode is not "inherit_ctrl_and_mon". Read the
currently assigned group (path format is "CTRL_MON/MON/"):
To help with future usages please connect visibility of this file with the mode in
info/kernel_mode. This helps us to support future modes with other resctrl files, possibly
within each resource group.
Specifically, kernel_mode_assignment is not visible to user space if mode is "inherit_ctrl_and_mon",
while it is visible when mode is global_assign_ctrl_inherit_mon or global_assign_ctrl_assign_mon.
Sure. Will do.
The format depends on the mode, right? If the mode is "global_assign_ctrl_inherit_mon"
then it should only contain a control group, alternatively, if the mode is
"global_assign_ctrl_assign_mon" then it contains control and mon group. This gives
resctrl future flexibility to change format for future modes.
This can be done both ways. The whole purpose of these groups is to get the CLOSID and RMID needed to enable PLZA. The user can echo a CTRL_MON or MON group to kernel_mode_assignment in any of the modes. We can decide what needs to be updated in the MSR (PQR_PLZA_ASSOC) based on which kernel mode is selected.
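To make the mode-dependent format concrete, here is a minimal user-space sketch in Python. It assumes the read format is "CTRL/" in global_assign_ctrl_inherit_mon mode and "CTRL_MON/MON/" in global_assign_ctrl_assign_mon mode, with an empty name standing for the default group. parse_assignment() is a hypothetical helper for illustration only; it is not part of resctrl or of this series.

```python
CTRL_ONLY = "global_assign_ctrl_inherit_mon"
CTRL_AND_MON = "global_assign_ctrl_assign_mon"


def parse_assignment(mode, text):
    """Split the content of info/kernel_mode_assignment for the given mode.

    Returns (ctrl_group, mon_group). An empty string means the default
    group; mon_group is None when the mode carries no monitor group.
    """
    parts = text.strip().split("/")
    if mode == CTRL_ONLY:
        # Assumed format "CTRL/": one name followed by a trailing slash.
        if len(parts) != 2 or parts[1] != "":
            raise ValueError("expected 'CTRL/' in mode %s" % mode)
        return parts[0], None
    if mode == CTRL_AND_MON:
        # Assumed format "CTRL_MON/MON/": two names and a trailing slash.
        if len(parts) != 3 or parts[2] != "":
            raise ValueError("expected 'CTRL_MON/MON/' in mode %s" % mode)
        return parts[0], parts[1]
    raise ValueError("no kernel-mode assignment in mode %s" % mode)
```

With this split, a tool can reject "ctrl1//" when the mode only carries a control group instead of silently ignoring the monitor part.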
We should also consider the scenario when it is a "monitoring only" system, which can
happen independent from what hardware actually supports, for example, if user boots
with "rdt=!l3cat,!l2cat,!mba,!smba". In this case I assume CLOS should just always be
zero and thus only "default control group" is accepted?
Yes. It depends on how we want to implement it, as mentioned above.
Yes. We can do that.
$ cat info/kernel_mode_assignment
//
"//" means the default CTRL_MON group is assigned. Assign a specific
group instead (e.g. a CTRL_MON group "ctrl1", or a MON group "mon1" under it):
$ echo "ctrl1//" > info/kernel_mode_assignment
$ cat info/kernel_mode_assignment
ctrl1//
$ echo "ctrl1/mon1/" > info/kernel_mode_assignment
$ cat info/kernel_mode_assignment
ctrl1/mon1/
Clear the assignment (no dedicated group for kernel work):
$ echo >> info/kernel_mode_assignment
$ cat info/kernel_mode_assignment
Kmode is not configured
This does not look right. Would this not create a conflict between info/kernel_mode
and info/kernel_mode_assignment about what the current mode is? The way I see it
info/kernel_mode_assignment must always contain a valid group.
Errors (e.g. invalid group name or unsupported mode) are reported in
info/last_cmd_status.
---
v2:
This is similar to the RFC, with a new proposal. Names of some of the interfaces
are not final. Let's fix that later as we move forward.
Separated the two features: Global Bandwidth Enforcement (GLBE) and
Privilege Level Zero Association (PLZA).
This series only adds support for PLZA.
Used the name of the feature as kmode instead of PLZA. That can be changed as well.
Tony suggested using global variables to store the kernel mode
CLOSID and RMID. However, the kernel mode CLOSID and RMID are
coming from rdtgroup structure with the new interface. Accessing
them requires holding the associated lock, which would make the
context switch path unnecessarily expensive. So, dropped the idea.
https://lore.kernel.org/lkml/aXuxVSbk1GR2ttzF@agluck-desk3/
Let me know if there are other ways to optimize this.
I do not see why the context switch path needs to be touched at all with this
implementation. Since PLZA only supports global assignment does it not mean that resctrl
only needs to update PQR_PLZA_ASSOC when user writes to info/kernel_mode and
info/kernel_mode_assignment?
Each thread has an MSR to configure whether to associate privilege level zero execution with a separate COS and/or RMID, and the value of the COS and/or RMID. PLZA may be enabled or disabled on a per-thread basis. However, the COS and RMID association and configuration must be the same for all threads in the QOS Domain.
So, PQR_PLZA_ASSOC is a per-thread MSR, just like PQR_ASSOC.
Privilege-Level Zero Association (PLZA) allows the user to specify a COS and/or RMID associated with execution in Privilege-Level Zero. When PLZA is enabled on a HW thread and that thread enters Privilege-Level Zero, transactions associated with that thread will be associated with the PLZA COS and/or RMID. Otherwise, the HW thread will be associated with the COS and RMID identified by PQR_ASSOC.
More below.
Consider some of the scenarios:
resctrl mount with default state:
# cat info/kernel_mode
[inherit_ctrl_and_mon]
global_assign_ctrl_inherit_mon
global_assign_ctrl_assign_mon
# ls info/kernel_mode_assignment
ls: cannot access 'info/kernel_mode_assignment': No such file or directory
enable global_assign_ctrl_assign_mon mode:
# echo "global_assign_ctrl_assign_mon" > info/kernel_mode
Expectation here is that when user space sets this mode as above then resctrl would
in turn program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
MSR_IA32_PQR_PLZA_ASSOC.rmid=0
MSR_IA32_PQR_PLZA_ASSOC.rmid_en=1
MSR_IA32_PQR_PLZA_ASSOC.closid=0
MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
MSR_IA32_PQR_PLZA_ASSOC.plza_en=1
I do not see why it is necessary to maintain any per-CPU or per-task state or needing
to touch the context switch code. Since PLZA only supports global could it not
just set MSR_IA32_PQR_PLZA_ASSOC on all online CPUs and be done with it?
Only caveat is that if a CPU is offline then this setting needs to be stashed
so that MSR_IA32_PQR_PLZA_ASSOC can be set when new CPU comes online.
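The "set it on all online CPUs and stash it for CPUs that come online later" idea can be modelled in a few lines of Python. This is only a sketch of the bookkeeping, not kernel code: PlzaAssoc and PlzaModel are hypothetical names, the fields mirror the ones listed above, and no register bit layout is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlzaAssoc:
    """Named fields of MSR_IA32_PQR_PLZA_ASSOC as used in this thread."""
    rmid: int = 0
    rmid_en: bool = False
    closid: int = 0
    closid_en: bool = False
    plza_en: bool = False


class PlzaModel:
    """Toy model: one global setting mirrored into every online CPU."""

    def __init__(self, online_cpus):
        # Stashed value, replayed when a CPU comes online later.
        self.stashed = PlzaAssoc()
        self.msr = {cpu: self.stashed for cpu in online_cpus}

    def set_global(self, assoc):
        # Models a write to info/kernel_mode or info/kernel_mode_assignment.
        self.stashed = assoc
        for cpu in self.msr:
            self.msr[cpu] = assoc

    def cpu_online(self, cpu):
        # A CPU that comes online picks up the stashed setting.
        self.msr[cpu] = self.stashed

    def cpu_offline(self, cpu):
        self.msr.pop(cpu, None)
```

In this model nothing is keyed by task, so no context switch hook is involved; the only state kept besides the per-CPU register value is the single stashed setting.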
The way that rdtgroup_config_kmode() introduced in patch #11 assumes it is dealing
with RDT_RESOURCE_L3 and traverses the resource domain list and resource group
CPU mask seems unnecessary to me as well as error prone since the system may only
have, for example, RDT_RESOURCE_MBA enabled or even just monitoring. Why not just set
MSR_IA32_PQR_PLZA_ASSOC on all CPUs and be done?
To continue the scenarios ...
After user's setting above related files read:
# cat info/kernel_mode
inherit_ctrl_and_mon
global_assign_ctrl_inherit_mon
[global_assign_ctrl_assign_mon]
# cat info/kernel_mode_assignment
//
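For user-space tooling, reading info/kernel_mode comes down to finding the bracketed entry. A minimal Python sketch, assuming one mode per line with the active mode wrapped in [brackets] as in the listings in this thread (parse_kernel_mode() is a hypothetical helper):

```python
def parse_kernel_mode(text):
    """Return (supported_modes, current_mode) from info/kernel_mode content."""
    supported = []
    current = None
    for line in text.splitlines():
        name = line.strip()
        if not name:
            continue
        if name.startswith("[") and name.endswith("]"):
            # The bracketed entry is the active mode.
            name = name[1:-1]
            current = name
        supported.append(name)
    return supported, current
```

current stays None if no entry is bracketed, which a caller could treat as an unexpected file format.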
Modify group used by global_assign_ctrl_assign_mon mode:
# echo 'ctrl1/mon1/' > info/kernel_mode_assignment
Expectation here is that when user space sets this then resctrl would
program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
MSR_IA32_PQR_PLZA_ASSOC.rmid=<rmid of mon1>
MSR_IA32_PQR_PLZA_ASSOC.rmid_en=1
MSR_IA32_PQR_PLZA_ASSOC.closid=<closid of ctrl1>
MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
MSR_IA32_PQR_PLZA_ASSOC.plza_en=1
This works correctly when PLZA associations are defined per CPU. For example, let's assume that *ctrl1* is assigned *CLOSID 1*.
In this scenario, every task in the system running on any CPU will use the limits associated with *CLOSID 1* whenever it enters Privilege-Level Zero, because the CPU's *PQR_PLZA_ASSOC* register has PLZA enabled and CLOSID is 1.
Now consider task-based association:
We have two resctrl groups:
* *ctrl1 -> CLOSID 1 -> task1.plza = 1*: User wants PLZA enabled for this task.
* *ctrl2 -> CLOSID 2 -> task2.plza = 0*: User wants PLZA disabled for this task.
Suppose *task1* is first scheduled on *CPU 0*. This behaves as expected: since CPU 0's *PQR_PLZA_ASSOC* contains *CLOSID 1, plza_en=1*, task1 will use the limits from CLOSID 1 when it enters Privilege-Level Zero.
However, if *task2* later runs on *CPU 0*, we expect it to use *CLOSID 2* in both user mode and kernel mode, because the user has PLZA disabled for this task. But CPU 0 still has *CLOSID 1, plza_en=1* in its PQR_PLZA_ASSOC register.
As a result, task2 will incorrectly run with *CLOSID 1* when entering Privilege-Level Zero, something we explicitly want to avoid.
At that point, PLZA must be disabled on CPU 0 to prevent the unintended association. Hope this explanation makes the issue clear.
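The task1/task2 scenario can be replayed with a small Python model. It is a sketch under stated assumptions: Task.plza is the hypothetical per-task opt-in from the example, Cpu holds only the CLOSID and plza_en fields of PQR_PLZA_ASSOC, and switch_to() stands in for the context switch path.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    closid: int   # CLOSID of the task's resctrl group (PQR_ASSOC)
    plza: bool    # hypothetical per-task PLZA opt-in


@dataclass
class Cpu:
    plza_closid: int  # CLOSID field of this CPU's PQR_PLZA_ASSOC
    plza_en: bool     # plza_en field of this CPU's PQR_PLZA_ASSOC


def kernel_closid(cpu, task):
    """CLOSID in effect while `task` executes at CPL=0 on `cpu`."""
    return cpu.plza_closid if cpu.plza_en else task.closid


def switch_to(cpu, task, update_plza):
    """Schedule `task` on `cpu`; optionally refresh plza_en for the task."""
    if update_plza:
        cpu.plza_en = task.plza
    return kernel_closid(cpu, task)
```

With update_plza=False, task2 ends up with CLOSID 1 in kernel mode on a CPU left with plza_en=1, which is the unintended case described above. With update_plza=True it keeps CLOSID 2. The model only applies if a per-task opt-in like task.plza is part of the interface.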
Thanks
Babu
Enable global_assign_ctrl_inherit_mon mode:
# echo "global_assign_ctrl_inherit_mon" > info/kernel_mode
Expectation here is that when user space sets this mode then resctrl would
program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
MSR_IA32_PQR_PLZA_ASSOC.rmid=0
MSR_IA32_PQR_PLZA_ASSOC.rmid_en=0
MSR_IA32_PQR_PLZA_ASSOC.closid=0
MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
MSR_IA32_PQR_PLZA_ASSOC.plza_en=1
# cat info/kernel_mode
inherit_ctrl_and_mon
[global_assign_ctrl_inherit_mon]
global_assign_ctrl_assign_mon
# cat info/kernel_mode_assignment <==== returns just a ctrl group
/
Modify group used by global_assign_ctrl_inherit_mon mode:
# echo ctrl1 > info/kernel_mode_assignment
Expectation here is that when user space sets this then resctrl would
program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
MSR_IA32_PQR_PLZA_ASSOC.rmid=0
MSR_IA32_PQR_PLZA_ASSOC.rmid_en=0
MSR_IA32_PQR_PLZA_ASSOC.closid=<closid of ctrl1>
MSR_IA32_PQR_PLZA_ASSOC.closid_en=1
MSR_IA32_PQR_PLZA_ASSOC.plza_en=1
# cat info/kernel_mode_assignment <==== returns just a ctrl group
ctrl1/
Enable inherit_ctrl_and_mon mode:
# echo "inherit_ctrl_and_mon" > info/kernel_mode
Expectation here is that when user space sets this mode then resctrl would
program MSR_IA32_PQR_PLZA_ASSOC on all CPUs to be:
MSR_IA32_PQR_PLZA_ASSOC.rmid=0
MSR_IA32_PQR_PLZA_ASSOC.rmid_en=0
MSR_IA32_PQR_PLZA_ASSOC.closid=0
MSR_IA32_PQR_PLZA_ASSOC.closid_en=0
MSR_IA32_PQR_PLZA_ASSOC.plza_en=0
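The field values listed for the three modes can be collected in one function. A minimal Python sketch, assuming closid and rmid are those of the group named in info/kernel_mode_assignment (0 for the default group); plza_fields() is a hypothetical name and no register bit positions are implied:

```python
def plza_fields(mode, closid=0, rmid=0):
    """Expected MSR_IA32_PQR_PLZA_ASSOC field values for a kernel mode."""
    if mode == "inherit_ctrl_and_mon":
        # PLZA off: kernel work follows the current task's PQR_ASSOC.
        return {"rmid": 0, "rmid_en": 0, "closid": 0, "closid_en": 0, "plza_en": 0}
    if mode == "global_assign_ctrl_inherit_mon":
        # One CLOSID for kernel work; the RMID is inherited from the task.
        return {"rmid": 0, "rmid_en": 0, "closid": closid, "closid_en": 1, "plza_en": 1}
    if mode == "global_assign_ctrl_assign_mon":
        # One CLOSID and one RMID for all kernel work.
        return {"rmid": rmid, "rmid_en": 1, "closid": closid, "closid_en": 1, "plza_en": 1}
    raise ValueError("unknown kernel mode: %s" % mode)
```

Only the mode and the assigned group feed into the result, which matches the point that a write to either info file is enough to recompute the value.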
At this point info/kernel_mode_assignment is not visible anymore:
# ls info/kernel_mode_assignment
ls: cannot access 'info/kernel_mode_assignment': No such file or directory
From what I understand, the above exposes and enables the full capability of PLZA. All the other
per-task and per-cpu handling in this series is not something that PLZA can benefit from.
If this is not the case, what am I missing? Could this series be simplified to just support
PLZA today? When next hardware with more capability needs to be supported resctrl could be
enhanced to support it by using the more accurate information about what the hardware is
capable of.
We also do not really know what use cases users prefer. This may even be sufficient.
Reinette