Re: [RFC PATCH 13/19] x86/resctrl: Add PLZA state tracking and context switch handling
From: Reinette Chatre
Date: Mon Feb 23 2026 - 11:55:39 EST
Hi Ben,
On 2/23/26 2:08 AM, Ben Horgan wrote:
> On 2/20/26 02:53, Reinette Chatre wrote:
...
>> Dedicated global allocations for kernel work, monitoring same for user space and kernel (MPAM)
>> ----------------------------------------------------------------------------------------------
>> 1. User space creates resource and monitoring groups for user tasks:
>> /sys/fs/resctrl <= User space default allocations
>> /sys/fs/resctrl/g1 <= User space allocations g1
>> /sys/fs/resctrl/g1/mon_groups/g1m1 <= User space monitoring group g1m1
>> /sys/fs/resctrl/g1/mon_groups/g1m2 <= User space monitoring group g1m2
>> /sys/fs/resctrl/g2 <= User space allocations g2
>> /sys/fs/resctrl/g2/mon_groups/g2m1 <= User space monitoring group g2m1
>> /sys/fs/resctrl/g2/mon_groups/g2m2 <= User space monitoring group g2m2
>>
>> 2. User space creates resource and monitoring groups for kernel work (system has two PMG):
>> /sys/fs/resctrl/kernel <= Kernel space allocations
>> /sys/fs/resctrl/kernel/mon_data <= Kernel space monitoring for all of default and g1
>> /sys/fs/resctrl/kernel/mon_groups/kernel_g2 <= Kernel space monitoring for all of g2
>> 3. Set kernel mode to per_group_assign_ctrl_assign_mon:
>> # echo per_group_assign_ctrl_assign_mon > info/kernel_mode
>> - info/kernel_mode_assignment becomes visible and contains
>> # cat info/kernel_mode_assignment
>> //://
>> g1//://
>> g1/g1m1/://
>> g1/g1m2/://
>> g2//://
>> g2/g2m1/://
>> g2/g2m2/://
>> - An optimization here may be to have the change to per_group_assign_ctrl_assign_mon mode be implemented
>> similar to the change to global_assign_ctrl_assign_mon that initializes a global default. This can
>> avoid keeping tasklist_lock for a long time to set all tasks' kernel CLOSID/RMID to default just for
>> user space to likely change it.
>> 4. Set groups to be used for kernel work:
>> # echo '//:kernel//\ng1//:kernel//\ng1/g1m1/:kernel//\ng1/g1m2/:kernel//\ng2//:kernel/kernel_g2/\ng2/g2m1/:kernel/kernel_g2/\ng2/g2m2/:kernel/kernel_g2/\n' > info/kernel_mode_assignment
>
> Am I right in thinking that you want this in the info directory to avoid
> adding files to the CTRL_MON/MON groups?
I see this file as providing the same capability as you suggested in
https://lore.kernel.org/lkml/aYyxAPdTFejzsE42@xxxxxxxxxxxxxxx/. I presented this
as a single file not because I am trying to avoid adding files to the
CTRL_MON/MON groups but because I believe such an interface gives resctrl more
flexibility and lets it support more scenarios for optimization.
As you mentioned in your proposal, the solution enables a single write to move
a task. As I thought through what resctrl needs to do on such a write I saw a lot
of similarities with mongrp_reparent(), which loops through all the tasks via
for_each_process_thread() while holding tasklist_lock. Issues with mongrp_reparent()
holding tasklist_lock for a long time are described in [1].
While the single file does not avoid taking tasklist_lock, it does give the user the
ability to set the kernel group for multiple user groups with a single write. When user space
does so I believe it is possible for resctrl to have an optimization that takes tasklist_lock
just once and makes changes to tasks belonging to all groups while looping through all tasks on
the system just once. With files within the CTRL_MON/MON groups, setting the kernel group for
multiple user groups will require multiple writes from user space, where each write requires
looping through all tasks while holding tasklist_lock. From what I learned
from [1], something like this can be very disruptive to the rest of the system.
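To illustrate the difference in the number of task list walks, here is a rough
user-space model in Python. It is not resctrl code: Task, parse_assignments(),
apply_single_file() and apply_per_group_files() are hypothetical names made up for
this sketch, and each counted "walk" only stands in for one for_each_process_thread()
loop under tasklist_lock. The entry format is the "<user ctrl>/<user mon>/:<kernel ctrl>/<kernel mon>/"
form from step 4 above.

```python
from dataclasses import dataclass


@dataclass
class Task:
    pid: int
    user_group: str            # "<ctrl>/<mon>/", with "//" meaning the default group
    kernel_group: str = "//"   # group used while the task runs in the kernel


# The batched write from step 4, one entry per line.
EXAMPLE_WRITE = (
    "//:kernel//\n"
    "g1//:kernel//\n"
    "g1/g1m1/:kernel//\n"
    "g1/g1m2/:kernel//\n"
    "g2//:kernel/kernel_g2/\n"
    "g2/g2m1/:kernel/kernel_g2/\n"
    "g2/g2m2/:kernel/kernel_g2/\n"
)


def parse_assignments(text):
    """Turn 'user:kernel' lines into a dict keyed by user group."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        user, sep, kernel = line.partition(":")
        if not sep:
            raise ValueError(f"missing ':' in {line!r}")
        mapping[user] = kernel
    return mapping


def apply_single_file(tasks, mapping):
    """One write carrying every entry: a single walk over all tasks."""
    walks = 1  # models taking tasklist_lock once
    for task in tasks:
        if task.user_group in mapping:
            task.kernel_group = mapping[task.user_group]
    return walks


def apply_per_group_files(tasks, mapping):
    """One write per user group: a full walk over all tasks for each write."""
    walks = 0
    for user, kernel in mapping.items():
        walks += 1  # models taking tasklist_lock once per write
        for task in tasks:
            if task.user_group == user:
                task.kernel_group = kernel
    return walks
```

Both functions leave the tasks in the same final state. In this model the only
difference is that the per-group variant walks the full task list once per entry
written, while the single file variant walks it once per write.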
In summary, I see this single file as providing the same capability as the
one-file-per-CTRL_MON/MON-group approach, since the user can choose to set the kernel group for
user groups one at a time, but it also gives resctrl more flexibility for optimization.
Nothing is set in stone here. There is still flexibility in this proposal to support
PARTID and PMG assignment with a single file in each CTRL_MON/MON group if we find that
it has more benefits. resctrl can still expose a "per_group_assign_ctrl_assign_mon" mode
but, instead of making "info/kernel_mode_assignment" visible when it is enabled, the control files
in the CTRL_MON/MON groups are made visible. Even in this case resctrl could still add the single
file later if deemed necessary at that time.
Considering all this, do you think resctrl should rather start with a file in each
CTRL_MON/MON group?
Reinette
[1] https://lore.kernel.org/lkml/CALPaoCh0SbG1+VbbgcxjubE7Cc2Pb6QqhG3NH6X=WwsNfqNjtA@xxxxxxxxxxxxxx/