Re: [RFC PATCH 13/19] x86/resctrl: Add PLZA state tracking and context switch handling

From: Ben Horgan

Date: Mon Feb 23 2026 - 05:12:13 EST


Hi Reinette,

On 2/20/26 02:53, Reinette Chatre wrote:
> Hi Tony, Ben, Babu, and Stephane,
>
> On 2/18/26 8:44 AM, Luck, Tony wrote:
>> On Tue, Feb 17, 2026 at 03:55:44PM -0800, Reinette Chatre wrote:
>>> Hi Tony,
>>>
>>> On 2/17/26 2:52 PM, Luck, Tony wrote:
>>>> On Tue, Feb 17, 2026 at 02:37:49PM -0800, Reinette Chatre wrote:
>>>>> Hi Tony,
>>>>>
>>>>> On 2/17/26 1:44 PM, Luck, Tony wrote:
>>>>>>>>>> I'm not sure if this would happen in the real world or not.
>>>>>>>>>
>>>>>>>>> Ack. I would like to echo Tony's request for feedback from resctrl users
>>>>>>>>> https://lore.kernel.org/lkml/aYzcpuG0PfUaTdqt@agluck-desk3/
>>>>>>>>
>>>>>>>> Indeed. This is all getting a bit complicated.
>>>>>>>>
>>>>>>>
>>>>>>> ack
>>>>>>
>>>>>> We have several proposals so far:
>>>>>>
>>>>>> 1) Ben's suggestion to use the default group (either with a Babu-style
>>>>>> "plza" file just in that group, or a configuration file under "info/").
>>>>>>
>>>>>> This is easily the simplest for implementation, but has no flexibility.
>>>>>> Also requires users to move all the non-critical workloads out to other
>>>>>> CTRL_MON groups. Doesn't steal a CLOSID/RMID.
>>>>>>
>>>>>> 2) My thoughts are for a separate group that is only used to configure
>>>>>> the schemata. This does allocate a dedicated CLOSID/RMID pair. Those
>>>>>> are used for all tasks when in kernel mode.
>>>>>>
>>>>>> No context switch overhead. Has some flexibility.
>>>>>>
>>>>>> 3) Babu's RFC patch. Designates an existing CTRL_MON group as the one
>>>>>> that defines kernel CLOSID/RMID. Tasks and CPUs can be assigned to this
>>>>>>> group in addition to belonging to another group that defines schemata
>>>>>>> resources when running in non-kernel mode.
>>>>>> Tasks aren't required to be in the kernel group, in which case they
>>>>>> keep the same CLOSID in both user and kernel mode. When used in this
>>>>>> way there will be context switch overhead when changing between tasks
>>>>>> with different kernel CLOSID/RMID.
>>>>>>
>>>>>> 4) Even more complex scenarios with more than one user configurable
>>>>>> kernel group to give more options on resources available in the kernel.
>>>>>>
>>>>>>
>>>>>>> I had a quick pass at coding my option "2". My UI to designate the
>>>>>> group to use for kernel mode is to reserve the name "kernel_group"
>>>>>> when making CTRL_MON groups. Some tweaks to avoid creating the
>>>>>> "tasks", "cpus", and "cpus_list" files (which might be done more
>>>>>> elegantly), and "mon_groups" directory in this group.
>>>>>
>>>>> Should the decision of whether context switch overhead is acceptable
>>>>> not be left up to the user?
>>>>
>>>> When someone comes up with a convincing use case to support one set of
>>>> kernel resources when interrupting task A, and a different set of
>>>> resources when interrupting task B, we should certainly listen.
>>>
>>> Absolutely. Someone can come up with such a use case at any time though. This
>>> could be, and as has happened with some other resctrl interfaces, likely will be
>>> after this feature has been supported for a few kernel versions. What timeline
>>> should we give which users to share their use cases with us? Even if we do hear
>>> from some users will that guarantee that no such use case will arise in the
>>> future? Such predictions of usage are difficult for me and I thus find it simpler
>>> to think of flexible ways to enable the features that we know the hardware supports.
>>>
>>> This does not mean that a full featured solution needs to be implemented from day 1.
>>> If folks believe there are "no valid use cases" today resctrl still needs to prepare for
>>> how it can grow to support full hardware capability and hardware designs in the
>>> future.
>>>
>>> Please also consider not just resources for kernel work but also monitoring for
>>> kernel work. I do think, for example, a reasonable use case may be to determine
>>> how much memory bandwidth the kernel uses on behalf of certain tasks.
>>>
>>>>> I assume that, just like what is currently done for x86's MSR_IA32_PQR_ASSOC,
>>>>> the needed registers will only be updated if there is a new CLOSID/RMID needed
>>>>> for kernel space.
>>>>
>>>> Babu's RFC does this.
>>>
>>> Right.
>>>
>>>>
>>>>> Are you suggesting that just this checking itself is too
>>>>> expensive to justify giving user space more flexibility by fully enabling what
>>>>> the hardware supports? If resctrl does draw such a line to not enable what
>>>>> hardware supports it should be well justified.
>>>>
>>>> The check is likely light weight (as long as the variables to be
>>>> compared reside in the same cache lines as the existing CLOSID
>>>> and RMID checks). So if there is a use case for different resources
>>>> when in kernel mode, then taking this path will be fine.
>>>
>>> Why limit this to knowing about a use case? As I understand this feature can be
>>> supported in a flexible way without introducing additional context switch overhead
>>> if the user prefers to use just one allocation for all kernel work. By being
>>> configurable and allowing resctrl to support more use cases in the future resctrl
>>> does not paint itself into a corner. This allows resctrl to grow support so that
>>> the user can use all capabilities of the hardware with understanding that it will
>>> increase context switch time.
>>>
>>> Reinette
>>
>> How about this idea for extensibility.
>>
>> Rename Babu's "plza" file to "plza_mode". Instead of just being an
>> on/off switch, it may accept multiple possible requests.
>>
>> Humorous version:
>>
>> # echo "babu" > plza_mode
>>
>> This results in behavior of Babu's RFC. The CLOSID and RMID assigned to
>> the CTRL_MON group are used when in kernel mode, but only for tasks that
>> have their task-id written to the "tasks" file, or for tasks in the
>> default group if the "cpus" or "cpus_list" files are used to assign
>> CPUs to this group.
>>
>> # echo "tony" > plza_mode
>>
>> All tasks run with the CLOSID/RMID for this group. The "tasks", "cpus" and
>> "cpus_list" files and the "mon_groups" directory are removed.
>>
>> # echo "ben" > plza_mode
>>
>> Only usable in the top-level default CTRL_MON directory. CLOSID=0/RMID=0
>> are used for all tasks in kernel mode.
>>
>> # echo "stephane" > plza_mode
>>
>> The RMID for this group is freed. All tasks run in kernel mode with the
>> CLOSID for this group, but use same RMID for both user and kernel.
>> In addition to files removed in "tony" mode, the mon_data directory is
>> removed.
>>
>> # echo "some-future-name" > plza_mode
>>
>> Somebody has a new use case. Resctrl can be extended by allowing some
>> new mode.
>>
>>
>> Likely real implementation:
>>
>> Sub-components of each of the ideas above are encoded as a bitmask that
>> is written to plza_mode. There is a file in the info/ directory listing
>> which bits are supported on the current system (e.g. the "keep the same
>> RMID" mode may be impractical on ARM, so it would not be listed as an
>> option.)
>
> I like the idea of a global file that indicates what is supported on the
> system. I find this to match Ben's proposal of a "kernel_mode" file in
> info/ that looks to be a good foundation to build on. Ben also reiterated support
> for this in
> https://lore.kernel.org/lkml/feaa16a5-765c-4c24-9e0b-c1f4ef87a66f@xxxxxxx/
>
> As I mentioned in https://lore.kernel.org/lkml/5c19536b-aca0-42ce-a9d5-211fbbdbb485@xxxxxxxxx/
> the suggestions surrounding the per-resource group "plza_mode" file
> are unexpected since they ignore earlier comments about impact on user space.
> Specifically, this proposal does not address:
> https://lore.kernel.org/lkml/aY3bvKeOcZ9yG686@xxxxxxxxxxxxxxx/
> https://lore.kernel.org/lkml/c779ce82-4d8a-4943-b7ec-643e5a345d6c@xxxxxxx/
>
> Below I aim to summarize the discussions as they relate to constraints and
> requirements. I intended to capture all that has been mentioned in these
> discussions so far so if I did miss something it was not intentional and
> please point this out to help make this summary complete.
>
> I hope by starting with this we can start with at least agreeing what
> resctrl needs to support and how user space could interact with resctrl
> to meet requirements.
>
> After the summary of what resctrl needs to support I aim to combine
> capabilities from the various proposals to meet the constraints and
> requirements as I understand them so far. This aims to build on all that
> has been shared until now.
>
> Any comments are appreciated.
>
> Summary of considerations surrounding CLOSID/RMID (PARTID/PMG) assignment for kernel work
> =========================================================================================
>
> - PLZA currently only supports global assignment (only PLZA_EN of
> MSR_IA32_PQR_PLZA_ASSOC may differ on logical processors). Even so, current
> speculation is that RMID_EN=0 implies that user space RMID is used to monitor
> kernel work that could appear to user as "kernel mode" supporting multiple RMIDs.
> https://lore.kernel.org/lkml/abb049fa-3a3d-4601-9ae3-61eeb7fd8fcf@xxxxxxx/
>
> - MPAM can set unique PARTID and PMG on every logical processor.
> https://lore.kernel.org/lkml/fd7e0779-7e29-461d-adb6-0568a81ec59e@xxxxxxx/
>
> - While current PLZA only supports global assignment it may in future generations
> not require MSR_IA32_PQR_PLZA_ASSOC to be same on logical processors. resctrl
> thus needs to be flexible here.
> https://lore.kernel.org/lkml/fa45088b-1aea-468e-8253-3238e91f76c7@xxxxxxx/
>
> - No equivalent feature on RISC-V.
> https://lore.kernel.org/lkml/aYvP98xGoKPrDBCE@gen8/
>
> - Impact on context switch delay is a concern and unnecessary context switch delay should
> be avoided.
> https://lore.kernel.org/lkml/aZThTzdxVcBkLD7P@agluck-desk3/
> https://lore.kernel.org/lkml/CABPqkBSq=cgn-am4qorA_VN0vsbpbfDePSi7gubicpROB1=djw@xxxxxxxxxxxxxx/
>
> - There is no requirement that a CLOSID/PARTID should be dedicated to kernel work.
> Specifically, same CLOSID/PARTID can be used for user space and kernel work.
> Also directly requested to not make kernel work CLOSID/PARTID exclusive:
> https://lore.kernel.org/lkml/c8268b2a-50d7-44b4-ac3f-5ce6624599b1@xxxxxxx/
>
> - Only use case presented so far is related to memory bandwidth allocation where
> all kernel work is done unthrottled or equivalent to highest priority tasks while
> monitoring remains associated to task self.
> https://lore.kernel.org/lkml/CABPqkBSq=cgn-am4qorA_VN0vsbpbfDePSi7gubicpROB1=djw@xxxxxxxxxxxxxx/
> PLZA can support this with its global allocation (assuming RMID_EN=0 associates user
> RMID with kernel work). To support this use case MPAM would need to be able to
> change both PARTID and PMG:
> https://lore.kernel.org/lkml/845587f3-4c27-46d9-83f8-6b38ccc54183@xxxxxxx/
>
> - Motivation of this work is to run kernel work with more/all/unthrottled
> resources to avoid priority inversions. We need to be careful with such
> generalization since not all resource allocations are alike yet a CLOSID/PARTID
> assignment applies to all resources. For example, user may designate a cache
> portion for high priority user space work and then needs to choose which cache
> portions the kernel may allocate into.
> https://lore.kernel.org/lkml/6293c484-ee54-46a2-b11c-e1e3c736e578@xxxxxxx/
> - If all kernel work is done using the same allocation/CLOSID/PARTID then user
> needs to decide whether the kernel work's cache allocation overlaps the high
> priority tasks or not. To avoid evicting high priority task work it may be
> simplest for kernel allocation to not overlap high priority work but kernel work
> done on behalf of high priority work would then risk eviction by low priority
> work.
> - When considering cache allocation it seems more flexible to have high priority
> work keep its cache allocation when entering the kernel? This implies more than
> one CLOSID/PARTID may need to be used for kernel work.
>
>
> TBD
> ===
> - What is impact of different controls (for example the upcoming MAX) when tasks are
> spread across multiple control groups?
> https://lore.kernel.org/lkml/aY3bvKeOcZ9yG686@xxxxxxxxxxxxxxx/
>
> How can MPAM support the "monitor kernel work with user space work" use case?
> =============================================================================
> This considers how MPAM could support the use case presented in:
> https://lore.kernel.org/lkml/CABPqkBSq=cgn-am4qorA_VN0vsbpbfDePSi7gubicpROB1=djw@xxxxxxxxxxxxxx/
>
> To support this use case in MPAM the control group that dictates the allocations
> used in kernel work has to have monitor group(s) where this usage is tracked and user
> space would need to sum the kernel and user space usage. The number of PMG may vary
> and resctrl cannot assume that the kernel control group would have sufficient monitor
> groups to map 1:1 with user space control and monitor groups. Mapping user space
> control and monitor groups to kernel monitor groups thus seems best to be done by
> user space.
>
> Some examples:
> Consider allocation and monitoring setup for user space work:
> /sys/fs/resctrl <= User space default allocations
> /sys/fs/resctrl/g1 <= User space allocations g1
> /sys/fs/resctrl/g1/mon_groups/g1m1 <= User space monitoring group g1m1
> /sys/fs/resctrl/g1/mon_groups/g1m2 <= User space monitoring group g1m2
> /sys/fs/resctrl/g2 <= User space allocations g2
> /sys/fs/resctrl/g2/mon_groups/g2m1 <= User space monitoring group g2m1
> /sys/fs/resctrl/g2/mon_groups/g2m2 <= User space monitoring group g2m2
>
> Having a single control group for kernel work and a system that supports
> 7 PMG per PARTID makes it possible to have a monitoring group for each user space
> monitoring group:
> (will go more into how such assignments can be made later)
>
> /sys/fs/resctrl/kernel <= Kernel space allocations
> /sys/fs/resctrl/kernel/mon_data <= Kernel space monitoring default group
> /sys/fs/resctrl/kernel/mon_groups/kernel_g1 <= Kernel space monitoring group g1
> /sys/fs/resctrl/kernel/mon_groups/kernel_g1m1 <= Kernel space monitoring group g1m1
> /sys/fs/resctrl/kernel/mon_groups/kernel_g1m2 <= Kernel space monitoring group g1m2
> /sys/fs/resctrl/kernel/mon_groups/kernel_g2 <= Kernel space monitoring group g2
> /sys/fs/resctrl/kernel/mon_groups/kernel_g2m1 <= Kernel space monitoring group g2m1
> /sys/fs/resctrl/kernel/mon_groups/kernel_g2m2 <= Kernel space monitoring group g2m2
>
> With a configuration as above user space can sum the monitoring events of the user space
> groups and associated kernel space groups to obtain counts of all work done on behalf of
> associated tasks.
>
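
For what it's worth, the summing that user space would have to do here could
look roughly like the sketch below. The helper name sum_event_counts is made
up, mon_L3_00 and mbm_total_bytes are only example domain and event names, and
the kernel/ group paths exist only in the layout proposed above.

```shell
#!/bin/sh
# Sum the integer counts found in the given event files.
# Fails if a file cannot be read or holds a non-numeric value
# (resctrl may report e.g. "Unavailable" instead of a number).
sum_event_counts() {
    total=0
    for f in "$@"; do
        val=$(cat "$f") || return 1
        case "$val" in
            ''|*[!0-9]*)
                echo "non-numeric count in $f: $val" >&2
                return 1
                ;;
        esac
        total=$((total + val))
    done
    printf '%s\n' "$total"
}

# Hypothetical usage with the 1:1 layout sketched above:
#   ev=mon_data/mon_L3_00/mbm_total_bytes
#   sum_event_counts \
#       /sys/fs/resctrl/g1/mon_groups/g1m1/$ev \
#       /sys/fs/resctrl/kernel/mon_groups/kernel_g1m1/$ev
```

Failing on a non-numeric value rather than skipping it is a choice made for
the sketch; a real tool might prefer to retry or report a partial sum.
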
> It may not be possible to have such 1:1 relationship and user space would have to
> arrange groups to match its usage. For example if system only supports two PMG per PARTID
> then user space may find it best to track monitoring as below:
> /sys/fs/resctrl/kernel <= Kernel space allocations
> /sys/fs/resctrl/kernel/mon_data <= Kernel space monitoring for all of default and g1
> /sys/fs/resctrl/kernel/mon_groups/kernel_g2 <= Kernel space monitoring for all of g2
>
>
> Requirements
> ============
> Based on understanding of what PLZA and MPAM are (and could be) capable of while considering the
> use case presented thus far it seems that resctrl has to:
> - support global assignment of resource group for kernel work
> - support per-resource group assignment for kernel work
>
> How can resctrl support the requirements?
> =========================================
>
> New global resctrl fs files
> ===========================
> info/kernel_mode (always visible)
> info/kernel_mode_assignment (visibility and content depends on active setting in info/kernel_mode)
>
> info/kernel_mode
> ================
> - Displays the currently active as well as possible features available to user
> space.
> - Single place where user can query "kernel mode" behavior and capabilities of the
> system.
> - Some possible values:
> - inherit_ctrl_and_mon <=== previously named "match_user", just renamed for consistency with other names
> When active, kernel and user space use the same CLOSID/RMID. The current status
> quo for x86.
> - global_assign_ctrl_inherit_mon
> When active, CLOSID/control group can be assigned for *all* (hence, "global")
> kernel work while all kernel work uses same RMID as user space.
> Can only be supported on architecture where CLOSID and RMID are independent.
> An arch may support this in hardware (RMID_EN=0?) or this can be done by resctrl during
> context switch if the RMID is independent and the context switch cost is
> considered "reasonable".
> This supports use case https://lore.kernel.org/lkml/CABPqkBSq=cgn-am4qorA_VN0vsbpbfDePSi7gubicpROB1=djw@xxxxxxxxxxxxxx/
> for PLZA.
> - global_assign_ctrl_assign_mon
> When active the same resource group (CLOSID and RMID) can be assigned to
> *all* kernel work. This could be any group, including the default group.
> There may not be a use case for this but it could be useful as an intermediate
> step toward the mode that follows (more later).
> - per_group_assign_ctrl_assign_mon
> When active every resource group can be associated with another (or the same)
> resource group. This association maps the resource group for user space work
> to resource group for kernel work. This is similar to the "kernel_group" idea
> presented in:
> https://lore.kernel.org/lkml/aYyxAPdTFejzsE42@xxxxxxxxxxxxxxx/
> This addresses use case https://lore.kernel.org/lkml/CABPqkBSq=cgn-am4qorA_VN0vsbpbfDePSi7gubicpROB1=djw@xxxxxxxxxxxxxx/
> for MPAM.
> - Additional values can be added as new requirements arise, for example "per_task"
> assignment. Connecting visibility of info/kernel_mode_assignment to mode in
> info/kernel_mode enables resctrl to later support additional modes that may require
> different configuration files, potentially per-resource group like the "tasks_kernel"
> (or perhaps rather "kernel_mode_tasks" to have consistent prefix for this feature)
> and "cpus_kernel" ("kernel_mode_cpus"?) discussed in these threads.
>
> User can view active and supported modes:
>
> # cat info/kernel_mode
> [inherit_ctrl_and_mon]
> global_assign_ctrl_inherit_mon
> global_assign_ctrl_assign_mon
>
> User can switch modes:
> # echo global_assign_ctrl_inherit_mon > info/kernel_mode
> # cat info/kernel_mode
> inherit_ctrl_and_mon
> [global_assign_ctrl_inherit_mon]
> global_assign_ctrl_assign_mon
>
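
A tool that wants to consume this format only needs to pick out the bracketed
entry. A minimal sketch, assuming the one-mode-per-line layout with the active
mode in square brackets as shown above; both function names are hypothetical
and info/kernel_mode does not exist in any kernel today.

```shell
#!/bin/sh
# Print the active mode: the single entry enclosed in [brackets].
# $1 is the path to the proposed info/kernel_mode file.
active_kernel_mode() {
    sed -n 's/^[[:space:]]*\[\(.*\)\][[:space:]]*$/\1/p' "$1"
}

# Print every supported mode, one per line, with whitespace and
# brackets stripped and empty lines dropped.
supported_kernel_modes() {
    sed -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e 's/^\[\(.*\)\]$/\1/' \
        -e '/^$/d' "$1"
}

# Hypothetical usage:
#   active_kernel_mode /sys/fs/resctrl/info/kernel_mode
```
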
>
> info/kernel_mode_assignment
> ===========================
> - Visibility depends on active mode in info/kernel_mode.
> - Content depends on active mode in info/kernel_mode
> - Syntax to identify resource groups can use the syntax created as part of earlier ABMC work
> that supports default group https://lore.kernel.org/lkml/cover.1737577229.git.babu.moger@xxxxxxx/
> - Default CTRL_MON group and if relevant, the default MON group, can be the default
> assignment when user just changes the kernel_mode without setting the assignment.
>
> info/kernel_mode_assignment when mode is global_assign_ctrl_inherit_mon
> -----------------------------------------------------------------------
> - info/kernel_mode_assignment contains single value that is the name of the control group
> used for all kernel work.
> - CLOSID/PARTID used for kernel work is determined from the control group assigned
> - default value is default CTRL_MON group
> - no monitor group assignment, kernel work inherits user space RMID
> - syntax is
> <CTRL_MON group> with "/" meaning default.
>
> info/kernel_mode_assignment when mode is global_assign_ctrl_assign_mon
> -----------------------------------------------------------------------
> - info/kernel_mode_assignment contains single value that is the name of the resource group
> used for all kernel work.
> - Combined CLOSID/RMID or combined PARTID/PMG is set globally to be associated with all
> kernel work.
> - default value is default CTRL_MON group
> - syntax is
> <CTRL_MON group>/<MON group>/ with "//" meaning default control and default monitoring group.
>
> info/kernel_mode_assignment when mode is per_group_assign_ctrl_assign_mon
> -------------------------------------------------------------------------
> - this presents the information proposed in https://lore.kernel.org/lkml/aYyxAPdTFejzsE42@xxxxxxxxxxxxxxx/
> within a single file for convenience and potential optimization when user space needs to make changes.
> Interface proposed in https://lore.kernel.org/lkml/aYyxAPdTFejzsE42@xxxxxxxxxxxxxxx/ is also an option
> and as an alternative a per-resource group "kernel_group" can be made visible when user space enables
> this mode.
> - info/kernel_mode_assignment contains a mapping of every resource group to another resource group:
> <resource group for user space work>:<resource group for kernel work>
> - all resource groups must be present in first field of this file
> - Even though this is a "per group" setting expectation is that this will set the
> kernel work CLOSID/RMID for every task. This implies that writing to this file would need
> to take the tasklist_lock which, when held for too long, may impact other parts of the system.
> See https://lore.kernel.org/lkml/CALPaoCh0SbG1+VbbgcxjubE7Cc2Pb6QqhG3NH6X=WwsNfqNjtA@xxxxxxxxxxxxxx/
>
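
Since every resource group has to show up in the first field, user space will
probably want to check a mapping before writing it. A rough sketch of such a
check, assuming the "<user group>:<kernel group>" line format described above;
missing_user_groups is a made-up name and the list of groups to check is
supplied by the caller.

```shell
#!/bin/sh
# Print each group from the argument list that does not appear as the
# first field (the text before the first ':') of any line in the mapping.
# $1 is the mapping file, the remaining arguments are group names.
missing_user_groups() {
    mapping=$1
    shift
    for g in "$@"; do
        # -F: fixed string, -x: whole-line match, -q: quiet
        if ! cut -d: -f1 "$mapping" | grep -Fxq -- "$g"; then
            printf '%s\n' "$g"
        fi
    done
}

# Hypothetical usage:
#   missing_user_groups new_mapping.txt '//' 'g1//' 'g1/g1m1/'
```

An empty output means the mapping covers every group that was passed in.
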
> Scenarios supported
> ===================
>
> Default
> -------
> For x86 I understand kernel work and user work to be done with same CLOSID/RMID which
> implies that info/kernel_mode can always be visible and at least display:
> # cat info/kernel_mode
> [inherit_ctrl_and_mon]
>
> info/kernel_mode_assignment is not visible in this mode.
>
> I understand MPAM may have different defaults here so would like to understand better.
>
> Dedicated global allocations for kernel work, monitoring same for user space and kernel (PLZA)
> ----------------------------------------------------------------------------------------------
> Possible scenario with PLZA, not MPAM (see later):
> 1. Create group(s) to manage allocations associated with user space work
> and assign tasks/CPUs to these groups.
> 2. Create group to manage allocations associated with all kernel work.
> - For example,
> # mkdir /sys/fs/resctrl/unthrottled
> - No constraints from resctrl fs on interactions with files in this group. From resctrl
> fs perspective it is not "dedicated" to kernel work but just another resource group.
> User space can still assign tasks/CPUs to this group that will result in this group
> to be used for both kernel and user space control and monitoring. If user space wants
> to dedicate a group to kernel work then they should not assign tasks/CPUs to it.
> 3. Set kernel mode to global_assign_ctrl_inherit_mon:
> # echo global_assign_ctrl_inherit_mon > info/kernel_mode
> - info/kernel_mode_assignment becomes visible and contains "/" to indicate that default
> resource group is used for all kernel work
> - Sets the "global" CLOSID to be used for kernel work to 0, no setting of global RMID.
> 4. Set control group to be used for all kernel work:
> # echo unthrottled > info/kernel_mode_assignment
> - Sets the "global" CLOSID to be used for kernel work to CLOSID associated with
> CTRL_MON group named "unthrottled", no change to global RMID.
>
>
> Dedicated global allocations and monitoring for kernel work
> -----------------------------------------------------------
> - Steps 1 and 2 could be the same as above.
> OR
> 2b. If there is an "unthrottled" control group that is used for both user space and kernel
> allocations a separate MON group can be used to track monitoring data for kernel work.
> - For example,
> # mkdir /sys/fs/resctrl/unthrottled <=== All high priority work, kernel and user space
> # mkdir /sys/fs/resctrl/unthrottled/mon_groups/kernel_unthrottled <= Just monitor kernel work
>
> 3. Set kernel mode to global_assign_ctrl_assign_mon:
> # echo global_assign_ctrl_assign_mon > info/kernel_mode
> - info/kernel_mode_assignment becomes visible and contains "//" - default CTRL_MON is
> used for all kernel work allocations and monitoring
> - Sets both the "global" CLOSID and RMID to be used for kernel work to 0.
> 4. Set control group to be used for all kernel work:
> # echo unthrottled/kernel_unthrottled > info/kernel_mode_assignment
> - Sets the "global" CLOSID to be used for kernel work to CLOSID associated with
> CTRL_MON group named "unthrottled" and RMID used for kernel work to RMID
> associated with child MON group within "unthrottled" group named "kernel_unthrottled".
>
> Dedicated global allocations for kernel work, monitoring same for user space and kernel (MPAM)
> ----------------------------------------------------------------------------------------------
> 1. User space creates resource and monitoring groups for user tasks:
> /sys/fs/resctrl <= User space default allocations
> /sys/fs/resctrl/g1 <= User space allocations g1
> /sys/fs/resctrl/g1/mon_groups/g1m1 <= User space monitoring group g1m1
> /sys/fs/resctrl/g1/mon_groups/g1m2 <= User space monitoring group g1m2
> /sys/fs/resctrl/g2 <= User space allocations g2
> /sys/fs/resctrl/g2/mon_groups/g2m1 <= User space monitoring group g2m1
> /sys/fs/resctrl/g2/mon_groups/g2m2 <= User space monitoring group g2m2
>
> 2. User space creates resource and monitoring groups for kernel work (system has two PMG):
> /sys/fs/resctrl/kernel <= Kernel space allocations
> /sys/fs/resctrl/kernel/mon_data <= Kernel space monitoring for all of default and g1
> /sys/fs/resctrl/kernel/mon_groups/kernel_g2 <= Kernel space monitoring for all of g2
> 3. Set kernel mode to per_group_assign_ctrl_assign_mon:
> # echo per_group_assign_ctrl_assign_mon > info/kernel_mode
> - info/kernel_mode_assignment becomes visible and contains
> # cat info/kernel_mode_assignment
> //://
> g1//://
> g1/g1m1/://
> g1/g1m2/://
> g2//://
> g2/g2m1/://
> g2/g2m2/://
> - An optimization here may be to have the change to per_group_assign_ctrl_assign_mon mode be implemented
> similar to the change to global_assign_ctrl_assign_mon that initializes a global default. This can
> avoid keeping tasklist_lock for a long time to set all tasks' kernel CLOSID/RMID to default just for
> user space to likely change it.
> 4. Set groups to be used for kernel work:
> # printf '//:kernel//\ng1//:kernel//\ng1/g1m1/:kernel//\ng1/g1m2/:kernel//\ng2//:kernel/kernel_g2/\ng2/g2m1/:kernel/kernel_g2/\ng2/g2m2/:kernel/kernel_g2/\n' > info/kernel_mode_assignment
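
As an aside on step 4, a long one-liner with embedded "\n" is easy to get wrong
by hand, so tooling would likely generate the table. A minimal sketch, assuming
the "<user group>:<kernel group>" line format proposed above;
emit_kernel_assignments is a hypothetical helper and the target file does not
exist yet.

```shell
#!/bin/sh
# Emit one "<user group>:<kernel group>" line per argument pair.
# Arguments alternate: user space group, kernel group, user space group, ...
emit_kernel_assignments() {
    while [ "$#" -ge 2 ]; do
        printf '%s:%s\n' "$1" "$2"
        shift 2
    done
}

# The mapping from the MPAM example, printed to stdout.
emit_kernel_assignments \
    '//'       'kernel//' \
    'g1//'     'kernel//' \
    'g1/g1m1/' 'kernel//' \
    'g1/g1m2/' 'kernel//' \
    'g2//'     'kernel/kernel_g2/' \
    'g2/g2m1/' 'kernel/kernel_g2/' \
    'g2/g2m2/' 'kernel/kernel_g2/'

# Hypothetical way to hand the whole table over with a single printf:
#   payload=$(emit_kernel_assignments ...)
#   printf '%s\n' "$payload" > /sys/fs/resctrl/info/kernel_mode_assignment
```
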

Am I right in thinking that you want this in the info directory to avoid
adding files to the CTRL_MON/MON groups?

> The interfaces proposed aim to maintain compatibility with existing user space tools while
> adding support for all requirements expressed thus far in an efficient way. For an existing
> user space tool there is no change in meaning of any existing file and no existing known
> resource group files are made to disappear. There is a global configuration that lets user space
> manage allocations without needing to check and configure each control group, even per-resource
> group allocations can be managed from user space with a single read/write to support
> making changes in most efficient way.
>
> What do you think?

Looks a good and well considered plan. Thank you in particular for
figuring out how MPAM fits in.

>
> Reinette
>

Thanks,

Ben