Re: [RFC PATCH 13/19] x86/resctrl: Add PLZA state tracking and context switch handling
From: Ben Horgan
Date: Thu Feb 19 2026 - 12:33:35 EST
Hi Tony,
On 2/18/26 16:44, Luck, Tony wrote:
> On Tue, Feb 17, 2026 at 03:55:44PM -0800, Reinette Chatre wrote:
>> Hi Tony,
>>
>> On 2/17/26 2:52 PM, Luck, Tony wrote:
>>> On Tue, Feb 17, 2026 at 02:37:49PM -0800, Reinette Chatre wrote:
>>>> Hi Tony,
>>>>
>>>> On 2/17/26 1:44 PM, Luck, Tony wrote:
>>>>>>>>> I'm not sure if this would happen in the real world or not.
>>>>>>>>
>>>>>>>> Ack. I would like to echo Tony's request for feedback from resctrl users
>>>>>>>> https://lore.kernel.org/lkml/aYzcpuG0PfUaTdqt@agluck-desk3/
>>>>>>>
>>>>>>> Indeed. This is all getting a bit complicated.
>>>>>>>
>>>>>>
>>>>>> ack
>>>>>
>>>>> We have several proposals so far:
>>>>>
>>>>> 1) Ben's suggestion to use the default group (either with a Babu-style
>>>>> "plza" file just in that group, or a configuration file under "info/").
>>>>>
>>>>> This is easily the simplest for implementation, but has no flexibility.
>>>>> Also requires users to move all the non-critical workloads out to other
>>>>> CTRL_MON groups. Doesn't steal a CLOSID/RMID.
>>>>>
>>>>> 2) My thoughts are for a separate group that is only used to configure
>>>>> the schemata. This does allocate a dedicated CLOSID/RMID pair. Those
>>>>> are used for all tasks when in kernel mode.
>>>>>
>>>>> No context switch overhead. Has some flexibility.
>>>>>
>>>>> 3) Babu's RFC patch. Designates an existing CTRL_MON group as the one
>>>>> that defines kernel CLOSID/RMID. Tasks and CPUs can be assigned to this
>>>>> group in addition to belonging to another group that defines schemata
>>>>> resources when running in non-kernel mode.
>>>>> Tasks aren't required to be in the kernel group, in which case they
>>>>> keep the same CLOSID in both user and kernel mode. When used in this
>>>>> way there will be context switch overhead when changing between tasks
>>>>> with different kernel CLOSID/RMID.
>>>>>
>>>>> 4) Even more complex scenarios with more than one user configurable
>>>>> kernel group to give more options on resources available in the kernel.
>>>>>
>>>>>
>>>>> I had a quick pass at coding my option "2". My UI to designate the
>>>>> group to use for kernel mode is to reserve the name "kernel_group"
>>>>> when making CTRL_MON groups. Some tweaks to avoid creating the
>>>>> "tasks", "cpus", and "cpus_list" files (which might be done more
>>>>> elegantly), and "mon_groups" directory in this group.
>>>>
>>>> Should the decision of whether context switch overhead is acceptable
>>>> not be left up to the user?
>>>
>>> When someone comes up with a convincing use case to support one set of
>>> kernel resources when interrupting task A, and a different set of
>>> resources when interrupting task B, we should certainly listen.
>>
>> Absolutely. Someone can come up with such a use case at any time though. This
>> could be, and as has happened with some other resctrl interfaces, likely will be
>> after this feature has been supported for a few kernel versions. What timeline
>> should we give which users to share their use cases with us? Even if we do hear
>> from some users will that guarantee that no such use case will arise in the
>> future? Such predictions of usage are difficult for me and I thus find it simpler
>> to think of flexible ways to enable the features that we know the hardware supports.
>>
>> This does not mean that a full featured solution needs to be implemented from day 1.
>> If folks believe there are "no valid use cases" today resctrl still needs to prepare for
>> how it can grow to support full hardware capability and hardware designs in the
>> future.
>>
>> Also, please consider not just resources for kernel work but also monitoring for
>> kernel work. I do think, for example, a reasonable use case may be to determine
>> how much memory bandwidth the kernel uses on behalf of certain tasks.
>>
>>>> I assume that, just like what is currently done for x86's MSR_IA32_PQR_ASSOC,
>>>> the needed registers will only be updated if there is a new CLOSID/RMID needed
>>>> for kernel space.
>>>
>>> Babu's RFC does this.
>>
>> Right.
>>
>>>
>>>> Are you suggesting that just this checking itself is too
>>>> expensive to justify giving user space more flexibility by fully enabling what
>>>> the hardware supports? If resctrl does draw such a line to not enable what
>>>> hardware supports it should be well justified.
>>>
>>> The check is likely lightweight (as long as the variables to be
>>> compared reside in the same cache lines as the existing CLOSID
>>> and RMID checks). So if there is a use case for different resources
>>> when in kernel mode, then taking this path will be fine.
>>
>> Why limit this to knowing about a use case? As I understand this feature can be
>> supported in a flexible way without introducing additional context switch overhead
>> if the user prefers to use just one allocation for all kernel work. By being
>> configurable and allowing resctrl to support more use cases in the future resctrl
>> does not paint itself into a corner. This allows resctrl to grow support so that
>> the user can use all capabilities of the hardware with understanding that it will
>> increase context switch time.
>>
>> Reinette
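For reference, a minimal user-space sketch of the "only update when the
kernel CLOSID/RMID changes" check discussed above. This is not Babu's code:
the struct, the function names and the write counter are all hypothetical,
and fake_plza_write() is only a stand-in for whatever register write the
hardware needs. In the kernel the cached state would be per-CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache of the kernel-mode IDs last programmed on this CPU. */
struct plza_state {
	uint32_t closid;
	uint32_t rmid;
};

static struct plza_state cached;	/* per-CPU in a real implementation */
static unsigned int hw_writes;		/* counts simulated register writes */

/* Stand-in for the (expensive) register write; hypothetical. */
static void fake_plza_write(uint32_t closid, uint32_t rmid)
{
	(void)closid;
	(void)rmid;
	hw_writes++;
}

/* Called at context switch with the incoming task's kernel-mode IDs. */
static void plza_sched_in(uint32_t closid, uint32_t rmid)
{
	/* Cheap compare first: skip the write when nothing changed. */
	if (cached.closid == closid && cached.rmid == rmid)
		return;

	cached.closid = closid;
	cached.rmid = rmid;
	fake_plza_write(closid, rmid);
}

/* Usage: three switches, two of them to the same kernel IDs. */
static unsigned int plza_example(void)
{
	cached.closid = 0;
	cached.rmid = 0;
	hw_writes = 0;

	plza_sched_in(1, 5);	/* differs from cached 0/0: write */
	plza_sched_in(1, 5);	/* unchanged: no write */
	plza_sched_in(2, 5);	/* CLOSID changed: write */

	assert(hw_writes == 2);
	return hw_writes;
}
```

With a single kernel group every task passes the same IDs, so after the
first switch the compare is the only cost.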
>
> How about this idea for extensibility.
>
> Rename Babu's "plza" file to "plza_mode". Instead of just being an
> on/off switch, it may accept multiple possible requests.
If we're making global configuration choices then I think they should be
visible in a global location. It doesn't seem good to have to check every
CTRL_MON group.
>
> Humorous version:
>
> # echo "babu" > plza_mode
>
> This results in behavior of Babu's RFC. The CLOSID and RMID assigned to
> the CTRL_MON group are used when in kernel mode, but only for tasks that
> have their task-id written to the "tasks" file, or for tasks in the
> default group when the "cpus" or "cpus_list" files are used to assign
> CPUs to this group.
>
> # echo "tony" > plza_mode
>
> All tasks run with the CLOSID/RMID for this group. The "tasks", "cpus" and
> "cpus_list" files and the "mon_groups" directory are removed.
>
> # echo "ben" > plza_mode
>
> Only usable in the top-level default CTRL_MON directory. CLOSID=0/RMID=0
> are used for all tasks in kernel mode.
>
> # echo "stephane" > plza_mode
>
> The RMID for this group is freed. All tasks run in kernel mode with the
> CLOSID for this group, but use same RMID for both user and kernel.
> In addition to files removed in "tony" mode, the mon_data directory is
> removed.
For these options, where a single group is set as plza, we could have a
global option and then just a plza marker on that group.
>
> # echo "some-future-name" > plza_mode
>
> Somebody has a new use case. Resctrl can be extended by allowing some
> new mode.
>
> Likely real implementation:
>
> Sub-components of each of the ideas above are encoded as a bitmask that
> is written to plza_mode. There is a file in the info/ directory listing
> which bits are supported on the current system (e.g. the "keep the same
> RMID" mode may be impractical on ARM, so it would not be listed as an
> option.)
>
> -Tony
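To make the bitmask idea concrete, here is a rough sketch of how a write to
plza_mode could be validated against the supported bits listed under info/.
The bit names, their values and the helper are all made up for illustration;
nothing here is an agreed interface, and how the named modes above would
decompose into bits is my guess.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sub-component bits; names and values are illustrative. */
#define PLZA_F_PER_TASK		(1u << 0) /* only tasks/CPUs assigned to the group */
#define PLZA_F_ALL_TASKS	(1u << 1) /* every task uses the group in kernel mode */
#define PLZA_F_KEEP_RMID	(1u << 2) /* kernel mode keeps the user-mode RMID */

#define PLZA_F_ALL (PLZA_F_PER_TASK | PLZA_F_ALL_TASKS | PLZA_F_KEEP_RMID)

/*
 * Accept a requested mode only if it is non-empty and every requested bit
 * is in the mask the system advertises as supported.
 */
static bool plza_mode_valid(unsigned int requested, unsigned int supported)
{
	/* The supported mask should only contain bits this sketch defines. */
	assert((supported & ~PLZA_F_ALL) == 0);

	return requested != 0 && (requested & ~supported) == 0;
}
```

For example, on a system that cannot keep the same RMID the supported mask
would omit PLZA_F_KEEP_RMID, and a request for
PLZA_F_ALL_TASKS | PLZA_F_KEEP_RMID would be rejected.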
Thanks,
Ben