Re: [RFD] resctrl: reassigning a running container's CTRL_MON group

From: Reinette Chatre
Date: Wed Oct 19 2022 - 19:55:01 EST


Hi Peter,

On 10/19/2022 2:08 AM, Peter Newman wrote:
> Hi Reinette,
>
> On Wed, Oct 12, 2022 at 7:23 PM Reinette Chatre
> <reinette.chatre@xxxxxxxxx> wrote:
>> What if resctrl adds support to rdtgroup_kf_syscall_ops for
>> the .rename callback?
>>
>> It seems like doing so could enable users to do something like:
>> mv /sys/fs/resctrl/groupA/mon_groups/containerA /sys/fs/resctrl/groupB/mon_groups/
>>
>> Such a user request would trigger the "containerA" monitor group
>> to be moved to another control group. All tasks within it could be moved to
>> the new control group (their CLOSIDs are changed) while their RMIDs
>> remain intact.
>
> I think this will be the best approach for us, since we need separate
> counters for every job. Unless you were planning to implement this very
> soon, I will prototype it for the container manager team to try out and
> submit patches for review if it works for them.

I do not have plans for work in this area.
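
For whoever prototypes it, the user-visible side of the move could be
sketched roughly as below. This assumes resctrl has gained the proposed
.rename support, which it does not have today. The helper name
move_mon_group() and its parameters are made up for illustration; only
the /sys/fs/resctrl/<group>/mon_groups/<name> layout comes from the
example above.

```python
import os


def move_mon_group(resctrl_root, src_ctrl, mon_name, dst_ctrl):
    """Move a monitor group to another control group via rename(2).

    Hypothetical helper: it relies on the proposed .rename callback in
    rdtgroup_kf_syscall_ops. Without that support the rename would fail.
    """
    src = os.path.join(resctrl_root, src_ctrl, "mon_groups", mon_name)
    dst = os.path.join(resctrl_root, dst_ctrl, "mon_groups", mon_name)
    # Equivalent of: mv <root>/groupA/mon_groups/containerA <root>/groupB/mon_groups/
    os.rename(src, dst)
    return dst
```

A container manager would call it as
move_mon_group("/sys/fs/resctrl", "groupA", "containerA", "groupB").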

It is still not clear to me how palatable this will be on Arm systems.
This solution also involves changing the CLOSID/PARTID, like your original
proposal, and James highlighted that it would "mess up the bandwidth counters"
because of the way PARTID.PMG is used for monitoring. Perhaps even a new
PMG would need to be assigned during such a monitor group move. One requirement
for this RFD was to keep usage counts intact, and from what I understand
this will not be possible on Arm systems. There could be software mechanisms
to help reduce the noise during the transition. For example, some new limbo
mechanism that avoids re-assigning the old PARTID.PMG, while perhaps still
using the old PARTID.PMG to read usage counts for a while? Or would the
guidance just be that the counters will have some noise after the move?
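
To make the limbo idea a bit more concrete, here is a toy user-space
model, not kernel code. Everything in it is an assumption for the sake
of discussion: the names (MonIdAllocator, move_group, read_usage), the
tick-based limbo period, and the choice to add the old pair's count to
the new one while the old pair is in limbo.

```python
from dataclasses import dataclass, field


@dataclass
class MonIdAllocator:
    """Toy allocator of (PARTID, PMG) pairs with a limbo list."""
    free: set                                   # pairs available for allocation
    limbo: dict = field(default_factory=dict)   # retired pair -> ticks left

    def alloc(self, partid):
        # Only pairs that are free (not in limbo) may be handed out.
        candidates = [p for p in self.free if p[0] == partid]
        if not candidates:
            raise RuntimeError("no free PMG for PARTID %d" % partid)
        pair = min(candidates)
        self.free.remove(pair)
        return pair

    def retire(self, pair, ticks):
        # Park the old pair so it is not re-assigned right away.
        self.limbo[pair] = ticks

    def tick(self):
        # Age limbo entries; expired pairs become allocatable again.
        for pair in list(self.limbo):
            self.limbo[pair] -= 1
            if self.limbo[pair] <= 0:
                del self.limbo[pair]
                self.free.add(pair)


def move_group(alloc, group, new_partid, limbo_ticks):
    """Give the group a pair under new_partid and retire the old pair."""
    old = group["id"]
    group["id"] = alloc.alloc(new_partid)
    group["old_ids"].append(old)
    alloc.retire(old, limbo_ticks)


def read_usage(group, counters, alloc):
    """Current pair's count plus any old pair that is still in limbo."""
    total = counters.get(group["id"], 0)
    for old in group["old_ids"]:
        if old in alloc.limbo:
            total += counters.get(old, 0)
    return total
```

In this model the old pair stops contributing once it leaves limbo, so
the reported count drops at that point. What should happen to the old
count then is exactly the open question.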

Reinette