Re: [RFD] resctrl: reassigning a running container's CTRL_MON group
From: Peter Newman
Date: Thu Oct 20 2022 - 04:48:40 EST

On Thu, Oct 20, 2022 at 1:54 AM Reinette Chatre
<reinette.chatre@xxxxxxxxx> wrote:
> It is still not clear to me how palatable this will be on Arm systems.
> This solution also involves changing the CLOSID/PARTID like your original
> proposal and James highlighted that it would "mess up the bandwidth counters"
> because of the way PARTID.PMG is used for monitoring. Perhaps even a new
> PMG would need to be assigned during such a monitor group move. One requirement
> for this RFD was to keep usage counts intact and from what I understand
> this will not be possible on Arm systems. There could be software mechanisms
> to help reduce the noise during the transition. For example, some new limbo
> mechanism that avoids re-assigning the old PARTID.PMG, while perhaps still
> using the old PARTID.PMG to read usage counts for a while? Or would the
> guidance just be that the counters will have some noise after the move?

I'm going to have to follow up on the details of this in James's thread.
It sounded like, on MPAM, we probably won't be able to create enough
mon_groups under a single control group for the rename feature to even
be useful. Rather, we expect the number of PARTIDs to be so much larger
than the number of PMGs that creating more mon_groups to reduce the
number of control groups wouldn't make sense.

At least in our use case, we're literally creating "classes of service"
to prioritize memory traffic, so we want a small number of control
groups to represent the small number of priority levels, but enough
RMIDs to count every job's traffic independently. For MPAM to support
this MBM/MBA use case in exactly this fashion, we'd have to develop the
monitors-not-matching-on-PARTID use case better in the MPAM
architecture. But before putting much effort into that, I'd want to know
if there's any payoff beyond being able to use resctrl the same way on
both implementations.
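
To make the layout concrete, here is a minimal Python sketch of the
arrangement described above: a few control groups, one per priority
level, each holding one monitoring group per job. It assumes the usual
resctrl directory layout (control groups as directories under the mount
point, monitoring groups under each control group's mon_groups
directory). The class names, job names, and both helper functions are
hypothetical, and the demo runs against a temporary directory rather
than a real /sys/fs/resctrl mount.

```python
import os
import tempfile


def layout_paths(root, jobs_by_class):
    """Return the directories to create: one control group per class,
    and one mon_group per job under that class's mon_groups directory."""
    paths = []
    for cls, jobs in jobs_by_class.items():
        ctrl = os.path.join(root, cls)
        paths.append(ctrl)
        for job in jobs:
            paths.append(os.path.join(ctrl, "mon_groups", job))
    return paths


def create_layout(root, jobs_by_class):
    """Create the directories. On a real resctrl mount the kernel creates
    mon_groups itself; exist_ok=True tolerates that."""
    for path in layout_paths(root, jobs_by_class):
        os.makedirs(path, exist_ok=True)


if __name__ == "__main__":
    # Hypothetical priority classes and jobs, for illustration only.
    jobs_by_class = {"high": ["job1", "job2"], "low": ["job3"]}
    with tempfile.TemporaryDirectory() as root:
        create_layout(root, jobs_by_class)
        for path in layout_paths(root, jobs_by_class):
            print(os.path.relpath(path, root))
```

In this arrangement the number of control groups stays fixed at the
number of priority levels, while the number of mon_groups grows with the
number of jobs, which is why the RMID (or PMG) supply is the limiting
resource.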

-Peter