Re: [PATCH v6 19/22] x86/resctrl: Introduce the interface to switch between monitor modes

From: Peter Newman
Date: Mon Aug 19 2024 - 14:27:57 EST


Hi Reinette,

On Mon, Aug 19, 2024 at 7:53 AM Reinette Chatre
<reinette.chatre@xxxxxxxxx> wrote:
>
> Hi Peter and James,
>
> On 8/16/24 11:09 AM, Reinette Chatre wrote:
> > Hi Peter,
> >
> > On 8/16/24 10:16 AM, Peter Newman wrote:
> >> Hi Reinette,
> >>
> >> On Fri, Aug 16, 2024 at 10:01 AM Reinette Chatre
> >> <reinette.chatre@xxxxxxxxx> wrote:
> >>>
> >>> Hi James,
> >>>
> >>> On 8/16/24 9:31 AM, James Morse wrote:
> >>>> Hi Babu,
> >>>>
> >>>> On 06/08/2024 23:00, Babu Moger wrote:
> >>>>> Introduce interface to switch between ABMC and legacy modes.
> >>>>>
> >>>>> By default ABMC is enabled on boot if the feature is available.
> >>>>> Provide the interface to go back to legacy mode if required.
> >>>>
> >>>> I may have missed it on an earlier version ... why would anyone want the non-ABMC
> >>>> behaviour on hardware that requires it: counters randomly reset and randomly return
> >>>> 'Unavailable'... is that actually useful?
> >>>>
> >>>> You default this to on, so there isn't a backward compatibility argument here.
> >>>>
> >>>> It seems like being able to disable this is a source of complexity - is it needed?
> >>>
> >>> The ability to go back to legacy was added while looking ahead to support the next
> >>> "assignable counter" feature that is software based ("soft-RMID" .. "soft-ABMC"?).
> >>>
> >>> This series adds support for ABMC on recent AMD hardware to address the issue described
> >>> in cover letter. This issue also exists on earlier AMD hardware that does not have the ABMC
> >>> feature and Peter is working on a software solution to address the issue on non-ABMC hardware.
> >>> This software solution is expected to have the same interface as the hardware solution but
> >>> earlier discussions revealed that it may introduce extra latency that users may only want to
> >>> accept during periods of active monitoring. Thus the option to disable the counter assignment
> >>> mode.
> >>
> >> Sorry again for the soft-RMID/soft-ABMC confusion[1], it was soft-RMID
> >> that impacted context switch latency. Soft-ABMC does not require any
> >> additional work at context switch.
> >
> > No problem. I did read [1] but I do not think I've seen soft-ABMC yet so
> > my understanding of what it does is vague.
> >
> >> The only disadvantage to soft-ABMC I can think of is that it also
> >> limits reading llc_occupancy event counts to "assigned" groups,
> >> whereas without it, llc_occupancy works reliably on all RMIDs on AMD
> >> hardware.
> >
> > hmmm ... keeping original llc_occupancy behavior does seem useful enough
> > as motivation to keep the "legacy"/"default" mbm_assign_mode? It does sound
> > to me as though soft-ABMC may not be as accurate when it comes to llc_occupancy.
> > As I understand the hardware may tag entries in cache with RMID and that has a longer
> > lifetime than the tasks that allocated that data into the cache. If soft-ABMC
> > permanently associates an RMID with a local and total counter pair but that
> > RMID is dynamically assigned to resctrl groups then a group may not always
> > get the same RMID ... and thus its llc_occupancy data would be a combination of
> > its cache allocations and all the cache allocations of resource groups that had
> > that RMID before it. This may need significantly enhanced "limbo" handling?
>

For the soft-ABMC use case that I'm aware of, it would be better to
disable llc_occupancy events and accept that as a limitation, since
we're not using this feature. I don't want to slow down the rate at
which MBM counters can be reassigned. Over the course of a multi-second
bandwidth measurement window on a bandwidth-saturated host, a previous
group's initial cache occupancy isn't significant enough to justify a
limbo period, especially when padded out to 1 second.
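
As a rough back-of-the-envelope sketch of that argument: the LLC size,
bandwidth, and window length below are made-up figures chosen only for
illustration, not measurements, and the function name is hypothetical.
It assumes the worst case where every line in the LLC was left behind
by the RMID's previous owner.

```python
# Hypothetical figures, for illustration only (not measured values).
LLC_BYTES = 32 * 2**20        # assumed 32 MiB last-level cache
BANDWIDTH_BPS = 20 * 10**9    # assumed 20 GB/s on a bandwidth-saturated host
WINDOW_SECONDS = 2            # assumed multi-second measurement window


def stale_fraction(llc_bytes, bandwidth_bps, window_seconds):
    """Worst-case share of the window's traffic that could be attributed
    to cache lines left behind by the previous owner of the RMID,
    assuming the whole LLC was tagged with that RMID."""
    bytes_in_window = bandwidth_bps * window_seconds
    return llc_bytes / bytes_in_window


fraction = stale_fraction(LLC_BYTES, BANDWIDTH_BPS, WINDOW_SECONDS)
print(f"worst-case stale share of the window: {fraction:.4%}")
```

Under these assumed figures the stale share is well below one percent,
which is the kind of error that would not justify delaying counter
reassignment.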

I would feel differently if my users were more interested in
llc_occupancy counts and if the LLC could immediately notify when the
occupancy threshold for any of a set of groups has been crossed.

> To expand on this we may have to rework the interface if the counters can be
> assigned to events other than MBM.
>
> James: could you please elaborate how you plan to use this feature and if this
> interface works for the planned usage?
>
> Peter: considering the previous example [1] where soft-ABMC was using the "mbm_control"
> interface I do not think it is ideal to only use the "t" and "l" flags while
> llc_occupancy is also enabled/disabled via this interface. We should consider
> (a) renaming the control file to indicate larger scope than MBM, (b) add flags
> for llc_occupancy. What do you think? I believe this is in line with stated goal
> from [1]: "I believe mbm_control should always accurately reflect which events
> are being counted."

I should have said, "I believe mbm_control should always accurately
reflect which _MBM_ events are being counted."

In general, MBM requires maintaining cumulative, running counts, while
llc_occupancy is only a snapshot of cache usage. This is why MBM
results in contended resources (counters) which must be managed by the
user. In the MPAM implementations I've seen so far, a small number of
occupancy monitors (relative to the number of monitoring groups
supported) is sufficient for a large number of groups. The monitor
count only limits how many monitoring groups' occupancy counts can be
read in parallel, and this can be adequately managed within the MPAM
driver without user interaction.
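
A minimal sketch of this distinction, not taken from any driver: the
class names, the read_hw callback, and the pool sizes are all
hypothetical. A bandwidth counter must stay bound to its group to keep
a running total, so it is a resource the user has to manage. An
occupancy monitor is only needed for the duration of one read, so a
small pool can be shared transparently.

```python
import threading


class MbmCounterPool:
    """Bandwidth counters hold running totals, so a group keeps its
    counter until the user explicitly unassigns it."""

    def __init__(self, num_counters):
        self.free = list(range(num_counters))
        self.assigned = {}  # group name -> counter id

    def assign(self, group):
        if group in self.assigned:
            return self.assigned[group]
        if not self.free:
            # Contention is visible to the user, who must free a counter.
            raise RuntimeError("no free counter; unassign a group first")
        counter = self.free.pop()
        self.assigned[group] = counter
        return counter

    def unassign(self, group):
        self.free.append(self.assigned.pop(group))


class OccupancyMonitorPool:
    """Occupancy is a snapshot, so a monitor is borrowed for one read
    and returned immediately; callers never see an allocation step."""

    def __init__(self, num_monitors, read_hw):
        self._slots = threading.BoundedSemaphore(num_monitors)
        self._read_hw = read_hw  # hypothetical callback: group -> bytes

    def read(self, group):
        # Blocks only while num_monitors reads are already in flight.
        with self._slots:
            return self._read_hw(group)


# Two counters cannot serve three groups at once...
mbm = MbmCounterPool(num_counters=2)
mbm.assign("g0")
mbm.assign("g1")

# ...but one occupancy monitor can serve any number of groups in turn.
occ = OccupancyMonitorPool(num_monitors=1, read_hw=lambda group: 4096)
readings = [occ.read(f"g{i}") for i in range(100)]
```

The number of monitors bounds concurrency of reads, not the number of
groups that can be monitored, which is why no user-facing assignment
interface is needed in this model.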

Because of this, broadening the scope of mbm_control to include
occupancy would only serve to remind the user whether occupancy is
supported, but would provide no new information beyond what's already
provided by mon_features.
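
For reference, a small sketch of how userspace can already answer the
"is occupancy supported" question from mon_features. The helper name
is hypothetical, and the sample string below is an assumed example of
the file's contents rather than output captured from a real system.

```python
def supports_event(mon_features_text, event):
    """Return True if `event` is listed in the contents of
    info/L3_MON/mon_features (one event name per line)."""
    return event in mon_features_text.split()


# Assumed example contents; the real file depends on the hardware.
sample = "llc_occupancy\nmbm_total_bytes\nmbm_local_bytes\n"
has_occupancy = supports_event(sample, "llc_occupancy")

# On a system with resctrl mounted, the text would instead come from:
#   with open("/sys/fs/resctrl/info/L3_MON/mon_features") as f:
#       text = f.read()
```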

-Peter