Re: [PATCH 00/15] x86/resctrl : Support AMD QoS RMID Pinning feature

From: Peter Newman
Date: Mon Dec 04 2023 - 19:14:06 EST


[+James]

Hi James,

On Thu, Nov 30, 2023 at 4:57 PM Babu Moger <babu.moger@xxxxxxx> wrote:
>
> This series adds support for the AMD QoS RMID Pinning feature, also
> called ABMC (Assignable Bandwidth Monitoring Counters).
>
> The feature details are available in the APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
> Monitoring (ABMC). The documentation is available at
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>
> The patches are based on top of commit
> 346887b65d89ae987698bc1efd8e5536bd180b3f (tip/master)
>
> # Introduction
>
> AMD hardware can support 256 or more RMIDs. However, the bandwidth
> monitoring feature only guarantees that RMIDs currently assigned to a
> processor will
> be tracked by hardware. The counters of any other RMIDs which are no
> longer being tracked will be reset to zero. The MBM event counters return
> "Unavailable" for the RMIDs that are not active.
>
> Users can create 256 or more monitor groups, but only a limited number
> of groups can be given guaranteed monitoring numbers. With an ever
> changing system configuration, there is no way to know definitively which
> of these groups will be active at a given point in time. Users do not have
> the option to monitor a group or set of groups for a certain period of
> time without worrying about the RMID being reset in between.
>
> The ABMC feature provides an option to pin (or assign) the RMID to the
> hardware counter and monitor the bandwidth for a longer duration. The
> pinned RMID will be active until the user unpins (or unassigns) it. There
> is no need to worry about counters being reset during this period.
> Additionally, the user can specify a bitmask identifying the specific
> bandwidth types from the given source to track with the counter.
>
> # Linux Implementation
>
> The hardware provides a total of 32 counters available for assignment.
> Each Linux resctrl group can be assigned a maximum of 2 counters: one for
> mbm_total_bytes and one for mbm_local_bytes. Users also have the option to
> assign only one counter to the group. If the system runs out of assignable
> counters, the kernel reports an error when the user attempts a new
> counter assignment. Users must first unassign counters already in use to
> make new assignments.
>
> # Examples
>
> a. Check if ABMC support is available
> #mount -t resctrl resctrl /sys/fs/resctrl/
> #cat /sys/fs/resctrl/info/L3_MON/mon_features
> llc_occupancy
> mbm_total_bytes
> mbm_total_bytes_config
> mbm_local_bytes
> mbm_local_bytes_config
> abmc_capable ← Linux kernel detected ABMC feature.
>
> b. Mount with ABMC support
> #umount /sys/fs/resctrl/
> #mount -o abmc -t resctrl resctrl /sys/fs/resctrl/
>
> c. Read the monitor states. There will be a new file "monitor_state"
> for each monitor group when the ABMC feature is enabled. By default,
> both total and local MBM events are in the "unassign" state.
>
> #cat /sys/fs/resctrl/monitor_state
> total=unassign;local=unassign
>
> d. Read the events mbm_total_bytes and mbm_local_bytes. Note that MBM
> events are not available until the user assigns the events explicitly.
> Users need to assign the counters to monitor the events in this mode.
>
> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> Unavailable
>
> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
> Unavailable
>
> e. Assign a h/w counter to the total event and read the monitor_state.
>
> #echo total=assign > /sys/fs/resctrl/monitor_state
> #cat /sys/fs/resctrl/monitor_state
> total=assign;local=unassign
>
> f. Now that the total event is assigned, read the total event.
>
> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> 6136000
>
> g. Assign a h/w counter to both total and local events and read the monitor_state.
>
> #echo "total=assign;local=assign" > /sys/fs/resctrl/monitor_state
> #cat /sys/fs/resctrl/monitor_state
> total=assign;local=assign
>
> h. Now that both total and local events are assigned, read the events.
>
> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> 6136000
> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
> 58694

We had briefly discussed this topic of explicit counter assignment in
resctrl earlier this year[1], but you didn't want it to be unique to
MPAM.

Now that a similar capability exists on AMD and an interface is being
proposed, we can talk about this in the context of MPAM again.

With some generalization and refinements, I expect this proposal could
be applied to assigning a limited number of MBWU monitors to
monitoring groups.
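
To make the shared bookkeeping concrete, here is a toy model of a fixed
pool of assignable counters. It is not how the series or any MPAM driver
implements this. The pool size of 32 comes from the cover letter above;
the function names and the "group:event" key format are made up for
illustration.

```shell
#!/bin/sh
# Toy model of a limited pool of assignable monitors. All names here are
# hypothetical; only the pool size of 32 is taken from the cover letter.
NUM_COUNTERS=32
free_counters=$NUM_COUNTERS
assigned=""	# space-separated "group:event" keys holding a counter

# Assign a counter to a key. Succeeds if the key already holds one,
# fails with an error when the pool is exhausted.
counter_assign() {
	case " $assigned " in
	*" $1 "*) return 0 ;;
	esac
	if [ "$free_counters" -le 0 ]; then
		echo "no free counters, unassign one first" >&2
		return 1
	fi
	assigned="$assigned $1"
	free_counters=$((free_counters - 1))
}

# Release the counter held by a key. A key without a counter is a no-op.
counter_unassign() {
	case " $assigned " in
	*" $1 "*) ;;
	*) return 0 ;;
	esac
	remaining=""
	for key in $assigned; do
		[ "$key" = "$1" ] || remaining="$remaining $key"
	done
	assigned=$remaining
	free_counters=$((free_counters + 1))
}
```

The same model would apply to MBWU monitors with a different pool size;
what differs between architectures is what sits behind assign and unassign.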

Also, in another thread[2] I had proposed applying such an interface
to earlier AMD hardware, where the monitor assignments cannot be
directly controlled, in order to avoid or reduce the overhead in my
soft RMID proposal.

Thanks!
-Peter

[1] https://lore.kernel.org/all/f8a25b5f-4a7d-0891-1152-33f349059b5d@xxxxxxx/
[2] https://lore.kernel.org/all/CALPaoCjg-W3w8OKLHP_g6Evoo03fbgaOQZrGTLX6vdSLp70=SA@xxxxxxxxxxxxxx/